AI Search Glossary

Simple definitions for AI search, GEO, citations, and the metrics that matter.

A

AI Confidence

AI Confidence is a score (from 0 to 1) that reflects how reliably LLM search engines can describe your brand, product, and positioning when they include you in an answer.

AI Overviews

AI Overviews are AI-generated summary answers shown directly on a search results page for some queries.

AI search

AI search is an answers-first search experience where an AI model generates a direct response, often using information retrieved from the web or a search index.

AI Search Visibility

AI Search Visibility is how often your brand or content appears in AI answers for a defined set of prompts, measured as mentions plus citations.

AI SEO

AI SEO is a casual term for SEO work aimed at improving visibility in AI-generated answers, not just classic rankings.

Alt text

Alt text (alternative text) is a short description added to an image's HTML so screen readers and search systems can understand what the image shows.

Answer-first search

Answer-first search is a search experience where the system presents a synthesized answer immediately, rather than leading with a list of links.

Assisted Conversion

An assisted conversion is a conversion where AI was part of the journey but not necessarily the last click.

B

Bing index

The Bing index is Bing's catalog of discovered and stored URLs used to return results and power retrieval for downstream experiences.

Browsing

Browsing is when an AI system accesses the live web (directly or through an index) to retrieve sources at query time.

C

Canonical URL

A canonical URL is the preferred version of a page's address, declared via a link tag, that tells search engines which URL to index when duplicates exist.
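As a rough illustration, the declared canonical can be read back out of a page's HTML. The sketch below uses Python's standard-library HTML parser; the class name, function name, and example URL are hypothetical, and it only looks at the link tag, not at canonicals sent via HTTP headers.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and rel == "canonical" and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page that declares its preferred address.
page = '<html><head><link rel="canonical" href="https://example.com/pricing"></head></html>'
canonical_url = find_canonical(page)
```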

Citation

A citation is a source reference (often a clickable link) included inside an AI-generated answer to show where information came from.

Citation Depth

Citation Depth (or URL Citation Depth) measures how many citations go to specific pages on your site versus your homepage.
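One possible way to compute this, assuming you already have a list of cited URLs for your domain: treat an empty or "/" path as the homepage and everything else as a specific page. The function name and the returned field names are illustrative, not a standard.

```python
from urllib.parse import urlsplit


def citation_depth(cited_urls):
    """Split citations into homepage vs specific-page counts."""
    deep = sum(1 for url in cited_urls if urlsplit(url).path.strip("/") != "")
    homepage = len(cited_urls) - deep
    deep_share = deep / len(cited_urls) if cited_urls else 0.0
    return {"deep": deep, "homepage": homepage, "deep_share": deep_share}


# Hypothetical citations collected from tracked AI answers.
depth = citation_depth([
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/blog/ai-search",
    "https://example.com",
])
```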

Citation Frequency

Citation Frequency is the percentage of tracked prompts where your domain is cited in AI answers.
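A minimal sketch of the calculation, assuming one cited/not-cited flag per tracked prompt (the input shape and function name are assumptions):

```python
def citation_frequency(cited_flags):
    """Percentage of tracked prompts where the domain was cited."""
    if not cited_flags:
        return 0.0
    return 100.0 * sum(cited_flags) / len(cited_flags)


# Hypothetical run: the domain was cited in 3 of 8 tracked prompts.
flags = [True, False, True, False, False, True, False, False]
frequency = citation_frequency(flags)  # 37.5
```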

Citation optimization

Citation optimization is the practice of increasing the likelihood that AI answers reference your page as a source.

Citation Quality

Citation Quality is a segmentation of citations by intent and value, distinguishing definition prompts from high-intent evaluation prompts.

Citation Share of Voice (C-SoV)

Citation Share of Voice is your share of all citations earned across a defined competitor set, measured on a fixed prompt set.
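A sketch of the arithmetic, assuming citation counts per domain have been collected over the same prompt set and that your own domain is included in the totals. Domain names and counts are made up for illustration.

```python
def citation_share_of_voice(citations_by_domain, your_domain):
    """Your citations divided by all citations in the competitor set."""
    total = sum(citations_by_domain.values())
    if total == 0:
        return 0.0
    return citations_by_domain.get(your_domain, 0) / total


# Hypothetical counts over one fixed prompt set.
counts = {"yourbrand.example": 12, "rival-a.example": 20, "rival-b.example": 8}
share = citation_share_of_voice(counts, "yourbrand.example")  # 12 / 40
```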

Citations

Citations are links or source references shown inside an AI answer to indicate where information came from.

Comparison pages

Comparison pages are dedicated pages that compare two or more options (e.g. "X vs Y") with differences, pros/cons, and recommendations.

Content chunking

Content chunking is structuring information into small, self-contained sections that can be retrieved, summarized, and cited independently.

Core Web Vitals

Core Web Vitals are a set of three metrics (LCP, CLS, INP) defined by Google that measure real-world page load performance, visual stability, and interactivity.

D

Dark AI Traffic

Dark AI Traffic refers to visits influenced by AI answers that arrive without an identifiable referrer, often labeled Direct in analytics.

E

E-E-A-T

E-E-A-T stands for Experience, Expertise, Authoritativeness, and Trustworthiness. It is a framework from Google's Search Quality Rater Guidelines for assessing content quality and credibility.

Entities

In search and NLP, entities are distinct, identifiable things (people, places, products, concepts) that systems use to understand what content is about.

Extractability

Extractability is how easy it is for a system to pull the right answer from a page without confusion.

F

FAQPage schema

FAQPage schema is structured data that marks up a page's FAQ section (questions and answers) so machines can understand it as Q&A.
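For illustration, the markup can be generated from question/answer pairs. The sketch below builds a Schema.org FAQPage object and serializes it as JSON-LD; the helper name and the sample question are hypothetical.

```python
import json


def faq_page_jsonld(pairs):
    """Build a Schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = faq_page_jsonld([
    ("What is a citation?", "A source reference included inside an AI-generated answer."),
])
# The serialized object is typically embedded in a JSON-LD script tag.
script_tag = '<script type="application/ld+json">' + json.dumps(faq, indent=2) + "</script>"
```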

G

GA4 Channel Group

A GA4 Channel Group is a Google Analytics 4 feature for classifying traffic sources; it is useful for grouping AI referrals into one reportable bucket.
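Channel groups are configured inside GA4 itself, but the matching rule can be prototyped offline first. The sketch below tests a regular expression against referrer hostnames; the hostname list and the "AI Referral" label are assumptions to adapt to your own referral data.

```python
import re

# Assumed AI referrer hostnames; extend or trim to match what you actually see.
AI_SOURCE_PATTERN = re.compile(
    r"(^|\.)(chatgpt\.com|perplexity\.ai|copilot\.microsoft\.com|gemini\.google\.com)$"
)


def channel_for(source):
    """Return a channel label for a referrer hostname."""
    if AI_SOURCE_PATTERN.search(source.lower()):
        return "AI Referral"
    return "Other"


labels = [channel_for(s) for s in ["chatgpt.com", "www.perplexity.ai", "google.com"]]
```

The leading `(^|\.)` keeps lookalike hosts such as notchatgpt.com from matching.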

Grounding

Grounding is the practice of tying an AI-generated answer to specific sources so claims can be traced back to evidence.

H

Hallucination

A hallucination is when an AI model produces information that sounds plausible but is incorrect or not supported by sources.

Heading hierarchy

Heading hierarchy is the ordered use of H1 through H6 tags to structure a page's content into a logical outline.
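A simple check for a broken outline is to look for skipped levels, such as an H2 followed directly by an H4. The sketch below is one way to do that with Python's standard-library parser; the class and function names are hypothetical.

```python
import re
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects heading levels (1 to 6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.levels.append(int(tag[1]))


def skipped_levels(html):
    """Return (previous, current) pairs where a heading jumps down by more than one level."""
    collector = HeadingCollector()
    collector.feed(html)
    pairs = zip(collector.levels, collector.levels[1:])
    return [(prev, cur) for prev, cur in pairs if cur > prev + 1]


skips = skipped_levels("<h1>Guide</h1><h2>Setup</h2><h4>Details</h4><h2>Usage</h2>")
```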

HTTP status code

An HTTP status code is a three-digit number returned by a server to indicate whether a request succeeded, redirected, or failed.

HTTPS

HTTPS (HyperText Transfer Protocol Secure) encrypts the connection between a browser and a server using TLS, protecting data in transit.

I

IndexNow

IndexNow is a protocol that lets you notify participating search engines when URLs change so they can crawl and update them faster.
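A sketch of what a submission might look like, based on the publicly documented IndexNow JSON format as understood here (host, key, keyLocation, urlList); verify field names and the endpoint against the current specification. The key value and URLs are placeholders, and the request is built but deliberately not sent.

```python
import json
import urllib.request

# Shared endpoint as documented by the IndexNow project (assumed current).
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Build (but do not send) a POST request announcing changed URLs."""
    payload = {
        "host": host,
        "key": key,
        # Assumes the key file is hosted at the site root.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


request = build_indexnow_request(
    "example.com", "hypothetical-key-123", ["https://example.com/pricing"]
)
# urllib.request.urlopen(request) would submit it.
```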

J

JSON-LD

JSON-LD is a format for adding structured data to a page in a machine-readable way (commonly used for Schema.org).

L

lang attribute

The lang attribute is an HTML attribute on the <html> tag that declares the primary language of the page (e.g. lang="en").

LCP (Largest Contentful Paint)

LCP is a Core Web Vitals metric that measures how long it takes for the largest visible element (image, heading, or text block) to render on screen.

LLM (Large Language Model)

A Large Language Model (LLM) is an AI model trained on vast amounts of text data that can generate human-like text responses.

llms.txt

llms.txt is a proposed convention for publishing a machine-friendly page that directs AI agents to a site's most important content.

M

Mention

A mention is when an AI answer names a brand, product, or entity without linking to it as a source.

Meta robots

Meta robots is an HTML meta tag that gives search engines page-level crawling and indexing instructions (e.g. noindex, nofollow).

Misattribution Rate

Misattribution Rate is how often an AI answer attributes incorrect claims to your brand or content, or confuses you with another vendor.

Mixed content

Mixed content occurs when an HTTPS page loads sub-resources (images, scripts, stylesheets) over insecure HTTP.
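A rough way to spot mixed content in static HTML is to list sub-resources whose URL starts with http://. The sketch below covers only a few common tags and will miss resources loaded by CSS or JavaScript; all names are hypothetical.

```python
from html.parser import HTMLParser

# Tags whose src attribute loads a sub-resource (illustrative, incomplete).
SRC_TAGS = {"img", "script", "iframe"}


class InsecureResourceFinder(HTMLParser):
    """Collects (tag, url) pairs for sub-resources requested over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        url = None
        if tag in SRC_TAGS:
            url = attributes.get("src")
        elif tag == "link" and "stylesheet" in (attributes.get("rel") or "").lower():
            url = attributes.get("href")
        if url and url.lower().startswith("http://"):
            self.insecure.append((tag, url))


def find_mixed_content(html):
    finder = InsecureResourceFinder()
    finder.feed(html)
    return finder.insecure


page = (
    '<img src="http://example.com/logo.png">'
    '<script src="https://example.com/app.js"></script>'
    '<a href="http://example.com/">Home</a>'
)
insecure = find_mixed_content(page)
```

Ordinary links (the a tag) are not sub-resources, so they are ignored.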

N

Noindex

Noindex is a directive (via meta tag or HTTP header) that tells search engines not to include a page in their index.
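Both delivery routes can be checked together. The sketch below looks for noindex in a robots meta tag and in an X-Robots-Tag response header; it is simplified and ignores crawler-specific forms (for example a user-agent prefix in the header or a meta tag named after one bot). Function and class names are hypothetical.

```python
from html.parser import HTMLParser


def _directives(value):
    """Split a comma-separated directive string into a lowercase set."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            self.directives |= _directives(attributes.get("content") or "")


def is_noindex(html, headers):
    """True if the meta tag or the X-Robots-Tag header carries noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    header_directives = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_directives |= _directives(value)
    return "noindex" in (finder.directives | header_directives)


blocked_by_meta = is_noindex('<meta name="robots" content="noindex, nofollow">', {})
blocked_by_header = is_noindex("<p>Hello</p>", {"X-Robots-Tag": "noindex"})
```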

O

OAI-SearchBot

OAI-SearchBot is OpenAI's crawler used to surface websites in ChatGPT's search features.

P

Paywall

A paywall is an access restriction that blocks visitors from reading page content until they subscribe or pay.

Prompt Set

A prompt set is a fixed list of queries used to benchmark AI search visibility over time.

Prompt Taxonomy

A prompt taxonomy is a classification of prompts by topic cluster, intent type, persona, and funnel stage.

R

Redirect chain

A redirect chain is a sequence of two or more redirects between the original URL and the final destination (e.g. A -> B -> C).
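To make the A -> B -> C idea concrete, the sketch below walks a hypothetical redirect map (source URL to target URL) instead of making live requests, and stops on loops or after an assumed hop limit.

```python
def follow_redirects(start_url, redirect_map, max_hops=10):
    """Return the list of URLs visited, from the start URL to the final one."""
    chain = [start_url]
    while chain[-1] in redirect_map and len(chain) - 1 < max_hops:
        next_url = redirect_map[chain[-1]]
        if next_url in chain:
            break  # redirect loop: stop rather than spin forever
        chain.append(next_url)
    return chain


# Hypothetical redirects: HTTP to HTTPS, then old path to new path.
redirects = {
    "http://example.com/old": "https://example.com/old",
    "https://example.com/old": "https://example.com/new",
}
chain = follow_redirects("http://example.com/old", redirects)
hops = len(chain) - 1  # two redirects, so this is a chain
```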

Retrieval

Retrieval is the step where an AI search system selects and pulls relevant documents from the web or a search index before generating an answer.

robots.txt

robots.txt is a site-wide rules file that tells crawlers which paths they are allowed or not allowed to fetch.
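Python's standard library can evaluate these rules, which is handy for checking whether a given crawler may fetch a path. The rules below are a made-up example that lets OAI-SearchBot in (except one directory) and blocks every other crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content.
rules = """\
User-agent: OAI-SearchBot
Disallow: /private/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

searchbot_blog = parser.can_fetch("OAI-SearchBot", "https://example.com/blog/")
searchbot_private = parser.can_fetch("OAI-SearchBot", "https://example.com/private/plan")
other_bot_blog = parser.can_fetch("SomeOtherBot", "https://example.com/blog/")
```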

S

Schema markup

Schema markup is structured data (usually in JSON-LD format) added to a page to help search engines understand the content type, entities, and relationships.

Search intent

Search intent is the underlying goal behind a user's query (informational, navigational, transactional, or commercial investigation).

Server-side rendering

Server-side rendering (SSR) is a technique where the server generates the full HTML for a page before sending it to the browser, rather than relying on client-side JavaScript.

Share of Voice

Share of Voice measures your visibility relative to competitors across a set of tracked prompts.

Source selection

Source selection is how an AI search system decides which documents to retrieve and which ones to cite in its answer.

T

Title tag

The title tag is an HTML element (<title>) that defines the page's title shown in browser tabs, search results, and social shares.

Training data

Training data is the information used to train an AI model before it is deployed, forming its general knowledge.