Which Sources Each AI Engine Actually Cites

By , Co-founder, GeoLinks · 4 min read
Five AI search engine logos arranged around a central content page, each pulling from a different source, editorial diagram style

There is no single “AI search”. ChatGPT, Claude, Gemini, Perplexity and Google AI Overviews each pull from different places, so a page one engine cites can be invisible to another. This post maps where each engine actually gets its answers, what that means for where you publish, and a per-engine priority order by industry. The patterns are drawn from GeoLinks citation tracking across client campaigns.

The market context matters here. ChatGPT’s share of AI search referrals fell from 89.1% to 62.6% in under a year. The full shift, and why it changes strategy, is in the multi-engine breakdown. This post goes one level deeper: not the share each engine holds, but the sources each one trusts.

Why “AI SEO” is the wrong unit of work

Treating AI search as one target is the most common mistake we see. The engines do not share a retrieval layer. They share some habits, but each one reaches for a different set of sources first.

That has a direct consequence. The off-site work that earns a ChatGPT citation is not the same work that earns a Perplexity one. If you brief an agency on “AI SEO” and they hand back a single generic plan, they are optimising for an engine that does not exist.

The fix is to split the work into a shared layer and an engine-specific layer. Around 80% is shared. The last 20% decides which engines actually cite you.

ChatGPT: a Bing-flavoured index, Wikipedia and Reddit

ChatGPT’s search retrieval leans on a Bing-flavoured index, then weights Wikipedia and Reddit heavily on top. A clear Wikipedia-style entity presence and active, relevant Reddit discussion both raise your odds.

The practical move is to make your brand legible as an entity first, then earn corroboration on the sources ChatGPT samples. FAQ schema helps here too: it carries around 40% of ChatGPT source-selection weight. The full nine-lever playbook is in how to get cited by ChatGPT.
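As a minimal sketch of the FAQ schema lever, the snippet below builds schema.org FAQPage markup as JSON-LD. The helper name `faq_jsonld` and the example question are hypothetical placeholders, not part of any GeoLinks tooling, and the markup alone does not guarantee a citation.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Placeholder content (assumption): swap in the questions your page answers.
example_pairs = [
    ("What does Example Co do?", "Example Co is a hypothetical B2B SaaS vendor."),
]

markup = json.dumps(faq_jsonld(example_pairs), indent=2)
# Embed the output in the page inside <script type="application/ld+json">.
print(markup)
```

Keep the questions and answers in the markup identical to the visible text on the page, so the structured data and the content corroborate each other.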

Claude: Brave’s index and a careful crawl

Claude pulls from the Brave search index plus Anthropic’s own crawl. It is the fastest-growing engine, up from 1.4% to 18.5% of AI search referrals in under a year, so ignoring it is no longer safe.

Claude rewards clean, well-sourced pages and tends to be conservative about which sources it trusts. Entity consistency and credible third-party citations matter more here than raw link volume. If your name, category and location do not match across the web, Claude is the engine most likely to skip you.
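One rough way to audit that consistency is to compare how your name, category and location appear on each profile you control. The sketch below assumes you have already collected those values by hand. The function names, the listings and the majority-vote rule are illustrative assumptions, not a description of how Claude evaluates entities.

```python
import re
from collections import Counter


def normalise(value):
    """Lowercase, strip punctuation and collapse whitespace for comparison."""
    cleaned = re.sub(r"[^\w\s]", "", value.lower())
    return " ".join(cleaned.split())


def find_mismatches(listings, fields=("name", "category", "location")):
    """Return {field: {source: raw_value}} where a source differs from the majority."""
    mismatches = {}
    if not listings:
        return mismatches
    for field in fields:
        values = {
            source: normalise(record.get(field, ""))
            for source, record in listings.items()
        }
        # The most common normalised value is treated as the reference.
        majority, _ = Counter(values.values()).most_common(1)[0]
        odd = {
            source: listings[source].get(field, "")
            for source, value in values.items()
            if value != majority
        }
        if odd:
            mismatches[field] = odd
    return mismatches


# Hypothetical listings gathered manually from three profiles.
listings = {
    "website": {"name": "Example Co", "category": "B2B SaaS", "location": "Leeds, UK"},
    "linkedin": {"name": "Example Co.", "category": "B2B SaaS", "location": "Leeds, UK"},
    "directory": {"name": "Example Company Ltd", "category": "Software", "location": "Leeds UK"},
}

print(find_mismatches(listings))
```

Here the directory listing is flagged for both name and category, while the punctuation differences in "Example Co." and "Leeds UK" are normalised away.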

Gemini: the Google index and the Knowledge Graph

Gemini uses the Google index and the Knowledge Graph. That makes it the engine where classic SEO equity transfers most directly. A strong Knowledge Graph entry, consistent structured data, and good organic rankings all feed Gemini.

If you already rank well on Google, Gemini is often your quickest multi-engine win. The work is less about new placements and more about tightening the entity signals and structured data that the Knowledge Graph reads.

Google AI Overviews: your top results, reframed

AI Overviews draw heavily from the top of Google’s own organic results, then summarise them. Strong classic SEO still feeds this surface directly, which is why it rewards depth.

Multimodal pages do best. May 2026 research put text-plus-media pages at a 0.92 correlation with AI Overview selection, the highest single factor measured that year. The mechanism, and how to build for it, is in the multimodal content study.

Perplexity: a fast crawler, Reddit and YouTube

Perplexity runs its own crawler and can index fresh content within 48 hours. It skews towards Reddit and YouTube, and it favours recent, well-structured pages. That speed makes it the easiest engine to win quickly.

One GeoLinks client saw Perplexity citations and 1,000 impressions inside 48 hours of publishing a single structured page. The full sequence is in the Perplexity citation case study.

The per-engine source map

| Engine | Primary index | Trusts most | Speed to cite | Best lever |
| --- | --- | --- | --- | --- |
| ChatGPT | Bing-flavoured | Wikipedia, Reddit | Weeks | Entity clarity, FAQ schema |
| Claude | Brave + own crawl | Corroborated sources | Weeks | Entity consistency, citations |
| Gemini | Google + Knowledge Graph | Structured data | Weeks | Knowledge Graph, classic SEO |
| AI Overviews | Google top results | Multimodal depth | Days to weeks | Multimodal content, rankings |
| Perplexity | Own fast crawler | Reddit, YouTube, fresh pages | Under 48 hours | Freshness, structure |

Where to start, by industry

Pick your engines by where your buyers ask. B2B SaaS audiences lean on ChatGPT and Perplexity, so prioritise entity clarity, Reddit corroboration and fresh structured pages. Consumer and local audiences see more Gemini and AI Overviews, so lead with structured data, Knowledge Graph signals and multimodal pages.
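To make that mapping concrete, here is a small sketch that encodes the source map and the two audience groups as a lookup. The dictionary keys `b2b_saas` and `consumer_local` are labels invented for this example, and the engine order simply follows the order of mention in this post rather than a measured ranking.

```python
# Best lever per engine, copied from the per-engine source map in this post.
BEST_LEVER = {
    "ChatGPT": "Entity clarity, FAQ schema",
    "Claude": "Entity consistency, citations",
    "Gemini": "Knowledge Graph, classic SEO",
    "AI Overviews": "Multimodal content, rankings",
    "Perplexity": "Freshness, structure",
}

# Audience labels are hypothetical; the engines per audience come from the text.
INDUSTRY_ENGINES = {
    "b2b_saas": ["ChatGPT", "Perplexity"],
    "consumer_local": ["Gemini", "AI Overviews"],
}


def priority_plan(industry):
    """Return (engine, best lever) pairs for the given audience label."""
    return [(engine, BEST_LEVER[engine]) for engine in INDUSTRY_ENGINES[industry]]


for engine, lever in priority_plan("b2b_saas"):
    print(f"{engine}: {lever}")
```

For `b2b_saas` this prints the ChatGPT and Perplexity levers, which matches the entity clarity and fresh structured pages recommended above.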

In every case, do the shared 80% first. Open your site to OAI-SearchBot, ClaudeBot, GPTBot and PerplexityBot. Make your entity consistent everywhere. Earn third-party corroboration, because around 91% of citations come from sources you do not own. Then tune the last 20% per engine. That sequencing is the core of the Citation Floor Method, and the off-site placements that feed it run through our guest posts service.
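A quick way to confirm the crawler step is to test your robots.txt against the four user agents named above. This sketch uses Python's standard `urllib.robotparser`. The sample rules and the `https://example.com/` URL are placeholders, and the helper `blocked_crawlers` is a hypothetical name. It only checks robots.txt rules, so it will not catch firewall or CDN blocks.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens as named in this post; confirm them against each vendor's docs.
AI_CRAWLERS = ["OAI-SearchBot", "ClaudeBot", "GPTBot", "PerplexityBot"]


def blocked_crawlers(robots_txt, url="https://example.com/"):
    """Return the AI crawlers that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, url)]


# Sample rules (assumption): GPTBot is blocked, every other agent is allowed.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_crawlers(sample))  # → ['GPTBot']
```

An empty list means none of the four crawlers is blocked by robots.txt for that URL. Run it against a few of your key pages, not only the homepage, because rules often differ by path.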

Not sure which engines already cite you? The free AI Visibility Check scans all five and shows your citation share per engine in under five minutes. When you want a plan built around your engine mix, our pricing lists every tier with no gated quote.