Research Citations

The research underpinning LLM Optimizer's analysis methodology. All scoring frameworks, dimension weights, and recommendations are derived from peer-reviewed academic work and validated practitioner research.

Research Digest

Brand Recognition vs. Discovery. A key framework throughout LLM Optimizer is the distinction between brand recognition — how well AI represents your brand when people search for it by name — and inbound discovery — how often AI surfaces your brand when people search your category without prior knowledge of you. Both matter, but they require different strategies. Brand recognition improves through authority signals, earned media, and training data presence. Discovery requires appearing in category-level content, answering the questions your audience asks before they know you exist, and being present in the YouTube videos, Reddit threads, and web pages that LLMs cite for category queries.

The emerging science of LLM visibility reveals a fundamental shift in how information gains authority online. The most significant recent finding comes from NanoKnow (2026), which demonstrates that content appearing frequently in training data more than doubles a model's accuracy on related questions — and that the advantage compounds when content is both memorized during training and retrievable at inference time. This means the traditional SEO playbook of optimizing for a single ranking algorithm is being replaced by a dual imperative: getting into training corpora through widespread, high-quality publication, while simultaneously remaining citable through structured, authoritative web presence.

Across the research, a consistent pattern emerges: AI search engines overwhelmingly favor earned media over brand-owned content, citing third-party sources 72-92% of the time. Content that includes quotations from authoritative sources gains +41% visibility — the single most effective optimization technique identified. Meanwhile, YouTube has rapidly become the dominant social citation source for LLMs, with its share doubling to 39% between August and December 2024. Critically, video LLMs process content through transcripts, not visual analysis — a 7B model trained on YouTube transcripts outperformed 72B models, proving that transcript quality matters far more than production value.

Reddit has emerged as the #2 social citation source for LLMs, with unique authority dynamics. Reddit was foundational in LLM training through datasets like WebText and the Common Crawl, and continues through $60M (Google) and $70M (OpenAI) annual licensing deals. Unlike YouTube's channel-centric authority, Reddit's influence comes from multi-user validation — upvoted comment consensus, especially in "best X for Y" recommendation threads, creates credibility signals that LLMs weight heavily. The Toronto GEO paper classifies Reddit as "Social" — a category AI search engines suppress in direct citations — yet Reddit's pervasive presence in training data means it heavily shapes baseline model knowledge even when not explicitly cited.

A critical "two-world" split has emerged between Google AI Overviews and standalone LLMs. 76% of AI Overview citations pull from top-10 organic pages — making traditional search rankings the primary signal for AIO inclusion. But for standalone LLMs like ChatGPT, only 12% of cited URLs rank in Google's top 10. The strongest predictor of AI citation across platforms is YouTube mentions (0.737 correlation), followed by web mentions (0.664) — not backlinks. Meanwhile, content freshness has become a significant signal: AI assistants cite content that is 25.7% newer than traditional search results, and 65% of AI bot crawl hits target content less than a year old. The explosive growth of AI crawlers (GPTBot up 305% YoY) makes robots.txt policy a direct lever for AI visibility.

However, this new landscape comes with important caveats. Citation accuracy across AI answer engines remains surprisingly poor (49-68%), with nearly a third of claims lacking any source backing. Citation concentration follows power-law dynamics, where the top 20 sources capture 28-67% of all citations. And LLMs exhibit strong positional bias, reliably attending to content at the beginning and end of context while ignoring the middle.

Compounding these challenges, model updates can sharply reduce citation volume. When GPT-5.3 replaced GPT-4o as ChatGPT's default, unique domains cited per response dropped 20.5% overnight — meaning brands that had achieved dynamic visibility through real-time retrieval lost it without any change on their end. This volatility reinforces the importance of parametric visibility (being embedded in training data) alongside dynamic visibility (being citable at inference time). Research into LLM parametric memory reveals that network centrality — being densely associated with high-authority brands in a model's knowledge graph — outweighs raw mention frequency. A brand that appears alongside category leaders in training data gains disproportionate visibility, even if it is mentioned less often overall. Together, these findings inform LLM Optimizer's scoring frameworks across answer optimization, video authority, Reddit authority, and search visibility analysis.

Source Papers

Lost in the Middle: How Language Models Use Long Contexts
TACL 2024
Position bias in LLM context windows — U-shaped attention curve where content at the beginning and end is reliably used while middle content is ignored.
GEO: Generative Engine Optimization
Princeton / KDD 2024
Tested 9 content optimization strategies on 10,000 queries. Quotations (+41%), statistics (+33%), and fluency (+29%) are the most effective methods for improving LLM citation visibility.
NanoKnow: Probing LLM Knowledge by Linking Training Data to Answers
2026
Training data frequency more than doubles model accuracy. Even with oracle RAG, models score ~11 points higher on questions with answers in training data.
GEO: How to Dominate AI Search — Source Preferences
U of Toronto 2025
AI search engines cite earned media 72-92% of the time vs. 18-27% for brand-owned content. AI citations overlap with Google results only 15-50%.
YouTube vs Reddit AI Citations
Adweek / Bluefish / Emberos / Goodie AI, 2025
YouTube appears in 16% of LLM answers (vs. 10% for Reddit). YouTube's social citation share doubled from 18.9% to 39.2% between August and December 2024.
News Source Citing Patterns in AI Search Systems
2025
Citation concentration and gatekeeping dynamics across 366K citations. Top 20 sources capture 28-67% of all citations (Gini 0.69-0.83).
LiveCC: Learning Video LLM with Streaming Speech Transcription
CVPR 2025
How video LLMs are trained from ASR transcripts. A 7B model trained on YouTube transcripts surpassed 72B models, proving transcript quality matters more than model size.
The False Promise of Factual and Verifiable Source-Cited Responses
2024
Citation accuracy ranges 49-68% across answer engines. 23-32% of claims have no source backing. Perplexity generates one-sided answers 83.4% of the time.
Language Models are Unsupervised Multitask Learners
OpenAI, 2019 (Radford et al.)
Introduced WebText, a dataset of 8 million Reddit posts with 3+ karma score, as the foundational training corpus for GPT-2. Demonstrated that Reddit's community curation mechanism (karma voting) effectively serves as a quality filter for large-scale language model training data.
Consent in Crisis: The Rapid Decline of the AI Data Commons
ACM FAccT 2024 (Longpre et al.)
Comprehensive audit of AI training data sources documenting Reddit's persistent prominence in Common Crawl and other web corpora. Found that robots.txt restrictions increased 25%+ from 2023-2024 as sites restricted AI crawling, while Reddit data remained broadly available through licensing agreements.
Reddit Data Licensing: Google and OpenAI Deals
Reuters / The Verge, 2024
Google pays $60M/year and OpenAI $70M/year for Reddit data access. Reddit's API was locked down in 2023. Active litigation: Reddit v. Anthropic, Reddit v. Perplexity (scraping claims).
Community Consensus as LLM Authority Signal
Bluefish Labs / Emberos Research, 2025
Reddit's multi-user validation (upvotes, comment consensus) creates credibility signals single-author content cannot match. "Best X for Y" recommendation threads are among the most influential for LLM comparison queries.
AI Overview Citations and Search Rankings
Ahrefs, 2025
76% of AI Overview citations pull from top-10 organic pages. Median organic ranking for a cited URL is position 3. 86% of citations come from within the top 100 organic results.
AI Search Overlap: How AI Citations Differ from Google
Ahrefs, 2025
Only 12% of standalone LLM citations overlap with Google's top 10. Perplexity shows 28.6% overlap. 80%+ of ChatGPT/Claude/Gemini citations come from pages not ranking in Google at all.
AI Brand Visibility Correlations (75K Brands)
Ahrefs, 2025
YouTube mentions (0.737) and web mentions (0.664) are the strongest correlates of AI visibility; brand search volume (0.334) and backlinks (0.37) trail well behind. Top 25% brands get 12x more AIO mentions.
Do AI Assistants Prefer to Cite Fresh Content?
Ahrefs, 2025 (17M citations)
AI assistants cite content 25.7% newer than traditional search. ChatGPT: avg 1,023 days old. Perplexity pulls ~50% from current year. Google AIOs counter-trend: prefer older authoritative content.
AI Brand Visibility and Content Recency
Seer Interactive, 2025
65% of AI bot crawl hits target content published within the past year. 85% of AIO citations from last 2 years. 94% from last 5 years.
Do Large Language Models Favor Recent Content?
arXiv, September 2025
LLMs consistently promote "fresh" passages. Top-10 mean publication year shifts forward by up to 4.78 years. Individual items move up to 95 ranking positions based on recency signals alone.
From Googlebot to GPTBot: Who's Crawling Your Site
Cloudflare, 2025
GPTBot grew 305% YoY. OpenAI crawl-to-referral ratio: 1,700:1. Anthropic: 73,000:1. ~21% of top-1000 sites block GPTBot. Training crawls = 80% of AI bot activity.
AI Overviews Study: 200,000 Keywords
Semrush, 2025
Reddit (40.1%) and Wikipedia (26.3%) dominate AIO citations. 80% of AIO responses target informational queries. 82% appear for keywords with <1,000 monthly searches.
ChatGPT Search Visibility: GPT-5.3/5.4 Citation Analysis
Resoneo, 2026
27,000 responses across 400 prompts over 14 weeks. After GPT-5.3 launched, unique domains cited per response dropped 20.5% (19.1 → 15.2) and unique URLs dropped 21.0% (24.1 → 19.1). Formalizes the distinction between parametric visibility (training data knowledge) and dynamic visibility (real-time web retrieval).
Brand Authority Index: Network Centrality in LLM Parametric Memory
Dejan AI / Resoneo, 2026
Queried Gemini 200,000 times across ~20 million brand mentions, building a 2.9 million-node directed association graph. Found that network centrality — being densely associated with high-authority brands — outweighs raw mention frequency for parametric visibility. A brand with zero spontaneous recall ranked highest due to dense intersections with authority brands.

Answer Optimization Scoring Framework

Each optimization report scores how likely an LLM is to surface and cite a website's answer across four research-backed dimensions.

Content Authority (30%)
Source: GEO (Princeton/KDD 2024)
Measures the presence of quotations from authoritative sources (+41% visibility), statistical evidence (+33%), source citations (+28%), fluency (+29%), and technical terminology (+19%). Penalizes keyword stuffing (-9%).
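As a rough illustration, the quotation and statistics signals can be approximated with text heuristics. A minimal sketch; the regex patterns and the four-word quote threshold are illustrative assumptions, not LLM Optimizer's actual detectors:

```python
import re

def geo_signal_counts(text: str) -> dict:
    """Count two GEO signals heuristically: quotations and statistics.

    Illustrative only: real detection would also need to verify that a
    quotation is attributed to an authoritative source.
    """
    # Direct quotations: double-quoted spans at least four words long.
    quotes = [q for q in re.findall(r'"([^"]+)"', text) if len(q.split()) >= 4]
    # Statistics: numbers carrying %, x-multipliers, or magnitude words.
    stats = re.findall(r'\b\d[\d,.]*\s*(?:%|x\b|percent\b|million\b|billion\b)',
                       text)
    return {"quotations": len(quotes), "statistics": len(stats)}

sample = ('As Liu et al. put it, "models reliably use the beginning and end '
          'of context". Citations grew 305% YoY, an 18x gap over the runner-up.')
print(geo_signal_counts(sample))
```

A real scorer would weight each detected signal by the visibility lifts reported in GEO (+41% for quotations, +33% for statistics) rather than treating counts as equal.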
Structural Optimization (20%)
Source: Lost in the Middle (TACL 2024) + GEO (Toronto 2025)
Evaluates answer prominence (front-loaded vs. buried), content conciseness, machine-readable structure (Schema.org, tables, comparison formats), and justification language that explains "why" rather than just "what."
Source Authority (30%)
Source: GEO (Toronto 2025)
Assesses third-party coverage and earned media presence. AI search engines cite earned media 72-92% of the time. Evaluates cross-engine consistency since different AI providers cite substantially different sources (similarity only 0.11-0.58).
Knowledge Persistence (20%)
Source: NanoKnow (2026)
Measures how deeply information is embedded in model training data. Answer frequency more than doubles accuracy. Content that is both in training data AND retrievable at inference compounds advantage by ~11 percentage points. Clear, educational writing outperforms natural text by 19+ points.
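The four dimensions above combine into a single report score as a weighted sum. A minimal sketch; only the 30/20/30/20 weights come from the framework, while the pillar input scores are hypothetical values on a 0-100 scale:

```python
# Answer Optimization dimension weights, as defined in the framework above.
WEIGHTS = {
    "content_authority": 0.30,
    "structural_optimization": 0.20,
    "source_authority": 0.30,
    "knowledge_persistence": 0.20,
}

def report_score(pillars: dict) -> float:
    """Weighted sum of 0-100 pillar scores; every dimension must be scored."""
    assert set(pillars) == set(WEIGHTS), "score each dimension exactly once"
    return sum(WEIGHTS[name] * value for name, value in pillars.items())

# Hypothetical site: strong earned media, weak training-data footprint.
score = report_score({
    "content_authority": 70,
    "structural_optimization": 80,
    "source_authority": 85,
    "knowledge_persistence": 40,
})
print(score)  # weighted toward the two 30% authority dimensions
```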

Video Authority Scoring Framework

Video analysis evaluates YouTube presence across four pillars, grounded in the finding that LLMs process video through transcripts, not visual content.

Transcript Authority (30%)
Source: LiveCC (CVPR 2025) + GEO (Princeton 2024)
Transcript quality is the dominant signal for LLM visibility. Evaluates keyword alignment, quotability (standalone citable statements get +41% visibility per GEO), information density, and caption availability. Videos without captions are effectively invisible to LLMs.
Topical Dominance (25%)
Source: AI Search Arena (2025) + GEO (Toronto 2025)
Measures topic coverage breadth and depth, share of voice across video content in the space, content gaps representing first-mover opportunities, and coverage depth (surface vs. in-depth treatment). Winner-take-all dynamics mean being first in a topic gap has outsized value.
Citation Network (25%)
Source: AI Search Arena (2025) + YouTube Citation Analysis (Adweek 2025)
Analyzes who mentions the brand, their authority level, and concentration risk. Top 20 sources capture 28-67% of all AI citations. A mention by a high-authority channel outweighs dozens of small-channel mentions. Human engagement metrics (views, subscribers) do not predict AI citation.
Brand Narrative Quality (20%)
Source: False Promise of Source-Cited Responses (2024) + Lost in the Middle (2024)
Evaluates sentiment, mention context and position (early mentions get priority per U-shaped attention), extractability (clear mentions are less likely to be misrepresented given 49-68% citation accuracy), and narrative coherence. Includes a confidence discount reflecting known citation inaccuracy rates.

Reddit Authority Scoring Framework

Reddit analysis evaluates community discussion across four pillars, grounded in Reddit's unique role as a multi-user validation platform for LLM training data.

Presence (25%)
Source: Reddit Training Data Analysis (2024-2025) + GEO (Toronto 2025)
Volume and breadth of brand mentions across relevant subreddits. Measures total mentions, unique subreddits reached, and mention trend over time. High presence in topic-specific subreddits carries more weight than general discussion.
Sentiment & Recommendations (25%)
Source: Community Consensus Research (Bluefish/Emberos 2025)
Community tone and recommendation strength. Evaluates positive/negative sentiment balance, recommendation rate in "best X for Y" threads, and the specific praise/criticism themes that shape LLM perception.
Competitive Positioning (25%)
Source: GEO (Toronto 2025) + Reddit Community Analysis
Head-to-head positioning against competitors in comparison threads. Measures win rate, cited differentiators, and competitor advantages not countered — these directly shape LLM comparison responses.
Training Signal Strength (25%)
Source: NanoKnow (2026) + Reddit Data Licensing (2024)
Likelihood that Reddit discussions will influence LLM training. High-upvote threads in authoritative subreddits with deep comment engagement create the strongest training signals. Reddit data is actively licensed to OpenAI and Google.

Search Visibility Scoring Framework

Search visibility analysis evaluates how search-related signals affect whether AI systems will discover, index, and cite your content — bridging traditional SEO signals with AI citation dynamics. When Brand Intelligence provides category data, a fifth pillar (Category Discovery) measures whether people searching your category — without knowing your brand — can find you.

AI Overview Readiness (30%)
Source: Ahrefs AIO Citations Study (2025) + Semrush AIO Study (2025)
76% of AI Overview citations pull from top-10 organic pages. Evaluates organic ranking presence, structured data (Schema.org, JSON-LD), content format alignment with AIO-preferred informational queries, and answer prominence (front-loaded concise answers). AIOs favor long-tail keywords — 82% appear for terms with <1,000 monthly searches.
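For the structured-data sub-signal, Schema.org markup is typically embedded as JSON-LD in the page head. A hypothetical FAQPage fragment; the question and answer text are placeholder values:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is generative engine optimization?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "GEO adapts content so AI answer engines can retrieve and cite it."
    }
  }]
}
```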
Crawl Accessibility (20%)
Source: Cloudflare AI Crawler Report (2025) + Consent in Crisis (ACM FAccT 2024)
GPTBot grew 305% YoY with a crawl-to-referral ratio of 1,700:1. Evaluates robots.txt policy for AI crawlers (GPTBot, ClaudeBot, PerplexityBot and their SearchBot variants), sitemap completeness, and render accessibility. Blocking training bots while allowing search bots is a valid strategy; blocking everything eliminates AI visibility.
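The split policy (block training bots, allow search bots) maps directly onto robots.txt user-agent groups. A sketch; the user-agent strings below follow vendor documentation at the time of writing and should be verified before deploying:

```
# Block OpenAI's training crawler but allow its search crawler.
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

# Anthropic and Perplexity follow the same pattern.
User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Allow: /
```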
Brand Search Momentum (25%)
Source: Ahrefs 75K-Brand Study (2025) + Google Trends API (2025)
Brand search volume has a 0.334 correlation with AI citation frequency — but web mentions (0.664) and YouTube mentions (0.737) are stronger. Winner-takes-all: top 25% brands average 169 AIO mentions vs. 14 for the 50th-75th percentile. Evaluates brand search trends, entity recognition, and competitive positioning.
Content Freshness (25%)
Source: Ahrefs 17M Citations Study (2025) + Seer Interactive (2025) + arXiv Recency Bias (2025)
AI assistants cite content 25.7% newer than traditional search. 65% of AI bot hits target content <1 year old. Freshness signals can move items up to 95 ranking positions in LLM reranking. Evaluates content age, update frequency, freshness signals (dates, last-modified), and content decay risk. Note: Google AIOs counter-trend, preferring older authoritative content.
Category Discovery (20%, when categories available)
Source: Brand Intelligence categories + target queries
When Brand Intelligence provides category keywords and intent queries, a fifth pillar evaluates how visible the brand is in category-level searches — queries where users search the category without knowing the brand. This measures discovery potential: does the brand appear when someone searches "best [category] tools" or "[use case] solutions"? Sub-metrics include category visibility, intent coverage, competitor gap, and discovery potential. Weights rebalance to 25/15/20/20/20 across all five pillars.
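The rebalance can be expressed as two weight tables, selected by whether category data is present. A minimal sketch using only the weights stated above:

```python
# Pillar weights with and without Brand Intelligence category data,
# per the Search Visibility framework above.
BASE = {
    "aio_readiness": 0.30,
    "crawl_accessibility": 0.20,
    "brand_search_momentum": 0.25,
    "content_freshness": 0.25,
}
WITH_CATEGORIES = {
    "aio_readiness": 0.25,
    "crawl_accessibility": 0.15,
    "brand_search_momentum": 0.20,
    "content_freshness": 0.20,
    "category_discovery": 0.20,
}

def pillar_weights(has_category_data: bool) -> dict:
    """Select the active weight table; both tables sum to 1.0."""
    return WITH_CATEGORIES if has_category_data else BASE

print(sorted(pillar_weights(True)))
```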

Key Research Findings

Quotations are the single most effective optimization method
Adding quotes from authoritative sources improves LLM visibility by 41%, more than any other technique tested on 10,000 queries. Statistics (+33%) and fluency (+29%) follow.
GEO, Princeton/KDD 2024
Lower-ranked sites benefit disproportionately
Rank-5 sites saw +115% visibility improvement from citing sources, while rank-1 sites saw -30%. Generative engines can be more democratic than traditional search for well-optimized content.
GEO, Princeton/KDD 2024
AI search overwhelmingly favors earned media
AI search engines cite independent third-party sources 72-92% of the time, compared to only 18-27% for brand-owned content and virtually 0% for social content.
GEO, Toronto 2025
Training data frequency more than doubles accuracy
Models are more than twice as accurate on questions whose answers appear frequently (51+ documents) in training data vs. rarely (1-5 documents). Being in training data AND retrievable compounds advantage.
NanoKnow, 2026
YouTube is the #1 social citation source for LLMs
YouTube's share of social citations doubled from 18.9% to 39.2% in just 5 months. It generates 18x more AI citations than Instagram and 50x more than TikTok. Views and subscriber counts do not predict AI citation.
Adweek / Bluefish / Emberos / Goodie AI, 2025
Video LLMs are trained on transcripts, not visual content
A 7B model trained on YouTube transcripts outperformed 72B models. No captions = invisible to LLMs. Transcript quality is the dominant factor, not production value.
LiveCC, CVPR 2025
Content position follows a U-shaped attention curve
LLMs reliably use content at the beginning and end of their context window but effectively ignore the middle. Front-loading key information is critical for citation.
Lost in the Middle, TACL 2024
AI citation accuracy is surprisingly poor
Perplexity achieves only 49% citation accuracy; You.com 68%; BingChat 66%. 23-32% of relevant statements have no source backing. Systems display more sources than they actually use.
False Promise of Source-Cited Responses, 2024
Reddit is the #2 social citation source and foundational training data
Reddit accounts for 10-40% of AI social citations depending on platform/timeframe. WebText (GPT-2 training) was built from 8M Reddit posts with 3+ karma. Reddit remains pervasive in Common Crawl and is actively licensed to Google ($60M/yr) and OpenAI ($70M/yr).
Multiple sources, 2024-2025
Community consensus creates unique credibility signals
Upvoted comment threads, especially "best X for Y" recommendation discussions, create multi-user validation that LLMs weight heavily. This multi-user signal cannot be replicated by single-author content.
Bluefish Labs / Emberos, 2025
AI Overviews strongly favor top-ranked pages
76% of Google AI Overview citations come from top-10 organic pages, with median cited position at rank 3. But standalone LLMs (ChatGPT, Claude, Gemini) show only 12% overlap — they cite fundamentally different sources.
Ahrefs, 2025
Web mentions outperform backlinks for AI visibility
Brand web mentions (0.664 correlation) and YouTube mentions (0.737) are far stronger predictors of AI citation than backlinks (0.37). Top 25% brands by web mentions get 12x more AI Overview mentions than the 50-75th percentile.
Ahrefs 75K Brands Study, 2025
AI assistants strongly prefer fresh content
Content cited by AI assistants is 25.7% newer on average than traditional search results. 65% of AI bot crawl hits target content less than 1 year old. Freshness signals can shift LLM ranking positions by up to 95 places.
Ahrefs (17M citations) + arXiv, 2025
AI crawlers are growing explosively
GPTBot grew 305% YoY, with OpenAI's crawl-to-referral ratio at 1,700:1. Each major AI company now runs 3 separate bots (training, indexing, user-fetch). Blocking training bots while allowing search bots is a valid strategy.
Cloudflare, 2025
Model updates can collapse citation volume overnight
When GPT-5.3 replaced GPT-4o as the default model, unique domains cited per response dropped 20.5% and unique URLs dropped 21%. The study formalizes two distinct visibility types: parametric (stable, from training data) and dynamic (volatile, from real-time retrieval) — and shows that model updates can sharply reduce the latter.
Resoneo, 2026
Brand network centrality outweighs raw mention frequency
A 200,000-query study of LLM parametric memory found that brands densely associated with high-authority peers rank higher than brands with more raw mentions. A brand with zero spontaneous recall ranked #1 because of its network position among luxury category leaders. Being mentioned alongside the right brands matters more than being mentioned often.
Dejan AI / Resoneo, 2026

Put this research to work

LLM Optimizer applies these research findings automatically to analyze and optimize your brand's visibility across AI search engines.

LLM Optimizer is open-source (MIT). Our hosted version supports ongoing development.