Research Citations

The research underpinning LLM Optimizer's analysis methodology. All scoring frameworks, dimension weights, and recommendations are derived from peer-reviewed academic work and validated practitioner research.

Research Digest

Brand Recognition vs. Discovery. A key framework throughout LLM Optimizer is the distinction between brand recognition — how well AI represents your brand when people search for it by name — and inbound discovery — how often AI surfaces your brand when people search your category without prior knowledge of you. Both matter, but they require different strategies. Brand recognition improves through authority signals, earned media, and training data presence. Discovery requires appearing in category-level content, answering the questions your audience asks before they know you exist, and being present in the YouTube videos, Reddit threads, and web pages that LLMs cite for category queries.

The emerging science of LLM visibility reveals a fundamental shift in how information gains authority online. The most significant recent finding comes from NanoKnow (2026), which demonstrates that content appearing frequently in training data more than doubles a model's accuracy on related questions — and that the advantage compounds when content is both memorized during training and retrievable at inference time. This means the traditional SEO playbook of optimizing for a single ranking algorithm is being replaced by a dual imperative: getting into training corpora through widespread, high-quality publication, while simultaneously remaining citable through structured, authoritative web presence.

Across the research, a consistent pattern emerges: AI search engines overwhelmingly favor earned media over brand-owned content, citing third-party sources 72-92% of the time. Content that includes quotations from authoritative sources gains +41% visibility — the single most effective optimization technique identified. Meanwhile, YouTube has rapidly become the dominant social citation source for LLMs, with its share doubling to 39% between August and December 2024. Critically, video LLMs process content through transcripts, not visual analysis — a 7B model trained on YouTube transcripts outperformed 72B models, proving that transcript quality matters far more than production value.

Reddit has emerged as the #2 social citation source for LLMs, with unique authority dynamics. Reddit was foundational in LLM training through datasets like WebText and the Common Crawl, and continues through $60M (Google) and $70M (OpenAI) annual licensing deals. Unlike YouTube's channel-centric authority, Reddit's influence comes from multi-user validation — upvoted comment consensus, especially in "best X for Y" recommendation threads, creates credibility signals that LLMs weight heavily. The Toronto GEO paper classifies Reddit as "Social" — a category AI search engines suppress in direct citations — yet Reddit's pervasive presence in training data means it heavily shapes baseline model knowledge even when not explicitly cited.

A critical "two-world" split has emerged between Google AI Overviews and standalone LLMs. 76% of AI Overview citations pull from top-10 organic pages — making traditional search rankings the primary signal for AIO inclusion. But for standalone LLMs like ChatGPT, only 12% of cited URLs rank in Google's top 10. The strongest predictor of AI citation across platforms is YouTube mentions (0.737 correlation), followed by web mentions (0.664) — not backlinks. Meanwhile, content freshness has become a significant signal: AI assistants cite content that is 25.7% newer than traditional search results, and 65% of AI bot crawl hits target content less than a year old. The explosive growth of AI crawlers (GPTBot up 305% YoY) makes robots.txt policy a direct lever for AI visibility.

However, this new landscape comes with important caveats. Citation accuracy across AI answer engines remains surprisingly poor (49-68%), with nearly a third of claims lacking any source backing. Citation concentration follows power-law dynamics, where the top 20 sources capture 28-67% of all citations. And LLMs exhibit strong positional bias, reliably attending to content at the beginning and end of context while ignoring the middle.

Compounding these challenges, model updates can sharply reduce citation volume. When GPT-5.3 replaced GPT-4o as ChatGPT's default, unique domains cited per response dropped 20.5% overnight — meaning brands that had achieved dynamic visibility through real-time retrieval lost it without any change on their end. This volatility reinforces the importance of parametric visibility (being embedded in training data) alongside dynamic visibility (being citable at inference time). Research into LLM parametric memory reveals that network centrality — being densely associated with high-authority brands in a model's knowledge graph — outweighs raw mention frequency. A brand that appears alongside category leaders in training data gains disproportionate visibility, even if it is mentioned less often overall. Together, these findings inform LLM Optimizer's scoring frameworks across answer optimization, video authority, Reddit authority, and search visibility analysis.

Source Papers

Lost in the Middle: How Language Models Use Long Contexts
TACL 2024
Position bias in LLM context windows — U-shaped attention curve where content at the beginning and end is reliably used while middle content is ignored.
GEO: Generative Engine Optimization
Princeton / KDD 2024
Tested 9 content optimization strategies on 10,000 queries. Quotations (+41%), statistics (+33%), and fluency (+29%) are the most effective methods for improving LLM citation visibility.
NanoKnow: Probing LLM Knowledge by Linking Training Data to Answers
2026
Training data frequency more than doubles model accuracy. Even with oracle RAG, models score ~11 points higher on questions with answers in training data.
GEO: How to Dominate AI Search — Source Preferences
U of Toronto 2025
AI search engines cite earned media 72-92% of the time vs. 18-27% for brand-owned content. AI citations overlap with Google results only 15-50%.
YouTube vs Reddit AI Citations
Adweek / Bluefish / Emberos / Goodie AI, 2025
YouTube appears in 16% of LLM answers (vs. 10% for Reddit). YouTube's social citation share doubled from 18.9% to 39.2% between August and December 2024.
News Source Citing Patterns in AI Search Systems
2025
Citation concentration and gatekeeping dynamics across 366K citations. Top 20 sources capture 28-67% of all citations (Gini 0.69-0.83).
LiveCC: Learning Video LLM with Streaming Speech Transcription
CVPR 2025
How video LLMs are trained from ASR transcripts. A 7B model trained on YouTube transcripts surpassed 72B models, proving transcript quality matters more than model size.
The False Promise of Factual and Verifiable Source-Cited Responses
2024
Citation accuracy ranges 49-68% across answer engines. 23-32% of claims have no source backing. Perplexity generates one-sided answers 83.4% of the time.
Language Models are Unsupervised Multitask Learners
OpenAI, 2019 (Radford et al.)
Introduced WebText, a dataset of 8 million Reddit posts with 3+ karma score, as the foundational training corpus for GPT-2. Demonstrated that Reddit's community curation mechanism (karma voting) effectively serves as a quality filter for large-scale language model training data.
Consent in Crisis: The Rapid Decline of the AI Data Commons
ACM FAccT 2024 (Longpre et al.)
Comprehensive audit of AI training data sources documenting Reddit's persistent prominence in Common Crawl and other web corpora. Found that robots.txt restrictions increased 25%+ from 2023-2024 as sites restricted AI crawling, while Reddit data remained broadly available through licensing agreements.
Reddit Data Licensing: Google and OpenAI Deals
Reuters / The Verge, 2024
Google pays $60M/year and OpenAI $70M/year for Reddit data access. Reddit's API was locked down in 2023. Active litigation: Reddit v. Anthropic, Reddit v. Perplexity (scraping claims).
Community Consensus as LLM Authority Signal
Bluefish Labs / Emberos Research, 2025
Reddit's multi-user validation (upvotes, comment consensus) creates credibility signals single-author content cannot match. "Best X for Y" recommendation threads are among the most influential for LLM comparison queries.
AI Overview Citations and Search Rankings
Ahrefs, 2025
76% of AI Overview citations pull from top-10 organic pages. Median organic ranking for a cited URL is position 3. 86% of citations come from within the top 100 organic results.
AI Search Overlap: How AI Citations Differ from Google
Ahrefs, 2025
Only 12% of standalone LLM citations overlap with Google's top 10. Perplexity shows 28.6% overlap. 80%+ of ChatGPT/Claude/Gemini citations come from pages not ranking in Google at all.
AI Brand Visibility Correlations (75K Brands)
Ahrefs, 2025
YouTube mentions (0.737) and web mentions (0.664) are the strongest correlates of AI visibility; brand search volume (0.334) and backlinks (0.37) trail well behind. Top 25% brands get 12x more AIO mentions.
Do AI Assistants Prefer to Cite Fresh Content?
Ahrefs, 2025 (17M citations)
AI assistants cite content 25.7% newer than traditional search. ChatGPT: avg 1,023 days old. Perplexity pulls ~50% from current year. Google AIOs counter-trend: prefer older authoritative content.
AI Brand Visibility and Content Recency
Seer Interactive, 2025
65% of AI bot crawl hits target content published within the past year. 85% of AIO citations from last 2 years. 94% from last 5 years.
Do Large Language Models Favor Recent Content?
arXiv, September 2025
LLMs consistently promote "fresh" passages. Top-10 mean publication year shifts forward by up to 4.78 years. Individual items move up to 95 ranking positions based on recency signals alone.
From Googlebot to GPTBot: Who's Crawling Your Site
Cloudflare, 2025
GPTBot grew 305% YoY. OpenAI crawl-to-referral ratio: 1,700:1. Anthropic: 73,000:1. ~21% of top-1000 sites block GPTBot. Training crawls = 80% of AI bot activity.
AI Overviews Study: 200,000 Keywords
Semrush, 2025
Reddit (40.1%) and Wikipedia (26.3%) dominate AIO citations. 80% of AIO responses target informational queries. 82% appear for keywords with <1,000 monthly searches.
ChatGPT Search Visibility: GPT-5.3/5.4 Citation Analysis
Resoneo, 2026
27,000 responses across 400 prompts over 14 weeks. After GPT-5.3 launched, unique domains cited per response dropped 20.5% (19.1 → 15.2) and unique URLs dropped 21.0% (24.1 → 19.1). Formalizes the distinction between parametric visibility (training data knowledge) and dynamic visibility (real-time web retrieval).
Brand Authority Index: Network Centrality in LLM Parametric Memory
Dejan AI / Resoneo, 2026
Queried Gemini 200,000 times across ~20 million brand mentions, building a 2.9 million-node directed association graph. Found that network centrality — being densely associated with high-authority brands — outweighs raw mention frequency for parametric visibility. A brand with zero spontaneous recall ranked highest due to dense intersections with authority brands.

Answer Optimization Scoring Framework

Each optimization report scores how likely an LLM is to surface and cite a website's answer across four research-backed dimensions.

Content Authority (30%)
Source: GEO (Princeton/KDD 2024)
Measures the presence of quotations from authoritative sources (+41% visibility), statistical evidence (+33%), source citations (+28%), fluency (+29%), and technical terminology (+19%). Penalizes keyword stuffing (-9%).
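As a rough illustration, the quotation and statistics signals can be approximated with text heuristics. A minimal sketch; the regex patterns and the four-word quote threshold are illustrative assumptions, not LLM Optimizer's actual detectors:

```python
import re

def geo_signal_counts(text: str) -> dict:
    """Count two GEO signals heuristically: quotations and statistics.

    Illustrative only: real detection would also need to verify that a
    quotation is attributed to an authoritative source.
    """
    # Direct quotations: double-quoted spans at least four words long.
    quotes = [q for q in re.findall(r'"([^"]+)"', text) if len(q.split()) >= 4]
    # Statistics: numbers carrying %, x-multipliers, or magnitude words.
    stats = re.findall(r'\b\d[\d,.]*\s*(?:%|x\b|percent\b|million\b|billion\b)',
                       text)
    return {"quotations": len(quotes), "statistics": len(stats)}

sample = ('As Liu et al. put it, "models reliably use the beginning and end '
          'of context". Citations grew 305% YoY, an 18x gap over the runner-up.')
print(geo_signal_counts(sample))
```

A real scorer would weight each detected signal by the visibility lifts reported in GEO (+41% for quotations, +33% for statistics) rather than treating counts as equal.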
Structural Optimization (20%)
Source: Lost in the Middle (TACL 2024) + GEO (Toronto 2025)
Evaluates answer prominence (front-loaded vs. buried), content conciseness, machine-readable structure (Schema.org, tables, comparison formats), and justification language that explains "why" rather than just "what."
Source Authority (30%)
Source: GEO (Toronto 2025)
Assesses third-party coverage and earned media presence. AI search engines cite earned media 72-92% of the time. Evaluates cross-engine consistency since different AI providers cite substantially different sources (similarity only 0.11-0.58).
Knowledge Persistence (20%)
Source: NanoKnow (2026)
Measures how deeply information is embedded in model training data. Answer frequency more than doubles accuracy. Content that is both in training data AND retrievable at inference compounds advantage by ~11 percentage points. Clear, educational writing outperforms natural text by 19+ points.
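The four dimensions above combine into a single report score as a weighted sum. A minimal sketch; only the 30/20/30/20 weights come from the framework, while the pillar input scores are hypothetical values on a 0-100 scale:

```python
# Answer Optimization dimension weights, as defined in the framework above.
WEIGHTS = {
    "content_authority": 0.30,
    "structural_optimization": 0.20,
    "source_authority": 0.30,
    "knowledge_persistence": 0.20,
}

def report_score(pillars: dict) -> float:
    """Weighted sum of 0-100 pillar scores; every dimension must be scored."""
    assert set(pillars) == set(WEIGHTS), "score each dimension exactly once"
    return sum(WEIGHTS[name] * value for name, value in pillars.items())

# Hypothetical site: strong earned media, weak training-data footprint.
score = report_score({
    "content_authority": 70,
    "structural_optimization": 80,
    "source_authority": 85,
    "knowledge_persistence": 40,
})
print(score)  # weighted toward the two 30% authority dimensions
```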

Video Authority Scoring Framework

Video analysis evaluates YouTube presence across four pillars, grounded in the finding that LLMs process video through transcripts, not visual content.

Transcript Authority (30%)
Source: LiveCC (CVPR 2025) + GEO (Princeton 2024)
Transcript quality is the dominant signal for LLM visibility. Evaluates keyword alignment, quotability (standalone citable statements get +41% visibility per GEO), information density, and caption availability. Videos without captions are effectively invisible to LLMs.
Topical Dominance (25%)
Source: AI Search Arena (2025) + GEO (Toronto 2025)
Measures topic coverage breadth and depth, share of voice across video content in the space, content gaps representing first-mover opportunities, and coverage depth (surface vs. in-depth treatment). Winner-take-all dynamics mean being first in a topic gap has outsized value.
Citation Network (25%)
Source: AI Search Arena (2025) + YouTube Citation Analysis (Adweek 2025)
Analyzes who mentions the brand, their authority level, and concentration risk. Top 20 sources capture 28-67% of all AI citations. A mention by a high-authority channel outweighs dozens of small-channel mentions. Human engagement metrics (views, subscribers) do not predict AI citation.
Brand Narrative Quality (20%)
Source: False Promise of Source-Cited Responses (2024) + Lost in the Middle (2024)
Evaluates sentiment, mention context and position (early mentions get priority per U-shaped attention), extractability (clear mentions are less likely to be misrepresented given 49-68% citation accuracy), and narrative coherence. Includes a confidence discount reflecting known citation inaccuracy rates.

Reddit Authority Scoring Framework

Reddit analysis evaluates community discussion across four pillars, grounded in Reddit's unique role as a multi-user validation platform for LLM training data.

Presence (25%)
Source: Reddit Training Data Analysis (2024-2025) + GEO (Toronto 2025)
Volume and breadth of brand mentions across relevant subreddits. Measures total mentions, unique subreddits reached, and mention trend over time. High presence in topic-specific subreddits carries more weight than general discussion.
Sentiment & Recommendations (25%)
Source: Community Consensus Research (Bluefish/Emberos 2025)
Community tone and recommendation strength. Evaluates positive/negative sentiment balance, recommendation rate in "best X for Y" threads, and the specific praise/criticism themes that shape LLM perception.
Competitive Positioning (25%)
Source: GEO (Toronto 2025) + Reddit Community Analysis
Head-to-head positioning against competitors in comparison threads. Measures win rate, cited differentiators, and competitor advantages not countered — these directly shape LLM comparison responses.
Training Signal Strength (25%)
Source: NanoKnow (2026) + Reddit Data Licensing (2024)
Likelihood that Reddit discussions will influence LLM training. High-upvote threads in authoritative subreddits with deep comment engagement create the strongest training signals. Reddit data is actively licensed to OpenAI and Google.

Search Visibility Scoring Framework

Search visibility analysis evaluates how search-related signals affect whether AI systems will discover, index, and cite your content — bridging traditional SEO signals with AI citation dynamics. When Brand Intelligence provides category data, a fifth pillar (Category Discovery) measures whether people searching your category — without knowing your brand — can find you.

AI Overview Readiness (30%)
Source: Ahrefs AIO Citations Study (2025) + Semrush AIO Study (2025)
76% of AI Overview citations pull from top-10 organic pages. Evaluates organic ranking presence, structured data (Schema.org, JSON-LD), content format alignment with AIO-preferred informational queries, and answer prominence (front-loaded concise answers). AIOs favor long-tail keywords — 82% appear for terms with <1,000 monthly searches.
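For the structured-data sub-signal, Schema.org markup is typically embedded as JSON-LD in the page head. A hypothetical FAQPage fragment; the question and answer text are placeholder values:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is generative engine optimization?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "GEO adapts content so AI answer engines can retrieve and cite it."
    }
  }]
}
```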
Crawl Accessibility (20%)
Source: Cloudflare AI Crawler Report (2025) + Consent in Crisis (ACM FAccT 2024)
GPTBot grew 305% YoY with a crawl-to-referral ratio of 1,700:1. Evaluates robots.txt policy for AI crawlers (GPTBot, ClaudeBot, PerplexityBot and their SearchBot variants), sitemap completeness, and render accessibility. Blocking training bots while allowing search bots is a valid strategy; blocking everything eliminates AI visibility.
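The split policy (block training bots, allow search bots) maps directly onto robots.txt user-agent groups. A sketch; the user-agent strings below follow vendor documentation at the time of writing and should be verified before deploying:

```
# Block OpenAI's training crawler but allow its search crawler.
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

# Anthropic and Perplexity follow the same pattern.
User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Allow: /
```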
Brand Search Momentum (25%)
Source: Ahrefs 75K-Brand Study (2025) + Google Trends API (2025)
Brand search volume has a 0.334 correlation with AI citation frequency — but web mentions (0.664) and YouTube mentions (0.737) are stronger. Winner-takes-all: top 25% brands average 169 AIO mentions vs. 14 for the 50th-75th percentile. Evaluates brand search trends, entity recognition, and competitive positioning.
Content Freshness (25%)
Source: Ahrefs 17M Citations Study (2025) + Seer Interactive (2025) + arXiv Recency Bias (2025)
AI assistants cite content 25.7% newer than traditional search. 65% of AI bot hits target content <1 year old. Freshness signals can move items up to 95 ranking positions in LLM reranking. Evaluates content age, update frequency, freshness signals (dates, last-modified), and content decay risk. Note: Google AIOs counter-trend, preferring older authoritative content.
Category Discovery (20%, when categories available)
Source: Brand Intelligence categories + target queries
When Brand Intelligence provides category keywords and intent queries, a fifth pillar evaluates how visible the brand is in category-level searches — queries where users search the category without knowing the brand. This measures discovery potential: does the brand appear when someone searches "best [category] tools" or "[use case] solutions"? Sub-metrics include category visibility, intent coverage, competitor gap, and discovery potential. Weights rebalance to 25/15/20/20/20 across all five pillars.
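The rebalance can be expressed as two weight tables, selected by whether category data is present. A minimal sketch using only the weights stated above:

```python
# Pillar weights with and without Brand Intelligence category data,
# per the Search Visibility framework above.
BASE = {
    "aio_readiness": 0.30,
    "crawl_accessibility": 0.20,
    "brand_search_momentum": 0.25,
    "content_freshness": 0.25,
}
WITH_CATEGORIES = {
    "aio_readiness": 0.25,
    "crawl_accessibility": 0.15,
    "brand_search_momentum": 0.20,
    "content_freshness": 0.20,
    "category_discovery": 0.20,
}

def pillar_weights(has_category_data: bool) -> dict:
    """Select the active weight table; both tables sum to 1.0."""
    return WITH_CATEGORIES if has_category_data else BASE

print(sorted(pillar_weights(True)))
```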

Key Research Findings

Quotations are the single most effective optimization method
Adding quotes from authoritative sources improves LLM visibility by 41%, more than any other technique tested on 10,000 queries. Statistics (+33%) and fluency (+29%) follow.
GEO, Princeton/KDD 2024
Lower-ranked sites benefit disproportionately
Rank-5 sites saw +115% visibility improvement from citing sources, while rank-1 sites saw -30%. Generative engines can be more democratic than traditional search for well-optimized content.
GEO, Princeton/KDD 2024
AI search overwhelmingly favors earned media
AI search engines cite independent third-party sources 72-92% of the time, compared to only 18-27% for brand-owned content and virtually 0% for social content.
GEO, Toronto 2025
Training data frequency more than doubles accuracy
Models are more than twice as accurate on questions whose answers appear frequently (51+ documents) in training data vs. rarely (1-5 documents). Being in training data AND retrievable compounds advantage.
NanoKnow, 2026
YouTube is the #1 social citation source for LLMs
YouTube's share of social citations doubled from 18.9% to 39.2% in just 5 months. It generates 18x more AI citations than Instagram and 50x more than TikTok. Views and subscriber counts do not predict AI citation.
Adweek / Bluefish / Emberos / Goodie AI, 2025
Video LLMs are trained on transcripts, not visual content
A 7B model trained on YouTube transcripts outperformed 72B models. No captions = invisible to LLMs. Transcript quality is the dominant factor, not production value.
LiveCC, CVPR 2025
Content position follows a U-shaped attention curve
LLMs reliably use content at the beginning and end of their context window but effectively ignore the middle. Front-loading key information is critical for citation.
Lost in the Middle, TACL 2024
AI citation accuracy is surprisingly poor
Perplexity achieves only 49% citation accuracy; You.com 68%; BingChat 66%. 23-32% of relevant statements have no source backing. Systems display more sources than they actually use.
False Promise of Source-Cited Responses, 2024
Reddit is the #2 social citation source and foundational training data
Reddit accounts for 10-40% of AI social citations depending on platform/timeframe. WebText (GPT-2 training) was built from 8M Reddit posts with 3+ karma. Reddit remains pervasive in Common Crawl and is actively licensed to Google ($60M/yr) and OpenAI ($70M/yr).
Multiple sources, 2024-2025
Community consensus creates unique credibility signals
Upvoted comment threads, especially "best X for Y" recommendation discussions, create multi-user validation that LLMs weight heavily. This multi-user signal cannot be replicated by single-author content.
Bluefish Labs / Emberos, 2025
AI Overviews strongly favor top-ranked pages
76% of Google AI Overview citations come from top-10 organic pages, with median cited position at rank 3. But standalone LLMs (ChatGPT, Claude, Gemini) show only 12% overlap — they cite fundamentally different sources.
Ahrefs, 2025
Web mentions outperform backlinks for AI visibility
Brand web mentions (0.664 correlation) and YouTube mentions (0.737) are far stronger predictors of AI citation than backlinks (0.37). Top 25% brands by web mentions get 12x more AI Overview mentions than the 50-75th percentile.
Ahrefs 75K Brands Study, 2025
AI assistants strongly prefer fresh content
Content cited by AI assistants is 25.7% newer on average than traditional search results. 65% of AI bot crawl hits target content less than 1 year old. Freshness signals can shift LLM ranking positions by up to 95 places.
Ahrefs (17M citations) + arXiv, 2025
AI crawlers are growing explosively
GPTBot grew 305% YoY, with OpenAI's crawl-to-referral ratio at 1,700:1. Each major AI company now runs 3 separate bots (training, indexing, user-fetch). Blocking training bots while allowing search bots is a valid strategy.
Cloudflare, 2025
Model updates can collapse citation volume overnight
When GPT-5.3 replaced GPT-4o as the default model, unique domains cited per response dropped 20.5% and unique URLs dropped 21%. The study formalizes two distinct visibility types: parametric (stable, from training data) and dynamic (volatile, from real-time retrieval) — and shows that model updates can sharply reduce the latter.
Resoneo, 2026
Brand network centrality outweighs raw mention frequency
A 200,000-query study of LLM parametric memory found that brands densely associated with high-authority peers rank higher than brands with more raw mentions. A brand with zero spontaneous recall ranked #1 because of its network position among luxury category leaders. Being mentioned alongside the right brands matters more than being mentioned often.
Dejan AI / Resoneo, 2026

Put this research to work

LLM Optimizer applies these research findings automatically to analyze and optimize your brand's visibility across AI search engines.

LLM Optimizer is open-source (MIT). Our hosted version supports ongoing development.