Published: 25 Mar 2026 | Updated: 1 Apr 2026 | 8 min read | By Bump Research Team

How AI platforms choose which businesses to cite

In short: AI platforms like ChatGPT and Perplexity tend to cite businesses that have structured data, authoritative content, a steady flow of recent reviews, and an llms.txt file. In our analysis, businesses with comprehensive Schema.org markup appeared significantly more often in AI-generated recommendations. That fits the broader finding of the Princeton-led GEO study (KDD 2024) that the way content is presented changes how often generative engines cite it.

The black box, illuminated

When someone asks ChatGPT “who is the best estate agent in Chelsea?”, the model doesn’t search the web in real time unless a search tool is enabled. Instead, it draws on patterns from its training data and on any retrieval-augmented generation (RAG) system connected to it.

This means the information that was available, structured, and authoritative when the model was trained — or when its search tool crawled the web — determines who gets cited.

What we found: the citation factors

After analysing thousands of AI-generated responses about local businesses, clear patterns emerged. The businesses that get cited most consistently share several traits:

Structured data presence: Businesses with comprehensive Schema.org markup (LocalBusiness, Review, FAQPage) appear significantly more often in AI citations.

llms.txt files: This proposed standard, a Markdown file at the root of your site that gives AI systems a concise summary of your business and links to your key pages, is already being adopted by forward-thinking businesses.

Review volume and recency: AI platforms weight recent, diverse reviews heavily. A business with 200 reviews from the last 6 months will typically outperform one with 500 reviews whose most recent entry is 2 years old.

Content depth: Businesses with detailed, expert content on their websites — area guides, market reports, buyer guides — are cited more frequently than those with thin marketing copy.
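To make the structured data trait above concrete, here is a minimal sketch in Python that produces a Schema.org LocalBusiness block as JSON-LD, ready to paste into a page's head. The function name, the business details, and the example.com URL are all hypothetical placeholders, and the properties shown are a small subset of what Schema.org allows, so treat it as a starting point and not a complete markup.

```python
import json


def build_local_business_jsonld(name, url, street, locality, postcode,
                                rating, review_count):
    """Return a <script> tag holding Schema.org LocalBusiness JSON-LD.

    Hypothetical helper for illustration; only a few properties are covered.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": locality,
            "postalCode": postcode,
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating,
            "reviewCount": review_count,
        },
    }
    # Escape "</" so a stray "</script>" in the data cannot close the tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values only: swap in your own business details.
print(build_local_business_jsonld(
    name="Example Estates",
    url="https://example.com",
    street="1 Example Street",
    locality="London",
    postcode="SW3 0AA",
    rating=4.7,
    review_count=200,
))
```

Generating the block from one source of truth (a CMS record, say) keeps the markup in step with what the page actually says, which matters because the rating and review count in the markup should match what visitors can see.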

The formatting factor

One surprising finding: how your content is formatted matters almost as much as what it says. AI models parse well-structured HTML more effectively than walls of text. Use clear heading hierarchies (H1 → H2 → H3), bullet points for lists, and FAQ sections with proper Schema markup.

Pages that answer specific questions directly (“What is the average house price in Fulham?”) are more likely to be quoted verbatim by AI platforms.
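The FAQ advice above can be sketched in a few lines of Python: take the question-and-answer pairs already visible on the page and turn them into Schema.org FAQPage markup. The helper name is hypothetical and the answer text is a placeholder, not a real figure.

```python
import json


def build_faq_jsonld(qa_pairs):
    """Build a Schema.org FAQPage dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


# Placeholder answer: replace it with your own sourced, dated figure.
faq = build_faq_jsonld([
    (
        "What is the average house price in Fulham?",
        "Replace this with a direct one-sentence answer and its source.",
    ),
])
print(json.dumps(faq, indent=2))
```

Keeping each answer short and self-contained serves both goals at once: it is easy for a person to scan and easy for a model to quote.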

Building your citation profile

Think of your AI citation profile as a new form of reputation. It’s built from the same raw materials as traditional SEO — content, reviews, structured data — but optimised for a fundamentally different audience. Instead of a crawler indexing keywords, you’re building a knowledge base that an AI model can understand, trust, and recommend.

Start by auditing your current visibility. A Bump scan checks all of these factors and gives you a prioritised action plan.
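For a rough first look before a full scan, two of these factors can be checked locally with nothing but Python's standard library. The sketch below is not how a Bump scan works; it is a hypothetical, deliberately small check that lists the JSON-LD types found in a page's HTML and flags heading hierarchy problems. It ignores reviews, content depth, llms.txt, and JSON-LD nested under @graph.

```python
import json
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collect JSON-LD @type values and heading levels from an HTML page."""

    def __init__(self):
        super().__init__()
        self.schema_types = []
        self.heading_levels = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []
        elif tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.heading_levels.append(int(tag[1]))

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # malformed JSON-LD is skipped rather than raised
            items = parsed if isinstance(parsed, list) else [parsed]
            for item in items:
                if isinstance(item, dict) and "@type" in item:
                    self.schema_types.append(item["@type"])


def audit_html(html):
    """Return a small report on structured data and heading structure."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    levels = parser.heading_levels
    # A level is skipped when a heading jumps more than one step deeper (H1 to H3).
    skipped = any(b - a > 1 for a, b in zip(levels, levels[1:]))
    return {
        "schema_types": parser.schema_types,
        "has_single_h1": levels.count(1) == 1,
        "skips_heading_level": skipped,
    }
```

Pass it the HTML of a saved page, for example `audit_html(open("index.html", encoding="utf-8").read())`, and compare the reported types against the LocalBusiness, Review, and FAQPage markup discussed earlier.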

Frequently asked questions

How does ChatGPT choose which businesses to recommend?

ChatGPT draws on patterns from its training data and retrieval-augmented generation systems. Businesses with structured data (Schema.org markup), deep authoritative content, recent diverse reviews, and llms.txt files are cited significantly more often.

What is llms.txt?

llms.txt is a Markdown file placed at the root of your website (at /llms.txt) that gives AI systems a concise summary of your business and links to your most useful pages. It is a proposed standard, not yet a formal one, that forward-thinking businesses are adopting to increase AI discoverability.
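As an illustration, the Python sketch below renders a file in the general shape the llms.txt proposal describes: a title, a one-line summary as a blockquote, then sections of annotated links. The helper name, business name, and URLs are hypothetical, and because the proposal is still evolving it is worth checking the current specification before publishing.

```python
def build_llms_txt(name, summary, sections):
    """Render llms.txt-style Markdown.

    sections maps a section title to a list of (link_title, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for title, links in sections.items():
        lines.append(f"## {title}")
        lines.append("")
        for link_title, url, note in links:
            lines.append(f"- [{link_title}]({url}): {note}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content: every name and URL here is made up for the example.
print(build_llms_txt(
    name="Example Estates",
    summary="Hypothetical independent estate agent covering Chelsea and Fulham.",
    sections={
        "Guides": [
            ("Chelsea area guide", "https://example.com/guides/chelsea",
             "Neighbourhood overview for buyers"),
        ],
        "Company": [
            ("Contact", "https://example.com/contact",
             "Office address and opening hours"),
        ],
    },
))
```

Save the output as llms.txt in the site root so that it is served at /llms.txt.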

Does Schema markup help with AI visibility?

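Yes. In our analysis, businesses with comprehensive Schema.org markup (LocalBusiness, Review, FAQPage) appeared significantly more often in AI citations. The Princeton-led GEO study (KDD 2024) did not test Schema.org markup directly, but it found that content-level changes such as adding citations, quotations, and statistics raised visibility in generative engine responses by up to about 40%.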
