Generative Engine Optimization (GEO): how to get cited in ChatGPT, Claude, and Perplexity

Alexey Yushkin, Founder, GENERAL INFORMATICS | 2 min read

AI engines cite pages that lead with a direct quotable answer, use question-style headings, include FAQPage and Article structured data, and provide an llms.txt index. Classic SEO still matters because the engines crawl the open web. The citable structure is different from what traditional SEO optimizes for, and most pages getting cited in 2026 were not built with GEO in mind. They just happen to be well structured.

This article is the implementation-level checklist we use when building sites that should be citable in ChatGPT, Claude, Perplexity, and Google AI Overviews. None of it is theoretical. We use this on geninfos.com, and you can view source on any of the article pages to see the structure.

The eight things that move the needle

The items below are listed in order of impact, based on what is observable when you study which pages actually get cited.

1. A direct, quotable answer in the first paragraph

The first two to three sentences of the article should answer the implied search query in a way that stands alone if quoted. AI engines lift these openings verbatim or near-verbatim. If your opener is "In this article we will explore the topic of...", the engine has nothing to quote.

The fix is brutally simple. Write the answer first. Save the introduction for later. If the page is "How to deduct home office expenses," the first paragraph names the IRS form, states the percentage formula, and lists the eligibility criteria. Period. Then expand below.

2. Question-style H2 headings

H2s that match the way people phrase queries get cited more than H2s that read like marketing taglines. "How long does a Massachusetts building permit take?" gets cited. "Permit timelines, unpacked" does not. Engines do entity matching on heading text and use H2s as candidate Q&A pairs.

This is a rewrite, not a redesign. Look at your existing H2s and ask whether each one answers a question. If it does not, rewrite it as a question or as a direct answer.
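The audit described above can be partly automated. The following is a minimal sketch, assuming the page HTML is available as a string; the class and function names are hypothetical, and the "ends with a question mark" test is only a rough heuristic. A heading phrased as a direct answer will be flagged too, so treat the output as a list to review by hand.

```python
from html.parser import HTMLParser


class H2Collector(HTMLParser):
    """Collects the text content of every <h2> element."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        # Text inside nested tags (links, spans) is appended as well.
        if self.in_h2:
            self.headings[-1] += data


def flag_non_question_h2s(html: str) -> list[str]:
    """Return H2 texts that do not end with a question mark."""
    collector = H2Collector()
    collector.feed(html)
    return [h.strip() for h in collector.headings if not h.strip().endswith("?")]


sample = (
    "<h2>How long does a Massachusetts building permit take?</h2>"
    "<h2>Permit timelines, unpacked</h2>"
)
print(flag_non_question_h2s(sample))
```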

3. FAQPage structured data

Schema.org FAQPage markup is the highest-impact piece of structured data for GEO right now. The engines treat each Q&A pair as an isolated, citable unit. We put 3 to 6 real questions at the bottom of every article, marked up as FAQPage JSON-LD.

The catch: the questions and answers must be visible on the page, not just in the schema. Hidden FAQPage schema is against Google's structured data guidelines and can earn a manual action.
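As an illustration, here is a minimal sketch of generating FAQPage JSON-LD from question and answer pairs, plus a check that each question also appears in the visible page text. The function names and the sample pair are hypothetical. The property names (mainEntity, Question, acceptedAnswer, Answer) follow the Schema.org vocabulary.

```python
import json


def faq_jsonld_script(pairs: list[tuple[str, str]]) -> str:
    """Build a FAQPage JSON-LD <script> tag from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # Escape "</" so answer text cannot close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


def questions_missing_from_page(pairs, visible_text: str) -> list[str]:
    """Return questions that are in the schema but not in the visible text."""
    return [question for question, _ in pairs if question not in visible_text]


# Hypothetical sample pair, for illustration only.
pairs = [("What is llms.txt?", "A proposed markdown index at the root of a domain.")]
print(faq_jsonld_script(pairs))
```

Running questions_missing_from_page at build time and failing the build on a non-empty result is one way to keep the schema and the visible FAQ in sync.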

4. Article (or BlogPosting) structured data with citations

For long-form content, use BlogPosting markup with author, datePublished, dateModified, and (critically) a citation array of the sources you cite. The citation field is underused. It tells engines that this article references other authoritative sources, which both validates the content and gives the engine a graph of related material.

We bake citation arrays into every article on this site, generated from the article's frontmatter sources field. View source on this page and search for "citation" to see it.
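A minimal sketch of that mapping, assuming frontmatter has already been parsed into a dictionary. The frontmatter key names (title, author, published, modified, and the shape of each sources entry) and the dates are hypothetical placeholders, not the site's actual schema. The output properties (headline, author, datePublished, dateModified, citation) are Schema.org properties.

```python
import json


def blogposting_jsonld(frontmatter: dict) -> dict:
    """Map parsed article frontmatter to a BlogPosting JSON-LD object."""
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": frontmatter["title"],
        "author": {"@type": "Person", "name": frontmatter["author"]},
        "datePublished": frontmatter["published"],
        "dateModified": frontmatter["modified"],
        # One CreativeWork per cited source; empty list if none are declared.
        "citation": [
            {"@type": "CreativeWork", "name": src["name"], "url": src["url"]}
            for src in frontmatter.get("sources", [])
        ],
    }


# Hypothetical frontmatter; the dates are placeholders.
frontmatter = {
    "title": "Generative Engine Optimization (GEO)",
    "author": "Alexey Yushkin",
    "published": "2026-01-01",
    "modified": "2026-01-01",
    "sources": [
        {"name": "Schema.org documentation", "url": "https://schema.org/"},
        {"name": "llms.txt proposal and specification", "url": "https://llmstxt.org/"},
    ],
}
print(json.dumps(blogposting_jsonld(frontmatter), indent=2))
```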

5. BreadcrumbList on every page

BreadcrumbList JSON-LD lets engines understand site hierarchy. It is also a Google rich result. Tiny effort, real lift. Every page on this site other than the homepage emits BreadcrumbList, including this one.
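For reference, a minimal sketch of building BreadcrumbList JSON-LD from an ordered trail of (name, URL) pairs. The function name and the example.com URLs are hypothetical; the ListItem fields (position, name, item) are the ones Schema.org defines, with positions starting at 1.

```python
import json


def breadcrumb_jsonld(trail: list[tuple[str, str]]) -> dict:
    """Build BreadcrumbList JSON-LD from an ordered list of (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical trail from the homepage down to an article page.
trail = [
    ("Home", "https://example.com/"),
    ("Articles", "https://example.com/articles/"),
    ("Generative Engine Optimization", "https://example.com/articles/geo/"),
]
print(json.dumps(breadcrumb_jsonld(trail), indent=2))
```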

6. An llms.txt at the site root

llms.txt is a proposed markdown index at the root of a domain, structured to make the site legible to LLMs. No engine has publicly committed to using it as a ranking signal. Anthropic has hinted Claude reads it in some search modes. Adoption will likely grow.

It costs almost nothing. Generate it at build time, listing your services, articles, and proof projects in clean markdown. Ours lives at /llms.txt and is regenerated on every deploy. Adding this is a 30-minute task with zero downside.
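A minimal sketch of a build-time generator, assuming the layout described in the llms.txt proposal: an H1 with the site name, a blockquote summary, then H2 sections containing markdown link lists. The function name, section names, and example.com URLs are hypothetical.

```python
def build_llms_txt(site_name: str, summary: str, sections: dict) -> str:
    """Render an llms.txt document as markdown text.

    sections maps a heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site content, for illustration only.
sections = {
    "Services": [
        ("Workflow review", "https://example.com/services/review", "What to automate first"),
    ],
    "Articles": [
        ("GEO checklist", "https://example.com/articles/geo", "How to get cited by AI engines"),
    ],
}
llms_txt = build_llms_txt("Example Site", "AI and automation systems for operators.", sections)
print(llms_txt)
```

Writing the returned string to llms.txt in the build output directory on every deploy keeps the index current without manual upkeep.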

7. Citable specifics, not vague claims

Engines weight pages with concrete specifics higher than pages with vague claims. "Most small businesses see automation pay back in 3 to 6 months" is citable. "Automation typically pays back fairly quickly" is not. Use real numbers, named tools, real form numbers, real prices. The same content can be either citable or not depending on the writing.

This is the rewrite that pays for itself faster than any of the schema work. Comb through existing pages and replace vague claims with specifics, even if the specifics are ranges.
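A first pass over existing copy can be scripted. This is a rough sketch, assuming that a sentence containing a hedge word and no digits is worth a second look; the word list and function name are hypothetical, and the check will both miss vague sentences and flag some acceptable ones.

```python
import re

# Hypothetical starter list; extend it with the filler words your own copy uses.
VAGUE_WORDS = {
    "typically", "fairly", "quickly", "often", "generally",
    "usually", "significantly", "various", "numerous",
}


def flag_vague_sentences(text: str) -> list[str]:
    """Return sentences that contain a vague word and no digits."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        has_number = re.search(r"\d", sentence) is not None
        if words & VAGUE_WORDS and not has_number:
            flagged.append(sentence)
    return flagged


sample_copy = (
    "Most small businesses see automation pay back in 3 to 6 months. "
    "Automation typically pays back fairly quickly."
)
print(flag_vague_sentences(sample_copy))
```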

8. Sources and citations visible at the bottom

A "Sources" or "Citations" section at the end of an article, with real links to authoritative pages you actually used, does two things. It strengthens E-E-A-T signals for classic SEO. It gives AI engines a graph of validated references that the engine can use to weight your page's credibility.

Always link to the primary source, not a summary. If the data is from the IRS, link to irs.gov. If it is from Schema.org, link to schema.org. Wikipedia is acceptable as a starting point but not as a primary source.

What does not move the needle as much as people think

Word count. AI engines do not reward length the way classic SEO sometimes did. A 1500-word article that answers the question precisely beats a 3500-word article with a lot of padding. Stop optimizing for length.

Keyword density. Hammering the keyword into the page does not improve citation rate. The model is reading meaning, not counting tokens. Write naturally about the topic.

Domain authority for citations. As noted in the FAQ, AI engines synthesize from multiple sources and will quote a specific page on a small site alongside a Wikipedia article if the specific page has the better quote. Big-brand sites still have an advantage in classic SEO. The gap is narrower in GEO.

How to measure it

Tracking GEO is harder than tracking SEO. There is no Google Search Console for ChatGPT citations. We use three signals.

One, direct queries to the major engines on topics where we should be cited, run periodically, with the citations recorded. Manual but tractable for a small site.

Two, referral traffic from chatgpt.com (formerly chat.openai.com), perplexity.ai, gemini.google.com, and similar in analytics. Low-volume but the trendline is meaningful.

Three, brand mention monitoring across these engines. A brand mention without a link never shows up in referral logs, but it still indicates that the engine is drawing on your content.

None of this is as clean as Search Console. Treat it as a separate channel with its own measurement, not as a sub-metric of SEO.
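The referral signal is the easiest of the three to script. Below is a minimal sketch, assuming you can export referrer URLs from your analytics tool as a list of strings. The host-to-engine mapping is an assumption built from the domains named above plus chatgpt.com, the domain ChatGPT currently uses; extend it as engines change their hostnames.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed mapping of referrer hosts to engines; not an exhaustive list.
AI_REFERRER_HOSTS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
}


def count_ai_referrals(referrer_urls: list[str]) -> Counter:
    """Count visits per AI engine from a list of referrer URLs."""
    counts = Counter()
    for url in referrer_urls:
        # hostname is None for empty or malformed referrers.
        host = (urlparse(url).hostname or "").removeprefix("www.")
        engine = AI_REFERRER_HOSTS.get(host)
        if engine:
            counts[engine] += 1
    return counts


# Hypothetical referrers, for illustration only.
referrers = [
    "https://chatgpt.com/",
    "https://chat.openai.com/",
    "https://www.perplexity.ai/search?q=geo",
    "https://www.google.com/",
    "",
]
print(count_ai_referrals(referrers))
```

Running this on a weekly export and charting the totals gives the trendline without depending on any one analytics vendor. Note that str.removeprefix requires Python 3.9 or later.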

A worked example

Look at the source of this page. The article hits every checklist item. Question-style H2s. Quotable opener. FAQPage schema rendered from the faq field of the article's frontmatter. BlogPosting schema with citation array. BreadcrumbList. Linked sources at the bottom. Citable specifics throughout.

The site's article infrastructure is a Vite + MDX pipeline that bakes all of this into static HTML at build time, so AI engines see the full structured content without executing JavaScript. The same MDX pipeline and authoring spec drive every article on this site.

Where to start

If you are reading this with one article on your site, do these four things in order. Rewrite the opener to be a direct, quotable answer. Convert H2s to question form where possible. Add FAQPage schema with 3 to 6 real questions and answers. Add a sources section at the bottom with real links.

Those four changes, on the pages that already get traffic, will move citation rate faster than any new content you can publish. If you want help applying this to a site, we run a free SEO and GEO review for operators who want a clear punch list before they commit to the work.

Frequently Asked Questions

SOURCES & CITATIONS

  1. Schema.org documentation: FAQPage, Article, BreadcrumbList. Schema.org. https://schema.org/
  2. llms.txt proposal and specification. llmstxt.org. https://llmstxt.org/

About Alexey Yushkin

Alexey is the founder of GENERAL INFORMATICS LLC. He designs and ships AI and automation systems for small businesses and operators across the US.
