Citevera methodology

How we score AI search readiness

The Citevera headline overall score is a weighted rollup of a five-stage AI decision funnel. Each stage score reflects how well the audit items in that stage answer the question the stage asks. The headline is the funnel; the funnel bars on your report are the inputs.

The five stages

The stages are listed in funnel order, each with the rollup weight it contributes to the headline. The weights were chosen by calibrating against N=50 production scans (2026-04-26); see the cutover note below.

Detection

weight 30%

Can AI find and identify your site at all? Robots.txt access for GPTBot, ClaudeBot, PerplexityBot, and Gemini. Presence of llms.txt and llms-full.txt. Sitemap reachability. Crawl-time response codes. Detection is the first stage of the funnel because nothing downstream matters if AI cannot reach the page.
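To make the robots.txt part of this stage concrete, here is a minimal sketch of a crawler-access check using Python's standard urllib.robotparser. It is an illustration, not the Citevera scanner. The user-agent tokens are assumptions (in particular, "Google-Extended" is used here as the robots.txt token governing Gemini), and the sample file and URL are made up.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens; verify against each vendor's crawler documentation.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def crawler_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Return, per crawler token, whether robots_txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}


# Hypothetical robots.txt that blocks one AI crawler and allows everyone else.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

result = crawler_access(SAMPLE_ROBOTS, "https://example.com/pricing")
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With this sample file, GPTBot is blocked and the other three tokens fall through to the wildcard rule and are allowed.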

Understanding

weight 25%

Once AI fetches the page, can it parse what is there? Heading hierarchy, schema.org coverage (Organization, Person, FAQPage, HowTo, Article, Product), meta tags, content extractability. The audit prompt asks specifically whether the structure makes the topic explicit, not whether the page exists.
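As an illustration of the schema.org coverage check, the sketch below collects the @type values declared in a page's JSON-LD blocks and compares them with the six types listed above. It uses only the Python standard library and ignores Microdata and RDFa, which is a simplification. The class name, function name, and sample page are hypothetical, not part of the Citevera audit.

```python
import json
from html.parser import HTMLParser

EXPECTED_TYPES = {"Organization", "Person", "FAQPage", "HowTo", "Article", "Product"}


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def schema_types(html: str) -> set[str]:
    """Return every @type string found anywhere in the page's JSON-LD."""
    collector = JsonLdCollector()
    collector.feed(html)
    found: set[str] = set()
    for block in collector.blocks:
        try:
            stack = [json.loads(block)]
        except json.JSONDecodeError:
            continue  # Malformed JSON-LD contributes nothing.
        while stack:
            node = stack.pop()
            if isinstance(node, dict):
                declared = node.get("@type")
                if isinstance(declared, str):
                    found.add(declared)
                elif isinstance(declared, list):
                    found.update(t for t in declared if isinstance(t, str))
                stack.extend(node.values())
            elif isinstance(node, list):
                stack.extend(node)
    return found


# Hypothetical page with one nested JSON-LD block.
PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co",
 "founder": {"@type": "Person", "name": "A. Founder"}}
</script>
</head><body><h1>Example</h1></body></html>"""

present = schema_types(PAGE) & EXPECTED_TYPES
missing = EXPECTED_TYPES - present
print("present:", sorted(present))
print("missing:", sorted(missing))
```

For the sample page, Organization and Person are present and the other four types are reported as missing.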

Coverage

weight 20%

Does the site demonstrate enough breadth on the topic that AI engines treat it as authoritative on the subject? Number of pages per topic cluster, internal-link structure, hub-and-spoke organization, content depth on the primary topic. The 33x rule: sites with 50+ blog posts average 33x the AI crawler traffic of sites with no blog.

Conversion

weight 15%

When an AI engine refers a visitor to the site, can that visitor act? Working CTAs, visible pricing or contact paths, schema-marked Products or Services with Offers, no broken forms. Conversion is downstream of detection and understanding - if AI never sends the visitor, the conversion path does not matter, but if it does, this is where the funnel either closes or leaks.

Trust

weight 10%

Does the site carry enough trust signal that AI engines treat it as a credible source rather than just a relevant one? sameAs linkage to LinkedIn, Crunchbase, Wikidata; Organization schema; Review, Rating, and Testimonial markup; verifiable identity. Trust is weighted at 10% to reflect a structural-coverage gap in the current trust scorer (only tag and id substring hits contribute weight; categories do not), not because trust matters least. This weighting will be revisited when the trust audit revision lands.
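Taken together, the five weights sum to 100%, so the headline can be sketched as a plain weighted sum of the stage scores. The snippet below is a minimal illustration under that reading. The stage keys, the function name, and the example scores are hypothetical, and whether the production rollup rounds the result is not stated here.

```python
# Rollup weights as listed for the five stages above.
STAGE_WEIGHTS = {
    "detection": 0.30,
    "understanding": 0.25,
    "coverage": 0.20,
    "conversion": 0.15,
    "trust": 0.10,
}


def headline_score(stage_scores: dict[str, float]) -> float:
    """Weighted sum of the five 0-100 stage scores."""
    missing = STAGE_WEIGHTS.keys() - stage_scores.keys()
    if missing:
        raise ValueError(f"missing stage scores: {sorted(missing)}")
    return sum(weight * stage_scores[stage] for stage, weight in STAGE_WEIGHTS.items())


# Hypothetical stage scores for one scan.
example = {
    "detection": 80,
    "understanding": 60,
    "coverage": 50,
    "conversion": 40,
    "trust": 20,
}
headline = headline_score(example)
print(round(headline))
```

For these example scores the contributions are 24 + 15 + 10 + 6 + 2, which gives a headline of 57.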

Methodology change - 2026-04-28

Why your headline number changed

On 2026-04-28 we replaced the old headline formula with the funnel rollup described above. The old formula was a weighted average of three holistic axis ratings (40% AEO readiness, 40% GEO readiness, 20% crawlability) the audit model emitted alongside the audit items. We discovered those axis ratings ran systematically optimistic compared to the audit items themselves - across a calibration sample of 50 production scans, 76% of headlines moved by more than 5 points and 64% moved by more than 20 points when re-derived from the audit items via the funnel rollup.
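For comparison, the retired formula can be sketched the same way, along with the kind of movement statistic quoted above. The axis ratings and the (old, new) pairs below are made-up illustrations, not calibration data, and the function names are hypothetical.

```python
def legacy_headline(aeo: float, geo: float, crawlability: float) -> float:
    """Old headline: 40% AEO readiness, 40% GEO readiness, 20% crawlability."""
    return 0.40 * aeo + 0.40 * geo + 0.20 * crawlability


def share_moved(pairs: list[tuple[float, float]], threshold: float) -> float:
    """Fraction of (old, new) headline pairs that differ by more than threshold."""
    moved = sum(1 for old, new in pairs if abs(old - new) > threshold)
    return moved / len(pairs)


# Illustrative axis ratings for one scan.
old = legacy_headline(aeo=85, geo=80, crawlability=90)

# Illustrative (old headline, funnel headline) pairs for four scans.
ILLUSTRATIVE_PAIRS = [(old, 57.0), (70.0, 68.0), (62.0, 50.0), (90.0, 66.0)]

print(round(old))
print(share_moved(ILLUSTRATIVE_PAIRS, 5))
print(share_moved(ILLUSTRATIVE_PAIRS, 20))
```

In this toy sample the legacy headline for the first scan is 84, three of the four scans move by more than 5 points, and two move by more than 20.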

The audit items are the ground truth - they are the concrete present, weak, and missing signals the audit surfaces, and they are what we ship as fixes. Trusting the holistic axis ratings over the audit items meant the headline was disagreeing with itself. The new formula reads the audit items directly so the headline reflects what the audit actually says.

Historical scans keep their stored headline so trend charts before the cutover are unchanged. Scans run on or after 2026-04-28 use the new formula. Trend charts render a small dotted boundary where the formula transitions, so a March-vs-May comparison does not silently mix two formulas. The audit items, fix recommendations, and per-stage funnel bars are unchanged - only the headline math moved.
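The cutover rule amounts to a date comparison. A minimal sketch, assuming scan dates are compared as calendar dates (time zone handling is not specified here) and using hypothetical function names:

```python
from datetime import date

CUTOVER = date(2026, 4, 28)


def uses_funnel_formula(scan_date: date) -> bool:
    """Scans run on or after the cutover use the funnel rollup."""
    return scan_date >= CUTOVER


def crosses_cutover(range_start: date, range_end: date) -> bool:
    """True when a trend range mixes both formulas and needs the boundary marker."""
    return range_start < CUTOVER <= range_end


print(uses_funnel_formula(date(2026, 4, 27)))
print(uses_funnel_formula(date(2026, 4, 28)))
print(crosses_cutover(date(2026, 3, 1), date(2026, 5, 1)))
```

A scan dated 2026-04-27 keeps its stored headline, a scan dated 2026-04-28 uses the new formula, and a March-to-May chart crosses the boundary.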

Re-tuning cadence

We re-audit the weights when any of the following holds: the trust audit revision lands (trust at 10% reflects a current structural-coverage gap, not a philosophical claim); the production scan population exceeds ~200 complete scans; the audit prompt changes meaningfully (new categories, revised severity rubric); or six months have elapsed since the last calibration. The re-tune playbook lives in the repository at scripts/calibration/README.md.
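The four triggers can be read as an any-of check. The sketch below is one hypothetical way to encode them. It treats "~200" as a hard threshold of 200 and "six months" as 183 days, both of which are assumptions, and the field names are invented for illustration rather than taken from the calibration playbook.

```python
from dataclasses import dataclass
from datetime import date, timedelta

SCAN_THRESHOLD = 200                      # assumption: "~200" read as exactly 200
RECALIBRATION_AGE = timedelta(days=183)   # assumption: "six months" read as 183 days


@dataclass
class CalibrationState:
    last_calibration: date
    trust_revision_landed: bool
    complete_scans: int
    audit_prompt_changed: bool


def retune_reasons(state: CalibrationState, today: date) -> list[str]:
    """Return every trigger that currently holds; an empty list means no re-tune."""
    reasons = []
    if state.trust_revision_landed:
        reasons.append("trust audit revision landed")
    if state.complete_scans > SCAN_THRESHOLD:
        reasons.append("scan population exceeds threshold")
    if state.audit_prompt_changed:
        reasons.append("audit prompt changed")
    if today - state.last_calibration >= RECALIBRATION_AGE:
        reasons.append("six months since last calibration")
    return reasons


state = CalibrationState(
    last_calibration=date(2026, 4, 26),
    trust_revision_landed=False,
    complete_scans=212,
    audit_prompt_changed=False,
)
print(retune_reasons(state, today=date(2026, 6, 1)))
```

Returning the list of reasons rather than a single boolean keeps the re-tune decision explainable when more than one trigger fires at once.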

Per-stage status bands

Each stage carries a qualitative status label alongside its 0-100 number, derived from the band the score falls into. The same bands apply to the headline overall score.

86 to 100: dominant - the strongest possible reading on this stage.
71 to 85: competitive - this stage is working in your favor.
51 to 70: understandable - present but with room to push higher.
21 to 50: weak - real gaps are surfacing here.
0 to 20: not usable - the audit items in this stage describe a critical absence.
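The bands above map directly onto a small lookup. The sketch below assumes scores are rounded to whole points before banding, since the bands are stated as integer ranges; that rounding step and the function name are assumptions.

```python
def status_band(score: float) -> str:
    """Map a 0-100 score to its status label using the bands listed above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    points = round(score)  # assumption: band on whole points
    if points >= 86:
        return "dominant"
    if points >= 71:
        return "competitive"
    if points >= 51:
        return "understandable"
    if points >= 21:
        return "weak"
    return "not usable"


for sample in (100, 85, 57, 21, 20):
    print(sample, status_band(sample))
```

For example, a headline of 57 falls in the 51 to 70 band and reads as understandable.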

Questions or pushback? Email the team. The audit and the methodology are auditable; we want the disagreements.