The Citevera AEO Score Explained: What Each Axis Measures and Why
A breakdown of the seven axes Citevera measures, how each is weighted, what signals roll up into each, and how the composite score maps to real citation outcomes.
Why a composite score
Most AEO advice presents a long list of things to fix. A composite score does something different: it tells you whether the things you are fixing actually move citation outcomes, in what proportion, and where to look first.
The Citevera AEO score is calibrated against citation behavior. We measured citation outcomes across thousands of pages and worked backward to identify which structural signals best predicted whether a page would be cited. The score is the weighted sum of those signals.
The mapping is not deterministic - citation has stochastic elements - but it is strong enough that score improvements reliably precede citation lift. Score moves first; citation behavior follows.
The seven axes
The score breaks into seven axes, each covering a distinct cluster of signals. A sketch of how the axes combine into the composite follows the list.
Crawlability (15%) - can AI crawlers reach the page. Robots.txt rules, server-side rendering, response status, sitemap presence, indexability flags.
Direct answer (20%) - is the question's answer present in the first 150 words. Heading-to-content alignment, opening-paragraph extractability, list-and-table density.
Schema (18%) - structured data completeness. FAQPage, HowTo, Article, BlogPosting, Person, Organization, BreadcrumbList. Validity, completeness, alignment with content.
Topical depth (12%) - cluster strength. Page count on the topic, internal linking density, entity coverage, term coverage.
Authority (12%) - source-credibility signals. Author markup with credentials, organization markup, sameAs links, dates, third-party citation patterns.
Freshness (12%) - recency signals. dateModified accuracy, content cadence, stale-content ratio.
Citation eligibility (11%) - structural cleanliness. Markup validity, no broken schema, content type matches query type, license-clean.
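To make the weighted-sum mechanics concrete, here is a minimal sketch of how per-axis scores combine into the composite. The weights are the ones listed above; the page_scores values are hypothetical, and this is an illustration of the arithmetic, not Citevera's actual implementation.

```python
# Illustrative only: combines hypothetical per-axis scores (0-100)
# into a composite using the published weights. Not Citevera's code.

WEIGHTS = {
    "crawlability": 0.15,
    "direct_answer": 0.20,
    "schema": 0.18,
    "topical_depth": 0.12,
    "authority": 0.12,
    "freshness": 0.12,
    "citation_eligibility": 0.11,
}
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights sum to 100%

def composite_score(axis_scores: dict[str, float]) -> float:
    """Weighted sum of per-axis scores, each on a 0-100 scale."""
    return sum(WEIGHTS[axis] * axis_scores[axis] for axis in WEIGHTS)

# Hypothetical page: strong schema, weak direct answer.
page_scores = {
    "crawlability": 90,
    "direct_answer": 45,
    "schema": 80,
    "topical_depth": 70,
    "authority": 65,
    "freshness": 60,
    "citation_eligibility": 85,
}
print(round(composite_score(page_scores), 2))  # 69.65
```

Note how a weak direct-answer score drags the composite down more than an equally weak freshness score would; that asymmetry is the point of the weighting.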
How the weights were chosen
The weights are not opinions. They reflect regression results: the proportion of citation-outcome variance each axis explains in our calibration data.
Direct answer (20%) is the largest because it most directly predicts citation. A page that fails on direct-answer fails on citation, regardless of how strong the other axes are.
Crawlability (15%) is gating. A page that is perfect on every other axis but blocked by robots.txt has a citation rate of zero. The weight is high because the failure mode is binary.
Schema (18%) is high because schema-bearing pages cite at meaningfully higher rates across all engines. The signal is strongest on Claude and AI Overviews, weaker on Perplexity, but consistent enough to weight heavily.
Topical depth, authority, and freshness all sit at 12% because they each move citation by similar amounts in our data. They compound with each other but each alone has a smaller effect than the top three.
Citation eligibility (11%) catches the structural-hygiene issues that block citation even when other axes are strong. It is small in weight because most sites pass it; large in impact when violated.
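For intuition on what "the proportion of citation-outcome variance each axis explains" means in practice, here is a toy sketch of one way such weights could fall out of calibration data: regress citation outcomes on per-axis scores and normalize the coefficients. The data is synthetic and the method is deliberately simplified; Citevera's actual calibration procedure is not published here.

```python
# Toy sketch: derive axis weights from a regression of citation
# outcomes on axis scores. Synthetic data; illustrative method only.
import numpy as np

rng = np.random.default_rng(0)
n_pages, n_axes = 5000, 7

X = rng.uniform(0, 100, size=(n_pages, n_axes))   # axis scores per page
true_w = np.array([0.15, 0.20, 0.18, 0.12, 0.12, 0.12, 0.11])
# Simulated citation outcome: weighted signal plus noise.
y = X @ true_w + rng.normal(0, 10, size=n_pages)

# Ordinary least squares, then clip and normalize coefficients
# so they can be read as proportional weights.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
weights = np.clip(coef, 0, None)
weights /= weights.sum()
print(np.round(weights, 3))  # recovers roughly the true proportions
```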
What each axis is calibrated against
Each axis is calibrated against a specific citation outcome.
Crawlability is calibrated against "did the engine ever fetch this page in the last 30 days." If the engine never fetched the page, no other axis matters. The crawlability score directly predicts fetch presence.
Direct answer is calibrated against "did this page get cited on queries where its content was relevant." A page can be fetched but not cited because its answer is buried; the direct-answer score predicts the cited-when-relevant rate.
Schema is calibrated against "did this page get cited on engines that prefer structured content" (Claude and AI Overviews specifically). Schema correlates with citation rate even controlling for other signals.
Topical depth is calibrated against "does this site get cited consistently across many queries on this topic." Depth matters less for any single page and more for the brand's overall cited-on-topic rate.
Authority is calibrated against "do engines treat this source as authoritative" measured indirectly via citation rate on questions that benefit from authoritative sources (medical, financial, legal queries).
Freshness is calibrated against citation rate on engines that weight recency (Perplexity primarily, AI Overviews secondarily).
Citation eligibility is calibrated against "did the engine retrieve and rerank this page but ultimately not cite it" - the fail-at-the-last-step rate.
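To make two of these calibration targets concrete, here is a minimal sketch that computes fetch presence and a cited-when-relevant rate from a hypothetical log. All field names (url, fetched_30d, relevant_queries, cited_queries) are invented for illustration; Citevera's real data model is not published here.

```python
# Illustrative metrics from a hypothetical crawl/citation log.

pages = [
    {"url": "/a", "fetched_30d": True,  "relevant_queries": 40, "cited_queries": 12},
    {"url": "/b", "fetched_30d": True,  "relevant_queries": 25, "cited_queries": 2},
    {"url": "/c", "fetched_30d": False, "relevant_queries": 30, "cited_queries": 0},
]

# Crawlability target: "did the engine ever fetch this page in 30 days."
fetch_presence = sum(p["fetched_30d"] for p in pages) / len(pages)

# Direct-answer target: "cited on queries where content was relevant,"
# computed only over pages the engine actually fetched.
fetched = [p for p in pages if p["fetched_30d"]]
cited_when_relevant = sum(p["cited_queries"] for p in fetched) / sum(
    p["relevant_queries"] for p in fetched
)

print(f"fetch presence: {fetch_presence:.2f}")            # 0.67
print(f"cited-when-relevant: {cited_when_relevant:.2f}")  # 14/65 = 0.22
```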
How the score maps to action
A score is most useful when it points at the next thing to fix. The Citevera score is structured so that the lowest sub-axis is the highest-leverage next move.
A site scoring 65 overall might have 80 on schema and 45 on direct answer. The right next move is direct answer, not schema - even though direct answer is "the same thing every blog post says to do." Every blog post says it because it is correct.
The audit prioritizes by expected citation impact, not by ease. We surface the highest-impact fix first even if it is harder than the easy fixes. Time spent on easy but low-impact fixes is time not spent on the fixes that move citations.
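A simplified way to see why the weakest axis is usually the highest-leverage move: the composite points an axis can still recover equal its weight times its gap to 100. This headroom calculation is a rough proxy, not the audit's actual expected-citation-impact model. Continuing the hypothetical 80-schema / 45-direct-answer page from above:

```python
# Composite-point headroom per axis: weight * (100 - score).
# A rough proxy for fix priority; reuses the hypothetical page from
# the composite-score sketch above.

WEIGHTS = {"crawlability": 0.15, "direct_answer": 0.20, "schema": 0.18,
           "topical_depth": 0.12, "authority": 0.12, "freshness": 0.12,
           "citation_eligibility": 0.11}
page_scores = {"crawlability": 90, "direct_answer": 45, "schema": 80,
               "topical_depth": 70, "authority": 65, "freshness": 60,
               "citation_eligibility": 85}

def fix_priority(axis_scores: dict[str, float]) -> list[tuple[str, float]]:
    """Axes ranked by how many composite points closing the gap recovers."""
    headroom = {a: WEIGHTS[a] * (100 - axis_scores[a]) for a in WEIGHTS}
    return sorted(headroom.items(), key=lambda kv: kv[1], reverse=True)

for axis, points in fix_priority(page_scores):
    print(f"{axis:21s} {points:5.2f} composite points available")
# direct_answer tops the list (0.20 * 55 = 11.00): weight times gap
# decides priority, not the raw axis score alone.
```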
How the score maps to outcomes
A 0-50 score is a site that is not citation-ready. Citations are random and unreliable.
A 50-70 score is a site that gets cited occasionally on its strongest pages but inconsistently overall.
A 70-85 score is a site that gets cited reliably on most relevant queries. This is where most "we are doing AEO well" sites land.
An 85-100 score is a site that is structurally optimal. Further improvement comes from content depth and original research, not structural fixes.
The mapping is calibrated quarterly against fresh citation data. Score thresholds occasionally shift as engines update.
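As a compact restatement, here are the bands above as a lookup. Per the preceding paragraph, treat the cutoffs as the currently published ones rather than permanent.

```python
# The score bands described above as a simple lookup.
# Thresholds are recalibrated quarterly, so treat them as current, not fixed.

BANDS = [
    (50, "not citation-ready: citations are random and unreliable"),
    (70, "cited occasionally on strongest pages, inconsistent overall"),
    (85, "cited reliably on most relevant queries"),
    (101, "structurally optimal: gains come from depth and original research"),
]

def band(score: float) -> str:
    """Map a 0-100 composite score to the outcome band described above."""
    for upper, label in BANDS:
        if score < upper:
            return label
    raise ValueError("score must be in the 0-100 range")

print(band(69.65))  # the hypothetical page from the first sketch: second band
```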
How Citevera scores this
The audit runs all seven axes on every audited page, computes the composite, and produces both a top-level score and a per-axis breakdown. The dashboard shows which axes are weakest and what specific fixes would close the gap.
The score is intentionally not just a number. It is a diagnostic. The number is useful for tracking progress; the per-axis breakdown is useful for deciding what to ship next.
Run a free Citevera audit to see your score and per-axis breakdown
Frequently asked questions
Is the score the same across all engines?
The base score is engine-agnostic - it measures structural readiness, which transfers across engines. Engine-specific overlays (which signals each engine prefers) are surfaced separately in the dashboard.
Does a high score guarantee citations?
No. The score predicts citation likelihood but does not determine it. Citation is a probabilistic event with engine-side stochastic elements. A high-score site will be cited more often than a low-score site, but neither outcome is certain on any single query.
How often should I re-run the audit?
Quarterly for most sites. Monthly if you are actively shipping AEO changes. The signal stabilizes over a few weeks, so weekly audits show too much noise.
Can a site score well on the audit but poorly on actual citations?
Occasionally, when the brand is too new to have established representational presence in the engine. Structural readiness is necessary but not sufficient for new brands - you also need time and accumulated signals.
Is the score weighted differently for B2B vs. consumer sites?
The base weights are constant. The audit has industry-specific overlays that surface signals more relevant to specific verticals (e.g., LocalBusiness schema for service businesses, Product schema for ecommerce). The composite score itself stays calibrated against the same citation-outcome data.
