Methodology · Audit-grade contract

Every number traces to its source filing.

Four pillars. Cohort-percentile thresholds. Verbatim verifier. Citation tokens with freshness timestamps. The grade refuses to compute when the source can't be verified — and says so.

The contract

Audit-grade output, by design.

PublicWeave is built on six non-negotiable rules. Every UI surface (report cards, peer comparisons, the WeaverAI assistant) must respect them.

Rule 01 · verbatim

Every claim is verbatim from a source

If the dollar figure isn't physically present in the underlying audited filing, the platform refuses to assert it. No inference, no LLM rounding, no "approximately."
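A minimal sketch of what a verbatim check could look like, assuming the filing is available as extracted plain text. The function name figure_appears_verbatim and the normalization rules are hypothetical illustrations, not the platform's actual verifier.

```python
import re


def figure_appears_verbatim(claimed: str, filing_text: str) -> bool:
    """Return True only if the claimed dollar figure occurs in the filing text.

    Hypothetical helper: figures are compared digit-for-digit after removing
    "$", commas and whitespace, so "$1,234,567" matches "1,234,567" but a
    rounded value such as "1,234,600" never matches.
    """
    def normalize(s: str) -> str:
        return re.sub(r"[\s$,]", "", s)

    target = normalize(claimed)
    if not any(ch.isdigit() for ch in target):
        return False  # nothing numeric to verify, so refuse
    # Pull every number-like token out of the filing and compare whole tokens.
    tokens = re.findall(r"\$?\d[\d,]*(?:\.\d+)?", filing_text)
    return any(normalize(tok) == target for tok in tokens)
```

Comparing whole tokens rather than substrings keeps "1,234" from passing just because "1,234,567" is in the filing.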

Rule 02 · tier gates

Tier A / B / C confidence

Every row carries an audit tier. Tier A = fully verified verbatim from a primary source. Tier B = partial gaps documented. Tier C = held in staging for review. The chip is on the report card.

Rule 03 · provenance

Citation tokens with timestamps

Every WeaverAI answer carries [ref:source@2026-04-30T12:00Z] citations. The verifier refuses unknown citations. Source freshness is checked against the live data_source_state registry on every request.
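The citation format above can be parsed mechanically. The sketch below assumes tokens always follow the [ref:source@YYYY-MM-DDTHH:MMZ] shape shown in the example and models the registry as a plain dict keyed by source name; both function names are hypothetical.

```python
import re
from datetime import datetime, timezone

# Matches tokens shaped like [ref:source@2026-04-30T12:00Z].
CITATION = re.compile(r"\[ref:([^@\]\s]+)@([^\]\s]+)\]")


def parse_citations(answer: str) -> list[tuple[str, datetime]]:
    """Extract (source, UTC timestamp) pairs from an answer string."""
    found = []
    for source, stamp in CITATION.findall(answer):
        ts = datetime.strptime(stamp, "%Y-%m-%dT%H:%MZ").replace(tzinfo=timezone.utc)
        found.append((source, ts))
    return found


def unknown_citations(answer: str, registry: dict[str, datetime]) -> list[str]:
    """Return cited sources that are absent from the (assumed) registry dict."""
    return [src for src, _ in parse_citations(answer) if src not in registry]
```

An answer would be rejected whenever unknown_citations returns a non-empty list.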

Rule 04 · provisional

Provisional flag when inputs are thin

Below 80% of expected inputs present, the grade publishes with a "provisional" chip. No silent gaps. Visible to every reader, every export, every shared report.

Rule 05 · refusal

The platform says "I don't know"

When the peer cohort is below the min-30 floor, when a source is stale beyond its SLA, or when an audit hasn't been filed, the platform shows the gap. It doesn't fill it with a guess.
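Rules 04 and 05 together read like a small decision function. The sketch below is one way to express it, assuming the refusal conditions are checked before the 80% provisional threshold; the GradeInputs fields and the order of checks are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    REFUSED = "refused"
    PROVISIONAL = "provisional"
    FINAL = "final"


@dataclass(frozen=True)
class GradeInputs:
    # All field names are illustrative assumptions.
    cohort_size: int
    source_age_days: int
    source_sla_days: int
    audit_filed: bool
    inputs_present: int
    inputs_expected: int


def publication_status(g: GradeInputs) -> tuple[Status, str]:
    """Decide whether a grade is refused, provisional, or final, with a reason."""
    if not g.audit_filed:
        return Status.REFUSED, "no audited filing"
    if g.source_age_days > g.source_sla_days:
        return Status.REFUSED, "source stale beyond SLA"
    if g.cohort_size < 30:
        return Status.REFUSED, "peer cohort below min-30 floor"
    # Integer comparison avoids float rounding at exactly 80%.
    if g.inputs_present * 100 < g.inputs_expected * 80:
        return Status.PROVISIONAL, "below 80% of expected inputs"
    return Status.FINAL, "all checks passed"
```

Returning the reason alongside the status is what lets the UI show the gap rather than hide it.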

Rule 06 · audit defensibility

Every grade is reproducible

Run the same inputs through services/grade_engine.py and you get the same letter. Pure cohort math, golden tests, no LLM in the grading loop. Reviewable by an auditor without us in the room.

The four pillars

Every non-school grade rolls up from four pillars.

Cohort-percentile thresholds — A is the top decile of your peer cohort, F is the bottom. No absolute floors that age. No subjective weighting.

Pillar 01 · 30 points

Recovery

Operating revenue ÷ operating expenditures. Excludes bond proceeds, transfers, capital grants. The rating-agency-grade definition of operating self-sufficiency.

Pillar 02 · 25 points

Balance

Net surplus or deficit relative to operating spending, derived from revenue − expenditures (not the extracted "Change in Net Position"). Caught the Schaumburg AHPD board-memo bug — see live grade.

Pillar 03 · 20 points

Capital

All-funds capital investment as a share of total spending. Includes bond-funded capital from capital_plan_projects when published — captures investment outside the operating budget.

Pillar 04 · 25 points

Debt

Debt service ÷ operating spending. Cohort-percentile against same-type same-state peers. Bond paydown years (one-time drops) are flagged via agency notes — see SPD FY2026 paydown context for the pattern.
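The four pillar definitions above reduce to four ratios. The sketch below computes only those raw ratios, since the mapping from a ratio's cohort percentile to pillar points is not spelled out here. The Financials field names are hypothetical, and Balance is assumed to use operating revenue minus operating expenditures.

```python
from dataclasses import dataclass

# Maximum points per pillar, as listed above.
PILLAR_POINTS = {"recovery": 30, "balance": 25, "capital": 20, "debt": 25}


@dataclass(frozen=True)
class Financials:
    # Illustrative field names; operating figures exclude bond proceeds,
    # transfers, and capital grants.
    operating_revenue: float
    operating_expenditures: float
    capital_spending: float  # all funds
    total_spending: float    # all funds
    debt_service: float


def pillar_ratios(f: Financials) -> dict[str, float]:
    """Compute the raw ratio behind each pillar, before cohort ranking."""
    if f.operating_expenditures <= 0 or f.total_spending <= 0:
        raise ValueError("spending totals must be positive")
    return {
        "recovery": f.operating_revenue / f.operating_expenditures,
        "balance": (f.operating_revenue - f.operating_expenditures)
        / f.operating_expenditures,
        "capital": f.capital_spending / f.total_spending,
        "debt": f.debt_service / f.operating_expenditures,
    }
```

For example, operating revenue of 1,000 against operating expenditures of 800 gives a Recovery ratio of 1.25 and a Balance ratio of 0.25.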

The cohort math

Every percentile rank, defensible.

Cohort definition

{entity_type} · {state} · {size_quintile}. A 200-acre Illinois park district is ranked against other 200-acre Illinois park districts — not against a 5,000-acre municipal system in California.

Min-30 floor with fallback

If the cohort has fewer than 30 peers, it widens (drop size quintile, then drop state, then drop entity type) until 30 are present. Citation per Wang/Dennis/Tu 2007.
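The widening order can be sketched as a loop over progressively looser match keys. This is an illustration only: the Entity fields mirror the cohort definition above, the peer count is assumed to include the target entity, and returning None stands in for a refusal when even the widest cohort is too small.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    name: str
    entity_type: str
    state: str
    size_quintile: int


# Narrowest cohort first; each step drops one more dimension.
FALLBACK_KEYS: list[tuple[str, ...]] = [
    ("entity_type", "state", "size_quintile"),
    ("entity_type", "state"),
    ("entity_type",),
    (),
]


def select_cohort(
    target: Entity, universe: list[Entity], min_peers: int = 30
) -> Optional[tuple[list[Entity], tuple[str, ...]]]:
    """Return the narrowest cohort with at least min_peers members, plus its keys."""
    for keys in FALLBACK_KEYS:
        cohort = [
            e for e in universe
            if all(getattr(e, k) == getattr(target, k) for k in keys)
        ]
        if len(cohort) >= min_peers:
            return cohort, keys
    return None  # no cohort reaches the floor
```

Returning the keys that were actually used lets the report card state which cohort the percentile was computed against.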

Winsorization at [5,95]

Values below the 5th percentile or above the 95th are clamped to those cutoffs before percentile computation, so a single billion-dollar Houston outlier doesn't crush the rest of the cohort. Citation per Mead 2001.
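A standard-library sketch of winsorization at [5,95]. The percentile helper uses linear interpolation between order statistics, which is an assumption; the document does not say which percentile convention the grade engine uses.

```python
def percentile(sorted_vals: list[float], p: float) -> float:
    """Linear-interpolation percentile of an already sorted, non-empty list."""
    if not sorted_vals:
        raise ValueError("empty cohort")
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def winsorize(values: list[float], lower: float = 5, upper: float = 95) -> list[float]:
    """Clamp values to the [lower, upper] percentile cutoffs, keeping order and length."""
    s = sorted(values)
    lo_cut, hi_cut = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo_cut), hi_cut) for v in values]
```

Unlike trimming, every entity stays in the cohort; only the extreme values are pulled in to the cutoffs.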

Bootstrap 95% CI on percentile

The grade card shows your percentile and a 95% confidence band ("87 — 96"). When the band is wide, the cohort is small. Citation per Efron 1979.
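One way to produce such a band is a percentile bootstrap over the cohort: resample the peers with replacement, recompute the rank each time, and take the 2.5th and 97.5th percentiles of the resulting ranks. The sketch below assumes that approach and a rank defined as the share of peers at or below the value; the resample count and seed are arbitrary illustrative choices.

```python
import random


def percentile_rank(value: float, cohort: list[float]) -> float:
    """Share of the cohort at or below value, on a 0-100 scale."""
    return 100.0 * sum(1 for v in cohort if v <= value) / len(cohort)


def bootstrap_rank_ci(
    value: float, cohort: list[float], n_resamples: int = 2000, seed: int = 0
) -> tuple[float, float]:
    """95% percentile-bootstrap interval for value's rank within the cohort."""
    rng = random.Random(seed)  # fixed seed keeps the band reproducible
    ranks = sorted(
        percentile_rank(value, rng.choices(cohort, k=len(cohort)))
        for _ in range(n_resamples)
    )
    lo = ranks[int(0.025 * (n_resamples - 1))]
    hi = ranks[int(0.975 * (n_resamples - 1))]
    return lo, hi
```

Seeding the generator matters here: without it, the same inputs could show a slightly different band on each run.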

Bounded discovery

Eight deterministic patterns, refreshed nightly.

No LLM-fabricated anomalies. Every "watch" or "alert" on a report card is a SQL pattern with named source columns and a clear severity rule.

rev_drop_levy_rise

Total revenue dropped while tax levy rose — likely fund-scope mismatch or one-time loss.

audit_age_warning

Most-recent audited filing is N years old. Uploading a newer CAFR tightens grade confidence.

rev_per_capita_outlier

Revenue per capita is N standard deviations from cohort mean.

capital_doubled_no_cip

Capital expenditures jumped sharply year-over-year without a published capital plan.

fac_going_concern

Federal Audit Clearinghouse flagged a going-concern paragraph or material weakness.

fund_balance_negative

Unassigned fund balance went negative — GFOA recommends 2+ months operating reserves.

debt_service_high

Debt service exceeds peer-cohort 90th percentile.

data_quality_alert

Extraction failure or missing input flagged for human triage.
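To illustrate what "a SQL pattern with named source columns" might look like, here is a sketch of rev_drop_levy_rise run through Python's sqlite3 module. The financials table and its columns (entity_id, fiscal_year, total_revenue, tax_levy) are hypothetical; the real schema is not shown in this document.

```python
import sqlite3

# Flags any entity-year where total revenue fell while the tax levy rose,
# compared with the immediately preceding fiscal year.
REV_DROP_LEVY_RISE = """
SELECT cur.entity_id, cur.fiscal_year
FROM financials AS cur
JOIN financials AS prev
  ON prev.entity_id = cur.entity_id
 AND prev.fiscal_year = cur.fiscal_year - 1
WHERE cur.total_revenue < prev.total_revenue
  AND cur.tax_levy > prev.tax_levy
ORDER BY cur.entity_id, cur.fiscal_year
"""


def run_pattern(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Execute one deterministic pattern query and return the flagged rows."""
    return conn.execute(sql).fetchall()
```

Because the pattern is a fixed query over named columns, the same data always yields the same flags.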

Industry calibration

Benchmarks practitioners recognize.

No proprietary scoring. No rating-agency-equivalence claims. Every threshold cites a published standard.

Moody's

Bond-rating criteria inform Debt Service Coverage (1.2× minimum) and structural-balance thresholds.

GFOA

Government Finance Officers Association best-practice guidance on fund balance, capital ratios, debt limits, GASB 54 classification.

NRPA

Park-specific benchmarks: cost recovery (30–40%), capital reinvestment, program revenue. Peer medians published nationally.

ISBE / NCES

Illinois State Board of Education Financial Profile + NCES F-33 federal school finance survey for school district models.

Type-specific models

One grade doesn't fit all agencies.

Park districts, libraries, municipalities, and special districts use the four-pillar model. School districts use a separate scorecard built from federal and state academic data.

Park Districts · Libraries · Municipalities · Special Districts

Four-pillar model: Recovery (30) + Balance (25) + Capital (20) + Debt (25) = 100. Cohort-percentile thresholds against same-type same-state peers. Park districts surface a Cost Recovery Rate sub-metric (NRPA 30–40%); libraries weight Per-Capita Investment more heavily.

School Districts · Five-dimension scorecard

A separate model: Academic (30) + Climate (20) + Teaching (20) + Fiscal (15) + Equity (15) = 100. Built from the Illinois Report Card public data set + NCES F-33 federal survey covering all 14,536 US school districts. See sub-metrics on the School Districts page.

Data sources · 10 registered

Every number, traceable.

Each source is registered in data_source_state with last-fetch and last-success timestamps. WeaverAI cites the timestamp on every answer; the verifier refuses unknown citations.
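A registry entry of this kind could be modeled as below. This is a sketch under assumptions: the field names follow the "last-fetch" and "last-success" wording above, the per-source sla field is inferred from the SLA mentioned under Rule 05, and staleness is assumed to be measured from the last successful fetch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SourceState:
    # Hypothetical shape of one data_source_state row.
    name: str
    last_fetch: datetime
    last_success: datetime
    sla: timedelta


def is_stale(state: SourceState, now: datetime) -> bool:
    """A source is stale when its last successful fetch is older than its SLA."""
    return now - state.last_success > state.sla


def stale_sources(registry: dict[str, SourceState], now: datetime) -> list[str]:
    """Names of all stale sources, sorted for stable output."""
    return sorted(name for name, s in registry.items() if is_stale(s, now))
```

Measuring from last_success rather than last_fetch means a source that keeps failing is still reported as stale, even if it was attempted a minute ago.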

State comptrollers / auditors

Illinois Comptroller · CA SCO ByTheNumbers · Indiana Gateway · Ohio Auditor · Colorado OSA — primary audited source for governmental-funds totals.

NCES F-33

Annual Survey of School System Finances — 14,536 LEAs nationwide, FY2022–2023, structured federal data covering revenue and expenditure by source/function.

NCES EDGE

LEA → Place geographic overlay (TIGER 2023) — 55,224 LEA-place edges. Federally authoritative source for cross-agency relationships.

IMLS PLS

Public Libraries Survey — 9,252 libraries nationwide. Visits, circulation, staffing, collection counts.

Federal Audit Clearinghouse

Single audit findings — going-concern paragraphs, material weaknesses, significant deficiencies. Refreshed hourly via stateful resumer.

ISBE Report Card

Illinois State Board of Education public data set — 858 districts, academic + financial profile, zero API cost.

US Census of Governments

24,887 government units indexed. Primary registry for entity identification and population.

Operational data

ActiveNet · Brightly — for districts that integrate, programs / registrations / facilities / asset conditions feed directly into operational sub-pillars.

Agency-submitted: Audited CAFRs, budget books, community surveys — uploaded by directors to improve their grades and correct extraction errors. Routed through Tier-A/B/C gates before they hit the live grade.

Director-confirmed context: Agency notes (bond paydown explanations, capital cycle context, restructuring events) are human-vetted before they appear next to a grade. They never change the math — only the interpretation.

Never used as primary data: Survey opinions · trade-press rankings · crowdsourced reviews · LLM speculation. The platform refuses to fabricate.
