Deep Dive: Integrity Review with Automated Revision


After Stage 3 (Writing) and Stage 4 (Visuals) produce an HTML draft, Stage 5 receives that draft and the research brief from Stage 2. Its job is to verify that every factual claim in the draft traces to a source in the research brief's source index. [1]

The mechanism is straightforward. During writing, the agent embeds invisible HTML comment markers alongside each claim. The format is a comment containing [Source: type:path:detail], where type is one of the four valid source types, path is the file or URL, and detail identifies the specific entry. The integrity reviewer extracts every such marker from the draft, looks up each one in the source index, and classifies the result. [3] Markers can appear in HTML comments, in <cite> elements, or as raw inline text — the extractor handles all three formats and deduplicates by position. [3]
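The extraction step can be sketched in a few lines. This is a hypothetical approximation (the regex and function names are mine, not the module's); the real extractor in integrity_review.py also handles `<cite>` elements and raw inline text before deduplicating by position.

```python
import re

# Sketch of HTML-comment marker extraction; regex is an approximation.
MARKER_RE = re.compile(
    r"<!--\s*\[Source:\s*(?P<type>\w+):(?P<path>[^:\]]+)"
    r"(?::(?P<detail>[^\]]+))?\]\s*-->"
)

def extract_markers(html: str) -> list[dict]:
    """Return one dict per source marker found in HTML comments."""
    return [{
        "type": m.group("type"),
        "path": m.group("path").strip(),
        "detail": (m.group("detail") or "").strip() or None,
        "position": m.start(),  # used for deduplication across formats
    } for m in MARKER_RE.finditer(html)]

draft = '<p>Latency fell 40%.<!-- [Source: source_code:pipeline.py:run_stage] --></p>'
markers = extract_markers(draft)
```

Each extracted marker carries its character position, which is what makes cross-format deduplication possible when the same citation appears in more than one form.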

The reviewer also hunts for claims that have no marker at all. [2] Using pattern-matching on HTML element content, it identifies sentences containing factual assertions — percentages, capability verbs, specific technology names — and flags them as unsourced under rule SV-001. This is what separates integrity review from simple citation checking: it catches things the writer forgot to cite, not just incorrectly cited things. [2]
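A minimal sketch of that detection pass, with hypothetical pattern lists standing in for the module's real (and more extensive) regexes:

```python
import re

# Hypothetical patterns approximating the SV-001 detector described above;
# the real rule set in integrity_review.py is more extensive.
FACTUAL_PATTERNS = [
    re.compile(r"\b\d+(?:\.\d+)?\s*(?:%|percent\b|ms\b|seconds?\b|x\b)", re.I),
    re.compile(r"\b(?:supports|enables|provides|handles|integrates)\b", re.I),
    re.compile(r"\b(?:PostgreSQL|Redis|REST|GraphQL)\b"),
]

def flag_unsourced(sentences: list[tuple[str, bool]]) -> list[str]:
    """sentences: (text, has_source_marker) pairs in document order.
    Returns SV-001 candidates: factual sentences without a marker."""
    return [text for text, has_marker in sentences
            if not has_marker and any(p.search(text) for p in FACTUAL_PATTERNS)]

flags = flag_unsourced([
    ("The pipeline handles retries automatically.", False),  # capability verb, no marker
    ("Latency dropped by 40 percent.", True),                # metric, but already sourced
])
```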

Weighted Claim Types and the Scoring Formula

Not all claims carry equal weight. A wrong metric — claiming the pipeline takes 10 seconds when it actually takes 45 — does more damage than an unsourced general description. The scoring model reflects this with four claim types and explicit weights: [2]

  • Metric — weight 1.5×. Percentages, latency numbers, counts, multipliers. [2]
  • Product Capability — weight 1.2×. Claims about what the software supports, enables, or handles. [2]
  • Architecture — weight 1.0×. Implementation choices, frameworks, databases, APIs. [2]
  • General — weight 0.8×. Descriptive claims, framing, context. [2]
Figure: two bar charts showing claim type weights (left: metric 1.5×, product capability 1.2×, architecture 1.0×, general 0.8×) and minimum integrity score thresholds by strictness level (right: strict 0.95, standard 0.85, relaxed 0.70). The marketing-assistant configuration uses strict mode, requiring a 0.95 score to pass.

Classification uses regex pattern matching. [2] The metric detector matches percentage patterns and unit-bearing numbers. The capability detector looks for verbs such as supports, enables, provides, handles, integrates. Architecture detection catches specific technology names — PostgreSQL, Redis, REST, GraphQL — and structural terms. Everything else falls through to general. [2]
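The four detectors can be sketched as a single cascading function. This is a simplified approximation; the actual regexes in integrity_review.py are richer, but the precedence order (metric, then capability, then architecture, then general) follows the description above.

```python
import re

# Simplified approximation of the four claim-type detectors.
def classify_claim(text: str) -> str:
    if re.search(r"\b\d+(?:\.\d+)?\s*(?:%|percent\b|ms\b|seconds?\b|x\b)", text, re.I):
        return "metric"
    if re.search(r"\b(?:supports|enables|provides|handles|integrates)\b", text, re.I):
        return "product_capability"
    if re.search(r"\b(?:PostgreSQL|Redis|REST|GraphQL)\b", text):
        return "architecture"
    return "general"  # everything else falls through
```

Precedence matters here: "It integrates with Redis in 5 ms" is a metric claim first, because a wrong number is costlier than a wrong framework name.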

The integrity score is computed as: [1]

integrity_score = weighted_verified / weighted_total

Verified claims (severity: info) contribute their full type weight. Warning-level claims contribute half weight. Critical findings contribute zero. [1] A post with three verified metrics and one unsourced metric scores exactly 0.75: the three verified claims contribute 4.5 of a weighted total of 6.0, while the unsourced metric's 1.5× weight contributes nothing, dragging down an otherwise clean draft.
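The computation above can be sketched directly from the stated weights and severity contributions (variable and function names here are hypothetical, not the module's):

```python
# Weighted scoring sketch using the type weights and severity
# contributions described in the text.
WEIGHTS = {"metric": 1.5, "product_capability": 1.2,
           "architecture": 1.0, "general": 0.8}
CONTRIBUTION = {"info": 1.0, "warning": 0.5, "critical": 0.0}

def integrity_score(claims: list[tuple[str, str]]) -> float:
    """claims: (claim_type, severity) pairs for every claim in the draft."""
    weighted_total = sum(WEIGHTS[t] for t, _ in claims)
    weighted_verified = sum(WEIGHTS[t] * CONTRIBUTION[s] for t, s in claims)
    return weighted_verified / weighted_total if weighted_total else 1.0

# Three verified metrics plus one unsourced (critical) metric:
score = integrity_score([("metric", "info")] * 3 + [("metric", "critical")])
# 4.5 / 6.0 = 0.75, well below the strict 0.95 threshold
```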

Pass/fail thresholds vary by strictness level. [1] The marketing-assistant configuration uses strict, which requires a score at or above 0.95 to pass. The standard level requires 0.85, and relaxed requires 0.70. For published content on stephenbogner.com, strict is the only sensible choice — professional reputation is not something I'm willing to optimize away. [1]

The Five Verification Rules

The reviewer applies up to five rules (SV-001 through SV-005) to each claim, with the active set depending on strictness level: [2]

  • SV-001 — Unsourced claim. A factual assertion with no source marker. Active at all strictness levels. In strict mode, qualifying adjectives trigger this rule too. [2]
  • SV-002 — Broken reference. The marker exists but doesn't match any entry in the source index. Typically caused by a typo in the path or detail field. Active at all levels. Severity: critical. [2]
  • SV-003 — Invalid type. The source type in the marker is not one of the four valid values: source_code, documentation, web, or analytics. Active at all levels. Severity: critical. [2]
  • SV-004 — Low reliability. The cited source has a reliability score below 0.5. Active in strict and standard modes. Web sources carry a default reliability of 0.6; source code carries 0.9. Severity: warning. [2]
  • SV-005 — Indirect citation. A web source is cited where source code or documentation exists. Active in strict mode only. Severity: info — the claim still counts as verified; this is a best-practice nudge, not a failure. [2]
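The active-rule matrix in the list above reduces to a small table. This is a sketch of the stated behaviour; the actual configuration mechanism in integrity_review.py may be structured differently.

```python
# Active rule sets by strictness level, as listed above (sketch).
ACTIVE_RULES = {
    "strict":   {"SV-001", "SV-002", "SV-003", "SV-004", "SV-005"},
    "standard": {"SV-001", "SV-002", "SV-003", "SV-004"},
    "relaxed":  {"SV-001", "SV-002", "SV-003"},
}

def rule_active(rule_id: str, strictness: str) -> bool:
    return rule_id in ACTIVE_RULES[strictness]
```

Note the containment: each level is a strict superset of the one below it, so relaxing the level only ever removes checks, never swaps them.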

The lookup itself uses a two-tier strategy: it first tries a full key composed of all three parts, then falls back to a partial key (type and path only) if the detail field was omitted from the marker. [1] This makes citation lookup more forgiving of an omitted detail field while still enforcing that the path actually exists in the brief.
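A sketch of that fallback, under the assumption (mine, not confirmed by the source) that the index registers each entry under both a full (type, path, detail) key and a partial (type, path) key:

```python
# Two-tier source-index lookup sketch: full key first, partial key second.
def lookup_source(marker: dict, index: dict):
    full_key = (marker["type"], marker["path"], marker.get("detail"))
    if full_key in index:
        return index[full_key]
    # Fall back when the marker omitted the detail field
    return index.get((marker["type"], marker["path"]))

index = {
    ("source_code", "integrity_review.py", "Finding"): {"reliability": 0.9},
    ("source_code", "integrity_review.py"): {"reliability": 0.9},
    ("web", "web-3"): {"reliability": 0.6},
}
hit = lookup_source({"type": "web", "path": "web-3", "detail": None}, index)
```

A marker with a wrong path misses both tiers and surfaces as an SV-002 broken reference.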

The Automated Revision Loop

When a draft fails integrity review, the pipeline does not immediately halt and ask for human intervention. Instead, it enters a revision loop with a maximum of three attempts. [1]

Each revision cycle follows the same sequence: run the integrity checker, produce a markdown report of all findings grouped by severity, hand the report to the writing agent with instructions to fix the flagged issues, then re-run the check. The report contains auto-fix suggestions keyed by finding type: [2]

  • broken_ref: Verify the source path/detail and correct the marker, or remove the claim if no matching source exists. [2]
  • invalid_type: Replace the type with a valid one, or escape the marker text in a code block if it appears in example prose. [2]
  • unsubstantiated: Add a source marker citing a real entry from the brief, or soften the claim to non-factual framing. [2]
  • low_reliability: Add qualifying language or replace with a higher-reliability source. [2]

If all three revision attempts are exhausted and the score still falls below the threshold, the pipeline halts with an error report — manual review is required. [1]
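The loop described above can be sketched as follows. Function names and the toy score sequence are hypothetical; the real orchestration lives in the pipeline, not shown here.

```python
# Revision-loop sketch: an initial check plus up to three revision
# attempts, after which the pipeline halts for manual review.
MAX_REVISIONS = 3
PASS_THRESHOLD = 0.95  # strict mode

def revise_until_pass(draft, check, revise):
    """check(draft) -> (score, report); revise(draft, report) -> new draft."""
    for attempt in range(MAX_REVISIONS + 1):  # attempt 0 checks the original draft
        score, report = check(draft)
        if score >= PASS_THRESHOLD:
            return draft, score, attempt
        if attempt == MAX_REVISIONS:
            raise RuntimeError("integrity review failed after "
                               f"{MAX_REVISIONS} revisions; manual review required")
        draft = revise(draft, report)

# Toy harness: scores improve each cycle and pass on the second revision.
scores = iter([0.80, 0.90, 0.96])
result = revise_until_pass("draft", lambda d: (next(scores), "report"),
                           lambda d, r: d + "*")
```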

Figure: line chart of integrity score across revision attempts 0 to 3 for three post types (typical passes at attempt 2, complex at attempt 3, clean draft at attempt 1), with the strict 0.95 threshold as a horizontal line. The trajectory is illustrative: most posts pass by revision 1 or 2, and posts requiring all three attempts typically have structural sourcing gaps that the writing agent cannot automatically fix.

In practice on this project, posts rarely need more than one revision cycle. The research brief is dense enough that the writing agent usually cites correctly on the first pass; most failures are broken references caused by detail-field mismatches, which are easy mechanical corrections. [2]

Engineering Trade-offs: Automation vs. Manual Review

Automated integrity enforcement solves a specific, narrow problem with high consistency: verifying that every marked claim matches a real source entry. What it cannot do is verify that the source itself is accurate, that the claim is a fair characterization of that source, or that the overall post makes a coherent argument. [4]

This mirrors the broader landscape of AI content verification. Automated fact-checkers can cross-reference statements against known sources and flag anything that lacks backing. [5] But industry testing shows even well-designed systems achieve only modest accuracy on real-world mixed content. [6] Blog Creator sidesteps much of this uncertainty by operating on grounded sources — not open-ended fact-checking against the entire internet, but verification against the specific research brief assembled in Stage 2. [7] Grounded verification is far more tractable than open-world fact-checking.

The three-revision limit is a deliberate engineering choice. More retries would improve the odds of automatic resolution but would also extend pipeline runtime and risk revision drift — the writing agent fixing one finding while introducing another. [1] A cap of three balances automation coverage against reliability. Posts that genuinely fail after three cycles almost always have a structural problem — a claim present in the draft but absent from the research — that requires human judgment to resolve. [1]

Automation also excels at consistency in ways manual review cannot. A human editor reviewing fifty posts will apply slightly different standards each time, depending on context, fatigue, and familiarity with the subject. [8] The automated reviewer applies the same SV-001 through SV-005 rules to every post, every time. For a solo developer publishing under a professional engineer credential, that consistency is not just operationally convenient — it's professionally necessary. [2]

After Integrity Passes: Rendering Citations

Once a draft passes the integrity gate, the invisible source markers are converted into rendered footnotes: numbered superscripts in the body text and a "Sources" section at the bottom of the article. [9] This makes the sourcing visible to readers rather than hidden in HTML comments — consistent with the integrity-first principle that runs throughout the entire pipeline.

The citation renderer in the publisher handles this transformation with deduplication, so multiple claims citing the same source produce a single footnote entry. [9]
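A sketch of that deduplication step, with a hypothetical function name (the real renderer is the render_post function in publisher.py):

```python
# Footnote deduplication sketch: claims citing the same source share a
# single footnote number, assigned in document order.
def number_citations(markers: list[tuple[str, str]]):
    """markers: one (path, detail) key per claim, in document order.
    Returns per-claim footnote numbers and the deduplicated source list."""
    order: dict[tuple[str, str], int] = {}
    numbers = []
    for key in markers:
        if key not in order:
            order[key] = len(order) + 1  # next footnote number
        numbers.append(order[key])
    return numbers, list(order)

nums, sources = number_citations([
    ("integrity_review.py", "Finding"),
    ("publisher.py", "render_post"),
    ("integrity_review.py", "Finding"),  # duplicate collapses to footnote 1
])
```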

What This Means in Practice

The integrity review stage exists because fast, automated content creation and honest, verifiable content creation are not in conflict — but only if the quality gate is built in from the start, not bolted on as an afterthought. [1] The weighted scoring model, the five verification rules, and the three-attempt revision loop are the concrete implementation of that philosophy.

The next post in this series covers Stage 4 — the Visuals agent — and how it generates charts and diagrams from real research data rather than illustrative fabrications. The source for everything described in this post lives in the integrity_review.py module in the Blog Creator skill. [1]

Stephen Bogner, P.Eng. — "AI tools you own. Simple. Smart. Solo strong." — stephenbogner.com

Sources

  1. integrity_review.py: IntegrityResult class (reliability: 0.9)
  2. integrity_review.py: Finding class (reliability: 0.9)
  3. integrity_review.py: _ExtractedMarker class (reliability: 0.9)
  4. web: web-7 — Automated tools enforce standards consistently across content…
  5. web: web-1 — Automated fact checkers can instantly cross-reference and ve…
  6. web: web-3 — Independent testing from 2025-2026 consistently shows no maj…
  7. web: web-5 — Grounding verification extracts claims from the response and…
  8. web: web-8 — Human expertise remains essential for nuanced decisions, whi…
  9. publisher.py: render_post function (reliability: 0.9)