7-Step Agentic Pipeline: Claude Code for Autonomous Publishing

Architecture Overview

The DEEPCONTEXT system treats Claude Code as an editorial team rather than a chatbot, implementing a seven-step pipeline that transforms one headline into up to five finished articles. The architecture functions like a newsroom with strict editorial hierarchy.

Layer 1: Intelligence

Before the LLM processes a headline, a Python script (crosslink.py) using multilingual-e5-large embeddings computes similarity against every published article. This creates a "briefing" containing similar articles, matching verified facts, existing clusters, and persona coverage gaps. Because the corpus is domain-specific (geopolitics, economics, science), the system uses Z-scores instead of raw cosine similarity, normalizing each score against the corpus distribution. A Z-score of 3.5 sits above the 99.9th percentile of that distribution (assuming it is roughly normal), likely signaling a duplicate.
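The Z-score normalization can be sketched as follows. This is a minimal illustration, not the actual crosslink.py code: the function names, the dictionary input, and the use of the population standard deviation are all assumptions. Only the 3.5 threshold comes from the description above.

```python
from statistics import mean, pstdev

DUPLICATE_Z = 3.5  # threshold quoted in the text


def z_scores(similarities: dict[str, float]) -> dict[str, float]:
    """Normalize raw cosine similarities against their own distribution."""
    values = list(similarities.values())
    mu = mean(values)
    sigma = pstdev(values)  # population std dev (an assumption)
    if sigma == 0:
        # All articles are equally similar, so nothing stands out.
        return {key: 0.0 for key in similarities}
    return {key: (value - mu) / sigma for key, value in similarities.items()}


def likely_duplicates(similarities: dict[str, float],
                      threshold: float = DUPLICATE_Z) -> list[str]:
    """Return article IDs whose Z-score meets or exceeds the threshold."""
    return [key for key, z in z_scores(similarities).items() if z >= threshold]
```

The point of the normalization is that in a single-domain corpus raw cosine scores tend to cluster in a narrow band, so a fixed raw cutoff is hard to choose. A Z-score measures how far a score stands out from that band.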

Layer 2: Editorial Decisions

The main Claude Code agent reads the briefing and makes several editorial calls:

Analyze: Identifies 6-10 knowledge gaps the headline opens up
Route: Decides between NEW_CLUSTER, EXTEND, UPDATE, or SKIP options
Regionalize: Checks which global regions are directly affected (not just mentioned)
Persona Assignment: Selects which of five writer personas should tackle which angle
Dedup: Cross-references planned articles against the archive after persona assignment

The routing step provides editorial discipline, allowing the system to stop the pipeline if content is already sufficiently covered.
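One way to make the routing decision enforceable in code is to parse the agent's answer into a closed set of values and halt on SKIP. This is a sketch under assumptions: the four option names come from the text, but the idea that the agent returns its decision as a plain string, and the helper names parse_route and should_continue, are hypothetical.

```python
from enum import Enum


class Route(Enum):
    NEW_CLUSTER = "NEW_CLUSTER"
    EXTEND = "EXTEND"
    UPDATE = "UPDATE"
    SKIP = "SKIP"


def parse_route(agent_reply: str) -> Route:
    """Map the agent's decision string to a Route, rejecting anything else."""
    try:
        return Route(agent_reply.strip().upper())
    except ValueError:
        raise ValueError(f"unknown routing decision: {agent_reply!r}") from None


def should_continue(route: Route) -> bool:
    """The pipeline stops only when the topic is already sufficiently covered."""
    return route is not Route.SKIP
```

Rejecting unknown values keeps a malformed agent reply from silently falling through to the writing stage.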

Layer 3: Parallel Writing

The main agent launches up to five sub-agents simultaneously, each handling one article. Each sub-agent:

Loads only its own persona file (saves tokens, prevents voice blending)
Structures the article with an outline including section goals
Writes a 2,000-3,000 word draft
Extracts every verifiable claim and classifies it (NUMBER, NAME, TECHNICAL, HISTORICAL, CAUSAL)

Sub-agents operate in isolation without intercommunication, with the main agent coordinating their work.
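The fan-out pattern described here can be illustrated with asyncio. This is only a sketch of the coordination shape: run_subagent is a hypothetical stand-in for launching a real Claude Code sub-agent, and the Assignment fields are assumptions. Only the cap of five comes from the text.

```python
import asyncio
from dataclasses import dataclass

MAX_SUBAGENTS = 5  # upper bound stated in the text


@dataclass
class Assignment:
    persona: str  # which persona file the sub-agent loads
    angle: str    # the angle this article covers


async def run_subagent(assignment: Assignment) -> str:
    """Hypothetical stand-in for one isolated sub-agent run."""
    await asyncio.sleep(0)  # placeholder for the real outline/draft/extract work
    return f"[{assignment.persona}] draft on: {assignment.angle}"


async def write_all(assignments: list[Assignment]) -> list[str]:
    """Launch all sub-agents concurrently; only the coordinator sees results."""
    if len(assignments) > MAX_SUBAGENTS:
        raise ValueError(f"at most {MAX_SUBAGENTS} articles per headline")
    # gather() returns results in the same order as the assignments.
    return await asyncio.gather(*(run_subagent(a) for a in assignments))
```

Each coroutine receives only its own assignment, which mirrors the isolation described above: sub-agents share no state, and the main agent collects the drafts.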

Layer 4: Three-Stage Fact-Checking

After draft completion, three preprocessing layers run before LLM verification:

Factbase match (crosslink.py factmatch): Compares extracted claims against 1,030+ verified facts from previous articles. High-confidence matches auto-verify without re-checking.
Wikipedia/Wikidata match (crosslink.py wikicheck): Checks structured data from Wikidata and text from Wikipedia lead sections using a local database (no API calls).
Web search: Only for claims unmatched in factbase or Wikipedia, cutting web searches by approximately 70%.
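The three stages behave like a cascade in which each claim stops at the first stage that can answer it. A minimal sketch, assuming each stage is a function that returns a verdict string or None when it has no match. The Checker type, check_claim, and make_lookup are hypothetical names, not part of crosslink.py.

```python
from typing import Callable, Optional

# A checker returns a verdict string, or None if it cannot match the claim.
Checker = Callable[[str], Optional[str]]


def make_lookup(table: dict[str, str]) -> Checker:
    """Wrap a claim-to-verdict table as a checker (toy stand-in for a local database)."""
    return table.get  # dict.get returns None for unknown claims


def check_claim(claim: str, stages: list[tuple[str, Checker]]) -> tuple[str, str]:
    """Run stages in order and return (stage_name, verdict) from the first match."""
    for name, checker in stages:
        verdict = checker(claim)
        if verdict is not None:
            return name, verdict
    # No stage could confirm or refute the claim.
    return "none", "UNVERIFIABLE"
```

Ordering the stages from cheapest to most expensive (factbase, then the local Wikipedia/Wikidata data, then web search) is what lets the later, costlier stage run only on the leftover claims.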

Verdict categories include CORRECT, FALSE, IMPRECISE, SIMPLIFIED, and UNVERIFIABLE. FALSE claims require immediate fixing, while more than three UNVERIFIABLE claims prevent publication.
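The publication rule can be expressed as a small gate function. This is a sketch: the verdict names and the limit of three come from the text, while the function name and the (allowed, reason) return shape are assumptions.

```python
from collections import Counter

VERDICTS = {"CORRECT", "FALSE", "IMPRECISE", "SIMPLIFIED", "UNVERIFIABLE"}
MAX_UNVERIFIABLE = 3  # more than this blocks publication


def publication_gate(verdicts: list[str]) -> tuple[bool, str]:
    """Return (allowed, reason) for a draft given its per-claim verdicts."""
    unknown = set(verdicts) - VERDICTS
    if unknown:
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")
    counts = Counter(verdicts)
    if counts["FALSE"] > 0:
        return False, f"{counts['FALSE']} FALSE claim(s) must be fixed first"
    if counts["UNVERIFIABLE"] > MAX_UNVERIFIABLE:
        return False, f"{counts['UNVERIFIABLE']} UNVERIFIABLE claims exceed the limit"
    return True, "ok"
```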

Layer 5: Translation & Publishing

Translations occur only from the fact-checked final version, never from drafts. A Python publishing script handles database inserts, link creation, and embedding computation in one command.

System Metrics

The system has produced:

246 articles published across 25 topic clusters
Content in English (always) plus 8 additional languages where regionally relevant: German, Spanish, French, Portuguese, Arabic, Hindi, Japanese, and Indonesian
1,030 verified facts in the growing factbase with automatic expiry (economic facts = 3 months, historical = never)
5 distinct personas with measurably different writing styles
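The factbase expiry rule mentioned in the list above could look like the sketch below. The two categories and their lifetimes come from the text. Treating "3 months" as 90 days, and the names TTL and is_expired, are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional

# Time-to-live per fact category; None means the fact never expires.
TTL: dict[str, Optional[timedelta]] = {
    "economic": timedelta(days=90),  # "3 months", approximated as 90 days
    "historical": None,
}


def is_expired(category: str, verified_on: date, today: date) -> bool:
    """Return True if a fact must be re-verified before it can auto-verify a claim."""
    ttl = TTL[category]
    if ttl is None:
        return False
    return today > verified_on + ttl
```

An expired fact would presumably fall through to the later fact-checking stages rather than auto-verifying, which keeps fast-changing figures from being confirmed against stale data.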

📖 Read the full source: r/ClaudeAI