Multi-Agent Video Production Pipeline with Claude: Script Contract Architecture and Research Fanout

A developer built a multi-agent AI pipeline that takes a topic (e.g., "Ada Lovelace") and a persona (channel identity, tone, visual style) and produces a complete chapter-structured educational YouTube video (15–20 min). The pipeline uses Claude as the core LLM for scripting and orchestrates specialized agents across script writing, asset generation, rendering (CUDA on Windows host), and YouTube upload.
Script Writing via Contract Architecture
To keep a 20-minute AI-written script narratively coherent across chapters written in separate LLM calls, the system uses a narrative contract — a validated JSON blueprint generated before any script text is written. The contract encodes four constraint types:
- Threads — story arcs that must open in one chapter and close in another, with a declared payoff type (resolved, tragedy, etc.)
- Entities — named people/places with a forced first-introduction chapter, preventing retroactive mentions
- Facts Required — citations chained with dependencies (fact B can't appear until fact A is established)
- Timeline Anchors — temporal reference points allowing non-linear structure (flashback, in-medias-res) while staying internally consistent
The contract is generated via an Opus → structural validation → Sonnet review loop (up to 3 rounds). The structural validator runs a Pydantic parse plus a temporal constraint check; Sonnet then checks semantic coherence (no orphan entities, threads actually close). Downstream chapter writers are bound to the contract.
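As a rough illustration of the structural pass and the bounded review loop, here is a minimal sketch. It is not the author's code: the source uses Pydantic, while this version uses stdlib dataclasses to stay dependency-free. All field names, the `draft` and `review` callables (stand-ins for the Opus and Sonnet calls), and the choice to raise after the final round are assumptions. Timeline anchors are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    name: str
    opens_in: int   # chapter where the arc opens
    closes_in: int  # chapter where it pays off
    payoff: str     # e.g. "resolved", "tragedy"


@dataclass
class Entity:
    name: str
    first_chapter: int  # forced first-introduction chapter


@dataclass
class Fact:
    fact_id: str
    chapter: int
    depends_on: list[str] = field(default_factory=list)


@dataclass
class Contract:
    chapters: int
    threads: list[Thread]
    entities: list[Entity]
    facts: list[Fact]


def validate(contract: Contract) -> list[str]:
    """Return structural errors; an empty list means the contract passes."""
    errors = []
    valid = range(1, contract.chapters + 1)
    for t in contract.threads:
        if t.opens_in not in valid or t.closes_in not in valid:
            errors.append(f"thread {t.name!r}: chapter out of range")
        elif t.closes_in <= t.opens_in:
            # Assumption: a thread must close in a later chapter than it opens.
            errors.append(f"thread {t.name!r}: must close after it opens")
    for e in contract.entities:
        if e.first_chapter not in valid:
            errors.append(f"entity {e.name!r}: chapter out of range")
    by_id = {f.fact_id: f for f in contract.facts}
    for f in contract.facts:
        for dep in f.depends_on:
            if dep not in by_id:
                errors.append(f"fact {f.fact_id!r}: unknown dependency {dep!r}")
            elif by_id[dep].chapter > f.chapter:
                errors.append(f"fact {f.fact_id!r}: appears before {dep!r}")
    return errors


def build_contract(draft, review, max_rounds=3):
    """draft(feedback) -> Contract; review(contract) -> list of semantic issues."""
    feedback = []
    for _ in range(max_rounds):
        contract = draft(feedback)
        # Structural errors short-circuit the (hypothetical) semantic review.
        feedback = validate(contract) or review(contract)
        if not feedback:
            return contract
    raise ValueError(f"contract rejected after {max_rounds} rounds: {feedback}")
```

Feeding the previous round's errors back into `draft` is one plausible way to let the generator fix specific problems instead of starting over.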
Research via Fanout
The research pipeline spins up N parallel OutlineAgent instances, each working from the same research package but on different thesis candidates. Each produces a three-level hierarchy: thesis → chapter arguments → scene beats. A grounding/revision loop runs independently on each branch:
- Grounding reviewer (Sonnet) flags blocking issues vs. polish issues
- Revision agent applies fixes without restructuring
- Quality reviewer checks for structural failures (topical chapter lists, collapsed middles, summary endings)
Each branch gets up to 3 revision rounds, and the branches run in parallel. A single judge agent then scores each refined outline on four axes:
| Axis | Weight | What it measures |
|---|---|---|
| Concept Hook | 0.40 | CTR potential; title falsifiability |
| Trap Closure | 0.30 | Narrative payoff completeness |
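The fanout, per-branch revision loop, and weighted judging described above could be sketched roughly as follows. This is an illustration under assumptions, not the author's implementation: `review` and `revise` are hypothetical async callables standing in for the reviewer and revision agents, issues are assumed to be dicts with a `blocking` flag, and the loop is assumed to stop once no blocking issues remain. Only the two axis weights given in the table are used; the other axes are not guessed.

```python
import asyncio

MAX_ROUNDS = 3  # from the source: up to 3 revision rounds per branch

# Only the weights listed in the table; the remaining axes are not guessed.
WEIGHTS = {"concept_hook": 0.40, "trap_closure": 0.30}


async def refine_branch(outline, review, revise):
    """Review and revise one outline until no blocking issues remain."""
    for _ in range(MAX_ROUNDS):
        issues = await review(outline)
        blocking = [i for i in issues if i["blocking"]]
        if not blocking:
            break
        outline = await revise(outline, blocking)
    return outline


async def fan_out(candidates, review, revise):
    """Refine every thesis candidate concurrently; result order matches input."""
    return await asyncio.gather(
        *(refine_branch(c, review, revise) for c in candidates)
    )


def weighted_score(axis_scores, weights=WEIGHTS):
    """Weighted sum over the axes that have a known weight."""
    return sum(weights[axis] * axis_scores[axis] for axis in weights)
```

A judge step would then call `weighted_score` on each refined outline's per-axis scores and pick the highest total.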
Pipeline Architecture
The pipeline is split across two environments: script and asset work runs in a Linux dev container (WSL), while rendering runs on the Windows host to access CUDA and video tooling. Agents communicate over HTTP with a lightweight orchestrator. The system is phase-based — every step (W2.1, W4.3, R3.1, etc.) is independently re-runnable. Each phase reads and writes typed artifact files (JSON manifests, audio files, image directories) so agents are loosely coupled.
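One way to picture the phase-based design is a runner in which each phase loads its upstream artifacts from disk and writes its own manifest. The sketch below is hypothetical: the `run_phase` helper and the one-JSON-file-per-phase layout are assumptions. Only the phase IDs and the idea of typed artifact files come from the source.

```python
import json
from pathlib import Path


def run_phase(phase_id, inputs, work_dir, fn):
    """Run one phase: read upstream manifests, call fn, write this phase's manifest."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    # Assumed layout: each phase owns a single manifest named after its ID.
    loaded = {
        dep: json.loads((work_dir / f"{dep}.json").read_text()) for dep in inputs
    }
    result = fn(loaded)
    (work_dir / f"{phase_id}.json").write_text(json.dumps(result, indent=2))
    return result
```

Because a phase depends only on files already on disk, a step such as W4.3 can be re-run on its own without repeating W2.1.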
Integrated tools: Live2D, Fish Audio, SadTalker, and others for asset generation and rendering.
📖 Read the full source: r/ClaudeAI