Running Claude Code as a Pure Judgment Engine Across the Full SDLC

A developer on r/ClaudeAI described a setup, refined over several months, that uses Claude Code (the runtime with tool use and a multi-turn loop) across the full software development lifecycle: tickets, cross-repo implementation, code review, merge requests (MRs), and a persistent knowledge layer.
Key architectural decision: Keep Claude Code out of orchestration. Plain Python handles all mechanical work: Jira API calls, git operations, test runs, lint, file moves. Claude Code is only invoked for judgment — writing code, evaluating review findings, choosing between architectural options. The author found that mixing the two (letting the agent orchestrate via tool use) made the first version slow, expensive, and non-deterministic.
Concrete lifecycle of one ticket:
- Python orchestrator: Pulls the Jira ticket, searches the local wiki for related architectural decisions, sets up a worktree on a fresh branch, assembles a 30–50 line implementation brief (acceptance criteria, target files, callers of modified shared functions, relevant standards). Outputs a JSON bundle.
- Claude Code: Reads the brief and writes the code. This is the only step with significant token consumption.
- Python + review subagent: Runs tests, lint, format. On failure, hands back to the implementation agent (max 3 retries). Then dispatches a code-review subagent configured with no Edit or Write permissions — it can only read and report findings.
- Python: Creates a proposal in a dashboard. After manual approval, the orchestrator pushes and creates the MR.
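The check-and-retry step of this lifecycle can be sketched roughly as follows. This is a minimal illustration, not the author's code: `implement_with_retries`, `CheckResult`, and the two injected callables are hypothetical names, and reading "max 3 retries" as one initial attempt plus up to three more is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

MAX_RETRIES = 3  # the post mentions a cap of three retries


@dataclass
class CheckResult:
    passed: bool
    report: str  # combined test/lint/format output (hypothetical shape)


def implement_with_retries(
    brief: str,
    implement: Callable[[str], None],      # the judgment step (e.g. a Claude Code call)
    run_checks: Callable[[], CheckResult],  # deterministic step: tests, lint, format
    max_retries: int = MAX_RETRIES,
) -> tuple[bool, int]:
    """Alternate the judgment step with deterministic checks.

    Returns (passed, number_of_agent_calls).
    """
    prompt = brief
    calls = 0
    # Assumption: one initial attempt plus up to max_retries retries.
    for _ in range(1 + max_retries):
        implement(prompt)
        calls += 1
        result = run_checks()
        if result.passed:
            return True, calls
        # Hand the failure report back to the agent together with the brief.
        prompt = f"{brief}\n\nThe previous attempt failed these checks:\n{result.report}"
    return False, calls
```

The loop itself stays in plain Python, so the retry count and the checks are deterministic; only `implement` consumes tokens.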
Specific Claude-Code techniques that mattered:
- Subagent isolation. The review agent runs in its own context window with a deny-list (Edit, Write). Splitting review and implementation caught behavioral changes in shared code that the implementation agent kept missing.
- Pre-assembled briefs beat dynamic exploration. Early on, letting Claude Code explore the codebase before implementation ate noticeably more tokens than handing it a focused brief assembled by Python (Jira fetch, wiki search, dependency analysis).
- Skill/command routing via YAML rather than letting the agent decide. The mapping from /ticket, /review, /standup, etc. to orchestrators is explicit, so capabilities are inspectable instead of emergent.
- Hooks gate commits. A pre-commit hook runs lint and format before any commit Claude Code attempts. Violations block the commit; the agent must fix them.
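Explicit command routing might look something like the sketch below. The post keeps the mapping in YAML; to stay dependency-free this sketch uses a plain dict instead, and the handler functions (`run_ticket`, `run_review`, `run_standup`) and `dispatch` are hypothetical stand-ins for the real orchestrators.

```python
from __future__ import annotations

from typing import Callable


# Hypothetical orchestrator entry points; the real ones would do the
# Jira/git/test work described in the post.
def run_ticket(arg: str) -> str:
    return f"ticket orchestrator: {arg}"


def run_review(arg: str) -> str:
    return f"review orchestrator: {arg}"


def run_standup(arg: str) -> str:
    return f"standup orchestrator: {arg}"


# The whole capability surface is visible in one table
# (in the post this mapping lives in a YAML file).
ROUTES: dict[str, Callable[[str], str]] = {
    "/ticket": run_ticket,
    "/review": run_review,
    "/standup": run_standup,
}


def dispatch(command_line: str) -> str:
    """Look up the command in the table; never ask the agent to choose."""
    name, _, arg = command_line.strip().partition(" ")
    try:
        handler = ROUTES[name]
    except KeyError:
        raise ValueError(f"unknown command {name!r}; known: {sorted(ROUTES)}") from None
    return handler(arg.strip())
```

An unknown command fails loudly instead of being interpreted by the model, which is what makes the capabilities inspectable.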
Wiki layer: Markdown pages with three confidence tiers (verified, inferred, human-provided) and field-level staleness thresholds. Without the tiering, agents treat their own past inferences as truth and compound hallucinations into authoritative-looking knowledge.
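One way to model the tiers and per-field staleness is sketched here. The three tier names come from the post; everything else (`WikiField`, `is_stale`, `usable_as_fact`, and the policy that inferred fields are never treated as fact) is a hypothetical illustration rather than the author's schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Confidence(Enum):
    VERIFIED = "verified"
    INFERRED = "inferred"
    HUMAN_PROVIDED = "human-provided"


@dataclass
class WikiField:
    name: str
    value: str
    confidence: Confidence
    last_checked: date
    max_age: timedelta  # staleness threshold set per field

    def is_stale(self, today: date) -> bool:
        return today - self.last_checked > self.max_age


def usable_as_fact(field: WikiField, today: date) -> bool:
    """Assumed policy: only fresh, non-inferred fields count as fact.

    Inferred fields can still be shown to an agent, but flagged as
    guesses so they are not compounded into later conclusions.
    """
    return field.confidence is not Confidence.INFERRED and not field.is_stale(today)
```

For instance, a field the agent inferred yesterday would fail `usable_as_fact` even though it is fresh, while a verified field past its `max_age` would fail until someone re-checks it.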
Struggles still being tackled:
- Cross-repo features: the agent loses coherence when a feature spans services, even with structured change-set tracking.
- Vague tickets: the agent produces reasonable but often wrong implementations from ambiguous specs. The author now flags ambiguous tickets as blockers.
- Scope creep: the agent's instinct to over-engineer requires constant calibration via standards and the review agent.
- Long sessions: earlier context falls out of effective attention; session-start re-initialization mitigates but doesn't eliminate it.
📖 Read the full source: r/ClaudeAI