A TDD Development Flow Using AI Agents for Website Projects

Development Workflow with AI Agents
A developer outlines their approach to website development using AI coding agents with a test-driven development (TDD) methodology. They use Claude Code for work projects and local models for private projects, specifically Qwen Code on top of Qwen3.5-27B running on llama.cpp with 2x RTX 3090 GPUs.
Initial Project Setup
At the beginning of a project, they implement basic modules:
- Basic DB schema
- Basic auth API
- UI routing
- UI basic layout
- Basic API (admins and users)
- Basic API/E2E tests (written manually or by AI)
- Context files for coding agents (AGENTS.md, CLAUDE.md)
Iterative Development Process
After setup, the iterative process begins:
- Write detailed specs of API/E2E tests in markdown for a feature
- Generate API/E2E tests from the markdown test descriptions
- Start coding agent session with ability to run tests
- Ask agent to implement functionality until tests pass
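To make the spec-to-test step concrete, here is a minimal sketch of what a markdown test description and the tests generated from it might look like. Everything in it is an assumption for illustration: the spec wording, the `/api/users` endpoint, and the in-memory `FakeUserApi` stand-in (used only so the example runs without a server) do not come from the source.

```python
# Hypothetical markdown spec, as the developer might write it by hand.
SPEC = """
## Feature: user registration
- POST /api/users with a new email returns 201 and the created user
- POST /api/users with an existing email returns 409
"""


class FakeUserApi:
    """In-memory stand-in for the real API (assumption, not from the source)."""

    def __init__(self):
        self._users = {}

    def create_user(self, email):
        # Reject duplicate emails, otherwise store and return the new user.
        if email in self._users:
            return 409, {"error": "email already registered"}
        user = {"id": len(self._users) + 1, "email": email}
        self._users[email] = user
        return 201, user


# Tests an agent might generate, one per bullet in the spec.
def test_register_new_email_returns_201():
    api = FakeUserApi()
    status, body = api.create_user("a@example.com")
    assert status == 201
    assert body["email"] == "a@example.com"


def test_register_existing_email_returns_409():
    api = FakeUserApi()
    api.create_user("a@example.com")
    status, body = api.create_user("a@example.com")
    assert status == 409
    assert "error" in body
```

In a real project the tests would call the running API or drive a browser instead of a fake, and they would fail until the agent implements the feature.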
Model Capabilities and Trade-offs
The developer notes that more capable models like Claude allow skipping the markdown files entirely for simple websites, while Qwen3.5-27B has a different threshold for when such specs become necessary. Less capable models need more specific instructions to mitigate their failure modes, for example locking down logic by telling the agent not to touch certain files or to use only specific wrappers.
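The instruction not to touch certain files can also be verified mechanically after a session. The sketch below is an assumption rather than part of the described workflow: `LOCKED_PATTERNS` and `locked_files_touched` are hypothetical names, the patterns are made up, and the list of changed paths is assumed to come from something like `git diff --name-only`.

```python
from fnmatch import fnmatch

# Hypothetical patterns for files the agent was told not to modify.
LOCKED_PATTERNS = ["tests/*", "app/db/schema.py"]


def locked_files_touched(changed_paths, locked_patterns=LOCKED_PATTERNS):
    """Return the changed paths that match any locked pattern.

    Note: fnmatch's "*" also matches "/" so "tests/*" covers nested files.
    """
    return [
        path
        for path in changed_paths
        if any(fnmatch(path, pattern) for pattern in locked_patterns)
    ]
```

If the returned list is non-empty, the session's changes to those files can be reverted or reviewed by hand before continuing.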
They hypothesize that developers shouldn't obsess over code patterns and quality as long as the code is covered by tests and works, comparing AI agents to managing 10-100 junior or mid-level developers for the price of an AI subscription.
Local Model Specifics
For local models running on 2x RTX 3090, they use Qwen3.5-27B-GGUF-Q8_0 with parallel = 1 and full context, which they believe is important so that agentic sessions are not auto-compressed early. They note that less capable models force clearer articulation of E2E tests and the desired implementation, while Claude fills in design choices automatically, which can lead to a loss of control.
Coding TDD Loop Implementation
The developer provides a draft of their coding TDD loop:
- The outer loop begins by running all tests with `pytest tests/ -x`, which stops at the first failure. The default log level is WARNING, so this produces little output.
- If everything passes, exit the outer loop. If something failed, extract the name of the failed test.
- Rerun the failed test with full logs, for example `pytest tests/../test_first_failing_test.py --log-level DEBUG`, and collect the output in a file.
- Extract the lines near 'error'/'fail' strings with `egrep -i -C 10 '(error|fail)' <fail…` (the command is truncated in the source).
This approach represents a practical implementation of TDD with AI agents, balancing automation with the oversight needed to keep control of the codebase.
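The loop above could be sketched in Python roughly as follows. This is a hedged illustration, not the developer's script: the draft is cut off after the grep step, so handing the extracted lines to the agent is an assumption, and `ask_agent`, `max_rounds`, and the helper names are hypothetical. The parsing relies on the `FAILED <node id> - ...` lines pytest prints in its short test summary.

```python
import re
import subprocess

# Matches short-summary lines such as "FAILED tests/test_a.py::test_b - ...".
FAILED_RE = re.compile(r"^(?:FAILED|ERROR) (\S+)", re.MULTILINE)


def first_failed_test(pytest_output):
    """Return the node id of the first failed test, or None if none is listed."""
    match = FAILED_RE.search(pytest_output)
    return match.group(1) if match else None


def lines_near(text, pattern=r"error|fail", context=10):
    """Rough equivalent of `egrep -i -C 10 '(error|fail)'` on a string."""
    lines = text.splitlines()
    regex = re.compile(pattern, re.IGNORECASE)
    keep = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    return [lines[i] for i in sorted(keep)]


def run(cmd):
    """Run a command and return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def tdd_loop(ask_agent, max_rounds=20):
    """Outer loop: run tests, isolate the first failure, pass context to the agent."""
    for _ in range(max_rounds):
        code, output = run(["pytest", "tests/", "-x"])
        if code == 0:
            return True  # everything passes, exit the outer loop
        failed = first_failed_test(output)
        if failed is None:
            return False  # no failed test listed; needs a human look
        # Rerun only the failing test with full logs.
        _, debug_output = run(["pytest", failed, "--log-level", "DEBUG"])
        # Hypothetical step: hand the trimmed log to the coding agent.
        ask_agent(failed, "\n".join(lines_near(debug_output)))
    return False
```

Capping the number of rounds is a design choice added here so that a stuck agent cannot loop forever; the source does not say how the original loop terminates on repeated failure.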
📖 Read the full source: r/LocalLLaMA