Slash Claude costs 60x by offloading mechanical tasks to DeepSeek V4 Flash via MCP

A Reddit user analyzed their Claude usage and found the bulk of it went to mechanical tasks: classifying files, reformatting JSON, pulling fields from text, and summarizing docs they'd skim anyway. None of that needed Sonnet. The fix: a small cheap model running as a side worker via MCP, plus a single rule in CLAUDE.md telling Claude not to do those tasks.
Setup: an MCP tool + CLAUDE.md deny-list
The setup uses a single MCP tool that sends text and gets text back. Default model is DeepSeek V4 Flash (cheap, 1M context). The endpoint is one config line and works with any OpenAI-compatible provider (local ollama, vllm, lm studio). The repo is github.com/arizen-dev/deepseek-mcp (MIT, Python 3.10+).
The critical piece: the CLAUDE.md rule uses negative framing — a deny list, not a permission list. The user reports positive framing ("use DeepSeek for X") got ignored ~30% of the time. The deny list approach catches it reliably.
# In CLAUDE.md:
# do NOT use Claude for:
# - json formatting
# - field extraction
# - file classification
# - summarization you will review anyway
Results: 60x cost reduction
Over 3 weeks of real usage: 217 mechanical calls offloaded to DeepSeek V4 Flash, total spend $0.41. Same workload on Sonnet would have been roughly $7. That's a ~17x multiplier on just those tasks, and the user says overall bill dropped 60x when factoring in heavier tasks still on Sonnet.
How the side worker operates
The side worker is a supervised tool, not an agent — no tool calls, no file access, no chains. Latency is 3–25 seconds. You review the output. The whole shape is: send text, get text back, review, move on.
Who it's for
Developers using Claude API or Claude Code who want to cut spend on high-volume mechanical tasks without losing Sonnet's reasoning for complex work.
📖 Read the full source: r/ClaudeAI
👀 See Also

Bug Hunt: WireGuard Crashes and MTU Mismatch in GKE
Lovable engineers traced user errors to anetd crashes from a concurrent map access panic in Google's WireGuard integration, then found a secondary MTU mismatch after disabling encryption.

OpenClaw 5.28: Codex Plugin Broken After Upgrade — Fix with Symlink Shim
OpenClaw 5.28 breaks Codex plugin due to binary path mismatch. Fix: create symlink from expected path to actual bin/codex.

Replacing OpenClaw's Default Memory with Redis and Qdrant for Production Multi-Agent Systems
A developer replaced OpenClaw's default SQLite memory with Redis for ephemeral state and Qdrant for persistent vector memory to solve scaling issues in multi-agent setups, implementing semantic search, cross-agent sharing, and concurrent writes.

Building a serverless AI agent platform on AWS for $0.01/month with Claude Code
A developer built a complete AWS serverless platform running AI agents for approximately $0.01/month using Claude Code over 29 hours, eliminating expensive components like NAT Gateway ($32/month) and ALB ($18/month). The project includes 233 unit tests, 35 E2E tests, and deploys with a single cdk deploy command.