How to Use a Local LLM as a Claude Code Subagent

A developer on r/LocalLLaMA demonstrates how to use Claude Code to delegate tasks to a local LLM running via LM Studio, reducing Claude's context usage by keeping file content local.

How It Works

The system uses a small Python script (~120 lines, standard library only) that runs an agent loop:

You pass Claude a task description without file content
The script sends it to LM Studio's /v1/chat/completions endpoint with read_file and list_dir tool definitions
The local model calls those tools itself to read the files it needs
The loop continues until it produces a final answer
Claude sees only the result, not the file content

Example Usage

python3 agent_lm.py --dir /path/to/project "summarize solar-system.html"
# [turn 1] → read_file({'path': 'solar-system.html'})
# [turn 2] → This HTML file creates an interactive animated solar system...

The file content goes into the local model's context (tested with Qwen3.5 35B 4-bit via MLX on Apple Silicon), not Claude's.

What It's Good For

Code summarization and explanation
Bug finding
Boilerplate / first-draft generation
Text transformation and translation (tested with Hebrew)
Logic tasks and reasoning (use --think flag for harder problems)

What It's Not Good For

Tasks that require Claude's full context, such as multi-file understanding where relationships matter
Tasks needing the current conversation history
Anything where accuracy is critical

The author describes it as "a Haiku-tier assistant, not a replacement."

Setup

LM Studio running locally with the API server enabled
One Python script for the agent loop, one for simple prompt-only queries
Both wired into a global ~/.claude/CLAUDE.md so Claude Code knows to offer delegation when relevant
No MCP server, no pip dependencies, no plugin infrastructure needed
Recommendation: Add {%- set enable_thinking = false %} to the top of the jinja template - for most tasks this saves time and tokens without quality degradation

The author notes they had Claude help write the post but with supervision and corrections, and is happy to share the scripts if there's interest.

📖 Read the full source: r/LocalLLaMA