LLM Circuit Finder: Duplicate 3 layers to boost reasoning without training

The llm-circuit-finder toolkit implements and extends David Ng's RYS method to discover and exploit 'reasoning circuits' hidden inside transformer models. The core finding: certain contiguous blocks of layers act as indivisible cognitive units. Duplicating them in the forward pass - same weights, no training, no merging - makes models measurably smarter on specific capabilities.
Key Results
Devstral-Small-2-24B with layers 12, 13, 14 duplicated once:
- BBH Logical Deduction: 0.22 → 0.76 (+245%)
- GSM8K (strict): 0.48 → 0.64 (+33%)
- MBPP (code gen): 0.72 → 0.78 (+8%)
- Average improvement: +8% across all metrics, with no metric degraded
Qwen2.5-Coder-32B with layers 7, 8, 9 duplicated once:
- Reasoning probe (causal + logic + nav): 76.5% → 94.1% (+23%)
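The percentages in parentheses are relative gains over the baseline score, not percentage-point differences. A minimal sketch of that arithmetic, using only the scores quoted above (the helper name `relative_gain` is ours, not part of the toolkit):

```python
def relative_gain(before: float, after: float) -> float:
    """Return the relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Scores copied from the results listed above.
results = {
    "BBH Logical Deduction": (0.22, 0.76),
    "GSM8K (strict)": (0.48, 0.64),
    "MBPP (code gen)": (0.72, 0.78),
    "Qwen reasoning probe": (76.5, 94.1),
}

for name, (before, after) in results.items():
    print(f"{name}: {relative_gain(before, after):+.0f}%")
```

Rounded to whole percents, this reproduces the +245%, +33%, +8% and +23% figures.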
How It Works
Transformers organize themselves during training into functional circuits - multi-layer processing units that perform complete cognitive operations. These circuits are indivisible: duplicating a single layer does almost nothing, but duplicating the right block of 3-4 layers gives the model a second pass through its reasoning pipeline.
Different models have different circuits in different places:
- Devstral-24B (40 layers): reasoning circuit at layers 12-14
- Qwen2.5-32B (64 layers): reasoning circuit at layers 7-9
The boundaries are sharp. Shift the block by one layer in either direction and the improvement disappears or inverts.
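To make "duplicating a block in the forward pass" concrete, here is a toy sketch. It is not the toolkit's code: the layers are stand-in functions, and `duplicated_order` and `forward` are hypothetical names. The point it illustrates is that a repeated index reuses the same stored layer, so no weights are added or changed.

```python
from typing import Callable, List

Layer = Callable[[float], float]


def duplicated_order(n_layers: int, start: int, end: int) -> List[int]:
    """Execution order with the block [start, end] (inclusive) run twice."""
    if not 0 <= start <= end < n_layers:
        raise ValueError("block must lie inside the layer stack")
    block = list(range(start, end + 1))
    return list(range(0, end + 1)) + block + list(range(end + 1, n_layers))


def forward(layers: List[Layer], order: List[int], x: float) -> float:
    """Apply layers in `order`; a repeated index reuses the same layer object."""
    for i in order:
        x = layers[i](x)
    return x


# Toy stand-ins for transformer blocks (assumption): layer i just adds i.
toy_layers: List[Layer] = [lambda x, i=i: x + i for i in range(40)]

order = duplicated_order(n_layers=40, start=12, end=14)
print(len(order))    # 43 steps executed from 40 stored layers
print(order[12:18])  # [12, 13, 14, 12, 13, 14]
print(forward(toy_layers, order, 0.0))
```

Shifting `start` and `end` by one gives the neighbouring blocks that, per the article, lose or invert the improvement.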
Different Duplication Patterns Create Different Modes
Same weights on disk, same VRAM for the base model, just different routing:
- Double-pass 13-16: Math ↑↑, EQ ↑
- Triple-pass 13-16: Math ↑, EQ ↑↑
- Interleaved 13,13,14,14,15,15,16: Math ↑↑↑, EQ ↓ (pure math mode)
- Quadruple-pass 13-16: Math —, EQ ↑↑ (EQ mode, math neutral)
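The patterns above differ only in the order in which layer indices are visited. The sketch below generates such orders under two assumptions that the article does not state for this list: a 40-layer model (as in the triple-pass command in Quick Start), and an interleaved pattern that runs each selected layer twice back to back. The function names are hypothetical.

```python
from typing import Iterable, List


def multi_pass(n_layers: int, start: int, end: int, passes: int) -> List[int]:
    """Run the block [start, end] (inclusive) `passes` times in a row."""
    if passes < 1:
        raise ValueError("passes must be at least 1")
    block = list(range(start, end + 1))
    return list(range(start)) + block * passes + list(range(end + 1, n_layers))


def interleaved(n_layers: int, doubled: Iterable[int]) -> List[int]:
    """Run every layer once, except layers in `doubled`, which run twice in succession."""
    doubled_set = set(doubled)
    order: List[int] = []
    for i in range(n_layers):
        order.append(i)
        if i in doubled_set:
            order.append(i)
    return order


double_pass = multi_pass(40, 13, 16, passes=2)
triple_pass = multi_pass(40, 13, 16, passes=3)
math_mode = interleaved(40, doubled=[13, 14, 15])

print(len(double_pass), len(triple_pass), len(math_mode))  # 44 48 43
print(math_mode[13:20])  # [13, 13, 14, 14, 15, 15, 16]
```

Each extra pass adds compute per token, while the stored weights stay the same.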
Quick Start
Find circuits in your model:
pip install gguf requests tqdm
python sweep.py \
--model /path/to/model.gguf \
--llama-server /path/to/llama-server \
--tmpdir /dev/shm/rys \
--results pass.jsonl \
--block-sizes 3 4 5 \
--stride 1 \
--start-min 10 --start-max 20 \
--skip-baseline \
--port 8099 \
--server-args --device Vulkan1,Vulkan2
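For a sense of how large this search is, the sketch below enumerates the candidate blocks implied by those flags. It is one plausible reading, not sweep.py's actual logic: it assumes `--start-max` is inclusive, that a block of size k starting at s covers layers s through s+k-1, and that the model has 40 layers. The function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def sweep_candidates(
    block_sizes: Sequence[int],
    start_min: int,
    start_max: int,
    stride: int,
    n_layers: int,
) -> List[Tuple[int, int]]:
    """List (first_layer, last_layer) blocks, assuming an inclusive start_max."""
    candidates = []
    for size in block_sizes:
        for start in range(start_min, start_max + 1, stride):
            end = start + size - 1
            if end < n_layers:  # skip blocks that would run past the last layer
                candidates.append((start, end))
    return candidates


cands = sweep_candidates([3, 4, 5], start_min=10, start_max=20, stride=1, n_layers=40)
print(len(cands))          # 33 candidate blocks
print((12, 14) in cands)   # True: the Devstral block falls inside this window
```

Under these assumptions the sweep evaluates 33 modified models.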
Apply a known circuit:
# Duplicate layers 12-14 in Devstral
python layer_path.py model.gguf improved.gguf \
-p "0..14,12,13,14,15..39" -v
# Duplicate layers 7-9 in Qwen2.5-32B
python layer_path.py model.gguf improved.gguf \
-p "0..9,7,8,9,10..63" -v
# Triple-pass example
python layer_path.py model.gguf experiment.gguf \
-p "0..16,13,14,15,16,13,14,15,16,17..39" -v
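The path strings passed to `-p` can be checked before a new GGUF is written. The parser below is a sketch of how such a string might expand, not the parser in layer_path.py. It assumes `a..b` is an inclusive range, which is inferred from `0..39` covering a 40-layer model, and that commas separate entries. Both function names are hypothetical.

```python
from collections import Counter
from typing import List


def parse_layer_path(spec: str) -> List[int]:
    """Expand a path such as "0..14,12,13,14,15..39" into a list of layer indices."""
    order: List[int] = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        if ".." in token:
            first, last = (int(part) for part in token.split(".."))
            if last < first:
                raise ValueError(f"descending range: {token!r}")
            order.extend(range(first, last + 1))  # assumed inclusive on both ends
        else:
            order.append(int(token))
    return order


def repeated_layers(order: List[int]) -> List[int]:
    """Return the sorted layer indices that appear more than once."""
    counts = Counter(order)
    return sorted(i for i, n in counts.items() if n > 1)


devstral = parse_layer_path("0..14,12,13,14,15..39")
print(len(devstral), repeated_layers(devstral))  # 43 [12, 13, 14]

qwen = parse_layer_path("0..9,7,8,9,10..63")
print(len(qwen), repeated_layers(qwen))          # 67 [7, 8, 9]
```

Printing the repeated layers is a quick way to confirm that a path duplicates the intended block and nothing else.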
Validate with established benchmarks:
# Start the server with modified model
llama-server -m improved.gguf --port 8089 -ngl 99 --device Vulkan1,Vulkan2
# Run lm-evaluation-harness
The entire process (sweep, circuit discovery, and validation) ran on two AMD consumer GPUs (RX 7900 XT + RX 6950 XT) in one evening.
📖 Read the full source: HN LLM Tools