OmniCoder-9B: A 9B-Parameter Coding Agent Fine-Tuned on 425K Agentic Trajectories

Tesslate has released OmniCoder-9B, a 9-billion-parameter coding agent model fine-tuned from Qwen3.5-9B. It keeps the base model's hybrid architecture, which interleaves Gated Delta Networks with standard attention.
Training Data and Sources
The model was trained on more than 425,000 curated agentic coding trajectories spanning real-world software engineering tasks. The training data was built from Claude Opus 4.6 agentic and coding reasoning traces and targets the scaffolding patterns used by:
- Claude Code
- OpenCode
- Codex
- Droid
The dataset includes successful trajectories from models like Claude Opus 4.6, GPT-5.4, GPT-5.3-Codex, and Gemini 3.1 Pro.
Key Features
- Trained on Frontier Agent Traces: Built from Claude Opus 4.6, GPT-5.3-Codex, GPT-5.4, and Gemini 3.1 Pro agentic coding trajectories across Claude Code, OpenCode, Codex, and Droid scaffolding
- Hybrid Architecture: Inherits Qwen3.5's Gated Delta Networks interleaved with standard attention for efficient long-context processing
- 262K Native Context: Full 262,144 token context window, extensible to 1M+
- Error Recovery: Learns read-before-write patterns, responds to LSP diagnostics, and applies minimal edit diffs instead of full rewrites
- Thinking Mode: Supports <think>...</think> reasoning chains for complex problem decomposition
- Apache 2.0: Fully open weights under a permissive license
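As an illustration of the Thinking Mode item above, here is a minimal sketch of how a client might separate the `<think>...</think>` reasoning from the final answer in raw model output. The helper name `split_thinking` and the sample string are hypothetical and not part of any Tesslate or Qwen tooling. The sketch assumes the tags appear literally in the decoded text.

```python
import re

# Matches one <think>...</think> block; DOTALL lets the reasoning span several lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output.

    Hypothetical helper: assumes the tags are present verbatim in the text.
    """
    # Collect the contents of every reasoning block, one per line group.
    reasoning = "\n".join(part.strip() for part in THINK_RE.findall(text))
    # Whatever remains after removing the blocks is treated as the answer.
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


# Made-up sample output, for illustration only.
raw = (
    "<think>\nThe loop bound is off by one.\n</think>\n"
    "Change range(n) to range(n + 1)."
)
reasoning, answer = split_thinking(raw)
print("reasoning:", reasoning)
print("answer:", answer)
```

Output with no tags is returned unchanged as the answer, with an empty reasoning string, so the same code path can handle responses generated with thinking disabled.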
Agentic Behavior
The model's agentic behavior is learned directly from the real-world agent trajectories in its training data. It reads a file before writing to it, responds to LSP diagnostics, and applies targeted edit diffs instead of rewriting whole files.
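The read-before-write and minimal-diff patterns described above can be sketched as a small edit tool of the kind an agent scaffold might expose. This is an illustrative assumption, not the interface of Claude Code, OpenCode, Codex, or Droid. The function name `apply_edit` and its exactly-one-match rule are hypothetical.

```python
import difflib
import tempfile
from pathlib import Path


def apply_edit(path: Path, old: str, new: str) -> str:
    """Replace a single occurrence of `old` with `new`; return a unified diff.

    Hypothetical tool: reads the file first and refuses ambiguous edits.
    """
    before = path.read_text()  # read before write: never edit blind
    if before.count(old) != 1:
        raise ValueError("edit target must match exactly once")
    after = before.replace(old, new, 1)
    path.write_text(after)
    # Report only the changed lines rather than the whole rewritten file.
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path.name}",
        tofile=f"b/{path.name}",
    )
    return "".join(diff)


# Usage on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "config.py"
    target.write_text("DEBUG = True\nPORT = 8000\n")
    print(apply_edit(target, "PORT = 8000", "PORT = 8080"))
```

Requiring exactly one match forces the caller to look at the current file contents before editing, which is the behavior the read-before-write pattern is meant to encourage.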
The model is available at https://huggingface.co/Tesslate/OmniCoder-9B.
📖 Read the full source: r/LocalLLaMA