Savant Commander 48B: 12 Distilled Models in MoE Qwen 3

Savant Commander 48B is a custom Mixture-of-Experts (MOE) model built on Qwen 3 architecture that combines 12 distilled models from various providers including Claude, Gemini, OpenAI, and Deepseek. The model uses hand-coded routing to isolate each distill while allowing connections between them simultaneously.

Key Features and Architecture

Based on Qwen 3 with 256K context length
4x12B MOE structure (48B total parameters)
Custom routing isolates each distilled model while maintaining inter-model connections
Prompt-controlled activation - users can select which distilled model(s) to use
Enables direct comparison between different distilled models using identical prompts

Model Variants and Availability

The project includes both regular and uncensored ("Heretic") versions. The uncensored version was created by applying the Heretic process to each individual model before adding them to the MOE structure, rather than applying it to the entire MOE.

Available GGUF formats:

Regular version: https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-GATED-12x-Closed-Open-Source-Distill-GGUF
Uncensored version: https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-Distill-12X-Closed-Open-Heretic-Uncensored-GGUF

Source repositories:

Regular: https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-GATED-12x-Closed-Open-Source-Distill
Uncensored: https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-Distill-12X-Closed-Open-Heretic-Uncensored

Practical Applications

The model's prompt-controlled routing allows developers to test and compare outputs from different distilled models using the same prompts. Command and control functions are documented in the repository card with detailed instructions.

This approach to MOE architecture provides a practical way to leverage multiple specialized models within a single inference framework, particularly useful for comparing model behaviors or selecting specific model characteristics for different tasks.

📖 Read the full source: r/LocalLLaMA

Savant Commander 48B: A Custom Qwen 3 Mixture-of-Experts Model with 12 Distilled Models

Key Features and Architecture

Model Variants and Availability

Practical Applications

👀 See Also

Routing Claude API traffic to control costs after Max subscription change

Claude Code v2.1.143: Plugin Dependency Enforcement, PowerShell Defaults, and Background Session Fixes

Project Headroom: Netflix Engineer's Open Source Tool Slashes AI Token Costs by 90%

Scaling Karpathy's Autoresearch with 16 GPUs: Results and Methods