Mistral Medium 3.5 128B Released: Dense Model with Configurable Reasoning and Vision

Mistral AI has released Mistral Medium 3.5 (128B), a dense transformer model that replaces Mistral Medium 3.1 and Magistral in Le Chat, and Devstral 2 in their coding agent Vibe. It's a single set of weights handling instruction-following, reasoning, and coding.
Key Features
- Dense 128B parameters — not Mixture of Experts.
- 256k context window for long inputs.
- Multimodal input: accepts text and images; outputs text only. Vision encoder trained from scratch to handle variable sizes and aspect ratios.
- Configurable reasoning effort: toggle per request between instant reply (
none) and deep reasoning (high). - Native function calling and JSON output for agentic workflows.
- Multilingual: supports English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic, and others.
- Strong system prompt adherence.
Recommended Settings
- Reasoning effort:
nonefor quick replies;highfor complex prompts and agentic usage (e.g.,reasoning_effort="high"). - Temperature: 0.7 with
highreasoning; 0.0–0.7 withnonedepending on desired creativity.
License
Released under a Modified MIT License — open-source for commercial and non-commercial use, with exceptions for large revenue companies.
GGUF Quantizations Available
Unsloth has published a GGUF version on Hugging Face: unsloth/Mistral-Medium-3.5-128B-GGUF
This model is relevant for developers running local AI coding agents, particularly those needing high-quality instruction following, reasoning, and vision in a single dense model with a large context window.
📖 Read the full source: r/LocalLLaMA
👀 See Also

NTSB Pulls Docket After AI Recreates Dead Pilots' Voices from Spectrograms
Using Codex and Griffin-Lim algorithm, users reconstructed cockpit audio from NTSB spectrograms. NTSB pulled public docket in response.

Claude Code bug: automatic git reset destroys uncommitted changes every 10 minutes
Claude Code version 2.1.87 performs git fetch origin + git reset --hard origin/main on the user's project repository every 10 minutes via programmatic git operations, silently destroying all uncommitted changes to tracked files. The issue was closed as 'not planned' by Anthropics.

Lovable offers $100 free Claude API credits for International Women's Day
Lovable is giving away $100 in Anthropic Claude API credits, $250 in Stripe fee credits, and 24-hour free access to their platform through March 8. Users need to claim the offer before 12:59 AM ET on March 9.

Autoresearch Pushes Qwen3.5-397B to 20.34 tok/s on M5 Max via SSD Streaming
A developer achieved 20.34 tokens/second inference speed for the 209GB Qwen3.5-397B model on a MacBook Pro M5 Max with 128GB RAM using SSD streaming and 36 systematic experiments. The result represents a 2x speedup over the M5 Max baseline and 4.67x over the original M3 Max result.