inclusionAI Releases Ling-2.6-1T: Hybrid Architecture Trillion-Parameter Model with Sparse Attention and Fast Thinking

✍️ OpenClawRadar | 📅 Published: April 29, 2026 | 🔗 Source

inclusionAI has open-sourced Ling-2.6-1T, the trillion-parameter flagship of its Ling model family, aimed at complex real-world tasks. The model uses a hybrid architecture that combines Multi-head Latent Attention (MLA) with Linear Attention to improve inference efficiency: the design is intended to lower latency and VRAM usage on long contexts without sacrificing expressivity.
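
To illustrate why a linear-attention layer can reduce memory on long contexts, here is a minimal sketch of generic causal linear attention in plain Python. It is not Ling-2.6-1T's implementation: the feature map `phi` (elu + 1), the function names, and the toy dimensions are all assumptions made for illustration. The recurrent form keeps a fixed-size state instead of a cache that grows with sequence length, and it produces the same outputs as the quadratic form.

```python
import math

def phi(x):
    # Assumed feature map elu(x) + 1; keeps every feature strictly positive.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def linear_attention_recurrent(qs, ks, vs):
    """Causal linear attention with a constant-size running state."""
    d_k, d_v = len(ks[0]), len(vs[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) outer v
    z = [0.0] * d_k                        # running sum of phi(k)
    outputs = []
    for q, k, v in zip(qs, ks, vs):
        fk = phi(k)
        for i in range(d_k):
            z[i] += fk[i]
            for j in range(d_v):
                S[i][j] += fk[i] * v[j]
        fq = phi(q)
        denom = sum(fq[i] * z[i] for i in range(d_k))
        outputs.append(
            [sum(fq[i] * S[i][j] for i in range(d_k)) / denom for j in range(d_v)]
        )
    return outputs

def linear_attention_quadratic(qs, ks, vs):
    """Same computation, but revisiting every past key and value per step."""
    d_v = len(vs[0])
    outputs = []
    for t, q in enumerate(qs):
        fq = phi(q)
        weights = [sum(a * b for a, b in zip(fq, phi(k))) for k in ks[: t + 1]]
        total = sum(weights)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, vs[: t + 1])) / total
             for j in range(d_v)]
        )
    return outputs

# Toy inputs: 3 positions, key dimension 2, value dimension 2.
qs = [[0.1, -0.2], [0.5, 0.3], [-1.0, 0.7]]
ks = [[0.4, 0.0], [-0.3, 0.9], [0.2, -0.5]]
vs = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
recurrent_out = linear_attention_recurrent(qs, ks, vs)
quadratic_out = linear_attention_quadratic(qs, ks, vs)
```

The state `S` and `z` have a size that depends only on the head dimensions, not on how many tokens have been processed, which is the property that makes this family of layers attractive for long contexts.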

Fast Thinking via Reward Strategy

Post-training uses a Contextual Process Redundancy Suppression reward strategy that encourages shorter, more direct outputs. This "fast thinking" mechanism reduces reliance on verbose chains of thought, cutting token overhead while, according to the release, maintaining performance.
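
The source does not describe how this reward is computed. As a purely hypothetical sketch of the general idea (rewarding correct answers while penalizing redundant or overlong output), one could shape a reward as below. The function names, the repeated n-gram heuristic, the token budget, and the weights are all assumptions and should not be read as inclusionAI's method.

```python
def repeated_ngram_fraction(tokens, n=3):
    # Share of n-grams that duplicate an earlier n-gram (0.0 means no repetition).
    if len(tokens) < n:
        return 0.0
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return 1.0 - len(set(ngrams)) / len(ngrams)

def shaped_reward(is_correct, tokens, token_budget=256,
                  redundancy_weight=0.5, length_weight=0.5):
    # Hypothetical shaping: incorrect answers earn nothing; correct answers
    # lose reward for repetition and for exceeding an assumed token budget.
    if not is_correct:
        return 0.0
    redundancy = repeated_ngram_fraction(tokens)
    overflow = max(0, len(tokens) - token_budget) / token_budget
    penalty = redundancy_weight * redundancy + length_weight * min(1.0, overflow)
    return max(0.0, 1.0 - penalty)

concise = "the answer is 42".split()
verbose = ("let me check again " * 100).split()

concise_reward = shaped_reward(True, concise)
verbose_reward = shaped_reward(True, verbose)
wrong_reward = shaped_reward(False, concise)
```

Under these assumptions, a correct and concise response keeps the full reward, while a correct but repetitive one scores lower, which nudges the policy toward shorter outputs.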

Benchmark SOTA

According to the release, Ling-2.6-1T achieves state-of-the-art results among open-source models on the following execution-heavy benchmarks:

  • AIME26 (reasoning)
  • SWE-bench Verified (software engineering)
  • BFCL-V4 (function calling)
  • TAU2-Bench (task completion)
  • IFBench (instruction following)

Agent Integration

The model is designed for end-to-end engineering workflows, from code generation to bug fixing, and integrates with mainstream agent frameworks including Claude Code, OpenClaw, OpenCode, and CodeBuddy. It is built to handle multi-tool, multi-step constraints in enterprise environments.

📖 Read the full source: r/LocalLLaMA
