Exploring Step 3.5 Flash: Open-Source Model for Fast Deep Reasoning

Step 3.5 Flash is an open-source foundation model built for fast, reliable deep reasoning. It uses a sparse Mixture of Experts (MoE) architecture that activates only 11 billion of its 196 billion parameters per token, roughly 5.6% of the total. This selective activation gives it high "intelligence density": it aims to compete with top proprietary models while staying responsive enough for real-time interaction.
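To make the sparse-activation idea concrete, here is a minimal sketch in plain Python. The parameter counts come from the article; the router itself is a generic top-k MoE illustration, and the expert count, `k`, and the logits are hypothetical values chosen for the example, not details of Step 3.5 Flash.

```python
import math

TOTAL_PARAMS_B = 196.0   # total parameters in billions (from the article)
ACTIVE_PARAMS_B = 11.0   # parameters activated per token in billions (from the article)


def active_fraction(active_b, total_b):
    """Share of the model's parameters that participate in one token's forward pass."""
    return active_b / total_b


def route_top_k(router_logits, k):
    """Generic top-k routing: keep the k highest-scoring experts and
    normalize their scores with a softmax over the selected experts only.

    Returns a dict mapping expert index -> mixing weight.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


print(f"Active share per token: {active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.1%}")

# Hypothetical router scores for 8 experts; only 2 are selected for this token.
toy_logits = [0.1, 2.0, -1.0, 0.7, 1.5, -0.3, 0.0, 0.9]
print(route_top_k(toy_logits, k=2))
```

Every expert's weights still have to be stored, but only the selected experts run for a given token, which is why per-token compute tracks the 11 billion figure rather than the 196 billion one.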
Deep Reasoning and Speed
The model incorporates 3-way Multi-Token Prediction (MTP-3), which lets it generate 100 to 300 tokens per second in typical use and peak at 350 tokens per second on single-stream coding tasks. That throughput suits complex, multi-step reasoning that still needs to respond quickly.
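As a rough illustration of what those throughput figures mean in practice, the sketch below converts tokens per second into wall-clock generation time. The 3,000-token response length is a hypothetical example, and the estimate ignores prompt processing and network latency.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Estimated decode time for num_tokens at a steady tokens_per_second rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


RESPONSE_TOKENS = 3000  # hypothetical length of a long reasoning trace

# Throughput figures quoted in the article: typical range and single-stream peak.
for rate in (100, 300, 350):
    seconds = generation_seconds(RESPONSE_TOKENS, rate)
    print(f"{rate} tok/s -> {seconds:.1f} s for {RESPONSE_TOKENS} tokens")
```

At these rates a reasoning trace of a few thousand tokens finishes in seconds rather than minutes, which is what makes long chains of thought usable in interactive settings.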
Performance in Coding and Agent Tasks
Step 3.5 Flash performs strongly on agentic tasks, backed by a scalable reinforcement learning framework intended to support continued self-improvement. It scored 74.4% on SWE-bench Verified and 51.0% on Terminal-Bench 2.0, results that reflect its ability to handle sophisticated, long-horizon tasks.
Efficient Long Context Processing
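The model supports a 256K context window using a 3:1 Sliding Window Attention (SWA) ratio: three SWA layers for every full-attention layer. Because an SWA layer attends only to a fixed window of recent tokens, its attention cost and cache do not grow with the full context length, which reduces computational overhead compared with models that use full attention in every layer.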
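The effect of the 3:1 ratio can be sketched by counting how many token positions each layer has to keep in its attention cache. The 3:1 ratio and 256K context come from the article; the window size, the layer count, and the placement of the full-attention layer at the end of each group of four are assumptions made purely for illustration.

```python
CONTEXT_LEN = 256 * 1024  # 256K context, assuming 256K means 262,144 tokens
WINDOW = 512              # hypothetical sliding-window size (not stated in the article)
NUM_LAYERS = 8            # hypothetical layer count, kept small for readability


def layer_pattern(num_layers, swa_per_full=3):
    """Label each layer 'swa' or 'full' so that swa_per_full SWA layers
    precede every full-attention layer (the ordering is an assumption)."""
    group = swa_per_full + 1
    return ["full" if (i + 1) % group == 0 else "swa" for i in range(num_layers)]


def cached_positions(pattern, context_len, window):
    """Total token positions held in the attention cache across all layers.
    Full-attention layers keep the whole context; SWA layers keep only the window."""
    return sum(context_len if kind == "full" else min(window, context_len)
               for kind in pattern)


pattern = layer_pattern(NUM_LAYERS)
hybrid = cached_positions(pattern, CONTEXT_LEN, WINDOW)
dense = cached_positions(["full"] * NUM_LAYERS, CONTEXT_LEN, WINDOW)

print(pattern)
print(f"hybrid / all-full cached positions: {hybrid / dense:.3f}")
```

Under these assumptions the hybrid stack caches roughly a quarter of what an all-full-attention stack would at the same context length, since only one layer in four scales with the full context.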
Local Deployment and Accessibility
Step 3.5 Flash is designed for straightforward local deployment and can run on high-end consumer hardware such as the Mac Studio M4 Max and NVIDIA DGX Spark. Running locally keeps data on the device, preserving privacy without sacrificing performance.
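When sizing hardware for local deployment, a useful first check is how much memory the weights alone occupy at different precisions. The sketch below uses the 196 billion parameter count from the article; the bit widths are common quantization levels chosen as assumptions, since the article does not say which precision is used, and the estimate excludes the attention cache, activations, and runtime overhead.

```python
TOTAL_PARAMS = 196e9  # total parameter count (from the article)


def weight_gigabytes(num_params, bits_per_weight):
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical precisions; the article does not specify a quantization scheme.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_gigabytes(TOTAL_PARAMS, bits):.0f} GB")
```

Note that all 196 billion parameters must be resident even though only 11 billion are active per token, so memory capacity, not per-token compute, is usually the limiting factor on a single machine.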
📖 Read the full source: HN AI Agents