Claude 4.6 Adaptive Thinking: Reddit User Reports Token Waste and Provides Disable Commands

Adaptive Thinking in Claude 4.6
Claude 4.6 models introduce adaptive thinking, where the model self-calibrates how much reasoning to invest based on task complexity. Simple tasks get fast responses, while complex tasks trigger deeper thinking.
Reported Issues in Claude Code
According to the Reddit post, in Claude Code, extended thinking fires between every tool call. For iterative coding workflows like quick edits, lint fixes, and short back-and-forth exchanges, the added latency is noticeable. The user reports sometimes seeing it burn tens of thousands of expensive output tokens just thinking, often in thought loops without making useful progress.
Shell Commands to Control Adaptive Thinking
The source provides these bash commands for your shell profile:
export MAX_THINKING_TOKENS=$((1024*3))
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1
MAX_THINKING_TOKENS caps the reasoning budget at 3072 tokens per turn. CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING stops Claude Code from triggering extended thinking automatically.
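As a quick sanity check, the values can be echoed back after they are exported. This is a minimal sketch that uses only the two variable names quoted in the post. Whether Claude Code honors them is the post's claim and is not verified here.

```shell
# Set the two variables exactly as quoted in the Reddit post.
export MAX_THINKING_TOKENS=$((1024*3))   # arithmetic expansion: 1024 * 3 = 3072
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1

# Print what the shell will actually pass to child processes.
echo "MAX_THINKING_TOKENS=$MAX_THINKING_TOKENS"
echo "CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=$CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING"
```

Running this in a fresh terminal after editing the shell profile shows whether the profile was picked up.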
The user states that, in their experience, adaptive thinking feels like a waste of session limits and that 2-8k thinking tokens is usually sufficient. They invite replies from users who have found adaptive thinking necessary for tasks that could not be done without it.
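For readers who want to experiment within that 2-8k range, a small bash helper can keep the budget inside it. The function name set_thinking_budget_k is hypothetical and not from the post, and treating "k" as 1024 tokens is an assumption that follows the post's own $((1024*3)) arithmetic.

```shell
# Hypothetical helper (not from the post): set the thinking budget in units
# of 1024 tokens, clamped to the 2-8k range the user calls usually sufficient.
set_thinking_budget_k() {
  local k="${1:-3}"                      # default to 3k, matching the post
  case "$k" in
    *[!0-9]*) echo "usage: set_thinking_budget_k <integer>" >&2; return 1 ;;
  esac
  k=$((10#$k))                           # force base 10 so "08" is not read as octal
  [ "$k" -lt 2 ] && k=2                  # clamp to the lower bound
  [ "$k" -gt 8 ] && k=8                  # clamp to the upper bound
  export MAX_THINKING_TOKENS=$((k * 1024))
}

set_thinking_budget_k 4
echo "$MAX_THINKING_TOKENS"              # prints 4096
```

Clamping rather than rejecting out-of-range values is a design choice for convenience. A stricter variant could return an error instead.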
📖 Read the full source: r/ClaudeAI