OpenClaw Users Report Model Replacements After Anthropic Ban

Community Leaderboard and Model Preferences
According to community votes tracked at pricepertoken.com/leaderboards/openclaw as of April 5 (net = upvotes minus downvotes):
- Kimi K2.5 -- +49 net (54 up, 5 down). $0.38/M input tokens. Native OpenClaw support added. 13x cheaper than Opus.
- GLM 4.7 -- +20 net (25 up, 5 down). $0.39/M input. Budget cloud pick.
- Gemini 3 Flash Preview -- +18 net (21 up, 3 down). This is the Flash variant, not Pro: voters are choosing speed and cost over raw quality.
- Claude Opus 4.5 -- +18 net (32 up, 14 down). The most downvotes of any model listed here, which makes it the most controversial entry. People are split on paying API rates for what used to be included in their subscription.
- Claude Opus 4.6 -- +17 net (19 up, 2 down).
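The net scores above can be reproduced from the raw vote counts. The sketch below is illustrative only: the `Votes` class and the `downvote_share` metric are assumptions made here to quantify "controversial", not something the leaderboard itself publishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Votes:
    model: str
    up: int
    down: int

    @property
    def net(self) -> int:
        # Net score as defined in the article: upvotes minus downvotes.
        return self.up - self.down

    @property
    def downvote_share(self) -> float:
        # Hypothetical "controversy" measure: fraction of all votes that are downvotes.
        total = self.up + self.down
        return self.down / total if total else 0.0


# Vote counts copied from the list above (as of April 5).
LEADERBOARD = [
    Votes("Kimi K2.5", 54, 5),
    Votes("GLM 4.7", 25, 5),
    Votes("Gemini 3 Flash Preview", 21, 3),
    Votes("Claude Opus 4.5", 32, 14),
    Votes("Claude Opus 4.6", 19, 2),
]


def most_contested(entries):
    # Entry with the highest share of downvotes.
    return max(entries, key=lambda v: v.downvote_share)


if __name__ == "__main__":
    for v in sorted(LEADERBOARD, key=lambda v: v.net, reverse=True):
        print(f"{v.model}: net {v.net:+d}, downvote share {v.downvote_share:.0%}")
```

By this measure Claude Opus 4.5 is the most contested entry, with roughly 30% of its votes being downvotes.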
Most-Adopted Replacement: GPT-5.x
OpenAI officially added GPT-5.4 support to OpenClaw and offers 1M free tokens/day on the data-sharing tier. Codex subscriptions explicitly allow third-party tool usage.
Matthew Berman's full stack is now GPT 5.3 Codex XH for coding, GPT 5.2 as the default, and GPT 5 Mini for classifiers, with Claude as backup only.
Local Model Options
r/LocalLLaMA consensus includes:
- Qwen 3.5 27B -- top pick. 72.4% SWE-bench (matches GPT-5 mini). Zen van Riel is running the 35B version at 100-140 tok/s on an RTX 5090.
- Qwen3-Coder:32B -- "extremely stable tool calling"
- Llama 4 -- default for broad-purpose local deployments
- Devstral-24B -- recommended primary with GLM-4.7 flash as fallback
Ollama became an official OpenClaw provider in March. One person on X is running Qwen 3.5 + OpenClaw + Ollama completely free.
Hybrid Setups and Cost Considerations
The people who seem happiest aren't all-in on one model. They're running tiered stacks:
- Expensive model (Claude/GPT) for complex reasoning
- Cheap model (DeepSeek at $0.14/M, Kimi, GLM) for routine ops
One user built a Kubernetes gateway across 5 Raspberry Pis routing Claude, GPT, Gemini, and DeepSeek behind one API.
@0xzak shared a specific config: DeepSeek for routine work ($0.14/$1.10) vs Sonnet for complex work ($3/$15), with contextTokens set to 120k rather than 150k.
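A tiered stack like this can be sketched as a small router plus a cost estimate. This is a minimal illustration, not OpenClaw's actual configuration format: it assumes the quoted price pairs are USD per million input/output tokens, and it treats `CONTEXT_TOKENS` as a simple cap on input size. The `Tier` class, `pick_tier`, and `request_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    input_per_m: float   # assumed: USD per million input tokens
    output_per_m: float  # assumed: USD per million output tokens

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        return (input_tokens * self.input_per_m
                + output_tokens * self.output_per_m) / 1_000_000


# Prices as quoted in the shared config.
ROUTINE = Tier("deepseek", 0.14, 1.10)
COMPLEX = Tier("sonnet", 3.00, 15.00)

# 120k rather than 150k, per the shared config; modeled here as an input cap.
CONTEXT_TOKENS = 120_000


def pick_tier(is_complex: bool) -> Tier:
    # Send complex reasoning to the expensive model, everything else to the cheap one.
    return COMPLEX if is_complex else ROUTINE


def request_cost(input_tokens: int, output_tokens: int, is_complex: bool) -> float:
    # Input beyond the context cap is assumed to be trimmed before sending.
    capped = min(input_tokens, CONTEXT_TOKENS)
    return pick_tier(is_complex).cost(capped, output_tokens)


if __name__ == "__main__":
    for label, is_complex in (("routine", False), ("complex", True)):
        usd = request_cost(120_000, 2_000, is_complex)
        print(f"{label}: ${usd:.4f} per full-context request")
```

Under these assumptions, the smaller context cap directly lowers the worst-case input cost of every request, which is presumably why the config lowers it.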
Workarounds and Migration Tools
OpenClaw bypassed the OAuth ban by piping through the local Claude CLI binary instead of OAuth tokens. Pete himself recommended this method.
Oh-My-Codex (OmX), a workflow/orchestration layer for OpenAI's Codex CLI, gained 13K stars in the same week as the ban.
Impact and Cost Changes
60% of active OpenClaw sessions were reportedly running on subscription credits before the ban.
The cost jump is brutal. Heavy users went from ~$200/month flat to ~$675/month at API rates. Some automated sessions were hitting $1,000-$5,000/day.
Anthropic is offering a one-time credit + 30% on prepaid bundles, but the sentiment everywhere is that it's too little.
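To put the reported figures in proportion, here is a quick back-of-the-envelope calculation. The 30-day month is an assumption, the function names are illustrative, and the inputs are the approximate numbers quoted above.

```python
def monthly_increase(before: float, after: float) -> tuple[float, float]:
    """Return (multiplier, extra dollars per month)."""
    return after / before, after - before


def monthly_range(daily_low: float, daily_high: float, days: int = 30) -> tuple[float, float]:
    """Scale a daily spend range to a month (30 days assumed)."""
    return daily_low * days, daily_high * days


if __name__ == "__main__":
    multiplier, extra = monthly_increase(200, 675)
    print(f"Heavy users: {multiplier:.2f}x, +${extra:,.0f}/month")

    low, high = monthly_range(1_000, 5_000)
    print(f"Automated sessions: ${low:,.0f}-${high:,.0f} per 30 days")
```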
📖 Read the full source: r/openclaw