Fix Ollama Cloud maxTokens: Real Cap is 16,384

PSA for anyone seeing unexpected EOF from agents on production turns: if your openclaw.json has cloud model entries like { "id": "deepseek-v4-pro:cloud", "maxTokens": 500000 }, that maxTokens isn't real. Ollama cloud caps output at 16,384 tokens server-side regardless of your config. When an agent tries to emit something past that, the upstream kills the socket mid-stream and you see a transport error from ollama.com:443. OpenClaw treats that as a timeout-shaped failover, so it'll try your fallback if configured — but if the fallback is also a :cloud model, same wall.

What Helped

Fix maxTokens on cloud entries so OpenClaw doesn't ask for output budgets the service won't honor:
{ "id": "deepseek-v4-pro:cloud", "maxTokens": 14000 }
{ "id": "kimi-k2.6:cloud", "maxTokens": 14000 }
14k not 16k — leaves a little headroom because models sometimes get weird right at the absolute cap.
Restructure large structured outputs (long JSON, multi-section content) to emit one section per turn instead of batching everything. Stays under the cap and retries are cleaner.
Route heavy agents to a direct provider via per-agent model override in agents.list[] instead of going through :cloud. Leave small-output agents on Ollama cloud. One-time setup:
openclaw onboard --auth-choice deepseek-api-key
Then in agents.list override the ones that need it:
"list": [ { "id": "your-agent", "model": "deepseek/deepseek-v4-pro" } ]
Trade-off: per-token billing instead of flat fee, but scoped to agents that need headroom.

Takeaway

If your agents fail partway through long outputs and you've checked the obvious stuff, look at your provider's actual output cap before going down the OpenClaw-bug rabbit hole. The error message is useless and the config field doesn't tell you it's being overridden server side.

📖 Read the full source: r/openclaw

Fix Ollama Cloud Model maxTokens: Cap is 16K, Not Config Value

What Helped

Takeaway

👀 See Also

Model Routing Cut API Costs by 85% vs Claude Max Subscription – A Developer's Analysis

Day 1 Configuration: Prevent 90% of Common OpenClaw Problems

Practical Strategies to Avoid Claude Rate Limits on $200 Max Plan

Using project narratives to manage memory in large OpenClaw projects