OpenClaw LLM Timeout Fix for Cold Model Loading

Issue: Cold Model Timeouts at 60 Seconds
Users reported that cold-loaded local models in OpenClaw consistently failed after about 60 seconds, even with the general agent timeout set much higher. The same failure also occurred with cloud models via Ollama and, occasionally, with OpenAI Codex.
The typical failure pattern:
- Models work if already warm
- Cold models die at around 60 seconds
- Logs mention timeout / embedded failover / status: 408
- Fallback model takes over
Misleading Configurations
The source warns that several obvious configuration options are NOT the real fix and can send developers down the wrong path:
- agents.defaults.timeoutSeconds
- .zshrc exports
- LLM_REQUEST_TIMEOUT
- Blaming LM Studio / Ollama immediately
Root Cause
The issue stems from OpenClaw having a separate embedded-runner LLM idle timeout for the period before the model emits the first streamed token.
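To make the behavior concrete, here is a minimal sketch of a first-token idle timeout wrapped around a streamed response. It is not OpenClaw's implementation: `withFirstTokenTimeout` and `IdleTimeoutError` are hypothetical names, and the sketch assumes the timer applies only until the first token arrives, as described above.

```typescript
// Illustrative only: these names are hypothetical, not taken from OpenClaw.
class IdleTimeoutError extends Error {
  constructor(ms: number) {
    super(`no token received within ${ms} ms`);
    this.name = "IdleTimeoutError";
  }
}

// Passes tokens through from `stream`, but throws IdleTimeoutError if the
// FIRST token does not arrive within `idleMs`. Later tokens are not timed.
async function* withFirstTokenTimeout<T>(
  stream: AsyncIterable<T>,
  idleMs: number,
): AsyncGenerator<T> {
  const it = stream[Symbol.asyncIterator]();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new IdleTimeoutError(idleMs)), idleMs);
  });

  let first: IteratorResult<T>;
  try {
    // Whichever settles first wins: the first token or the idle timer.
    first = await Promise.race([it.next(), timeout]);
  } finally {
    clearTimeout(timer);
  }
  if (first.done) return;
  yield first.value;

  // Once the model has started streaming, forward the rest untouched.
  while (true) {
    const next = await it.next();
    if (next.done) return;
    yield next.value;
  }
}
```

A cold model that needs 90 seconds to load emits nothing during that window, so with a 60-second idle limit a wrapper like this gives up before the first token, regardless of how generous the overall request timeout is. The sketch does not cancel the underlying request on timeout; a real runner would presumably do so.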
Source trace found in:
src/agents/pi-embedded-runner/run/llm-idle-timeout.ts
Default value:
DEFAULT_LLM_IDLE_TIMEOUT_MS = 60_000
The configuration path resolves from:
cfg?.agents?.defaults?.llm?.idleTimeoutSeconds
So the actual configuration parameter is:
agents.defaults.llm.idleTimeoutSeconds
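The resolution described above can be sketched roughly as follows. Only the constant and the config path come from the post; the `Config` interface, the `resolveLlmIdleTimeoutMs` helper, and the positive-number check are assumptions made for illustration, not the actual contents of llm-idle-timeout.ts.

```typescript
// Default quoted in the post.
const DEFAULT_LLM_IDLE_TIMEOUT_MS = 60_000;

// Hypothetical config shape covering only the fields named in the post.
interface Config {
  agents?: {
    defaults?: {
      timeoutSeconds?: number;
      llm?: { idleTimeoutSeconds?: number };
    };
  };
}

// Hypothetical helper: reads the documented path and falls back to the default.
// Treating non-positive or non-numeric values as "unset" is an assumption.
function resolveLlmIdleTimeoutMs(cfg?: Config): number {
  const seconds = cfg?.agents?.defaults?.llm?.idleTimeoutSeconds;
  if (typeof seconds === "number" && Number.isFinite(seconds) && seconds > 0) {
    return seconds * 1000;
  }
  return DEFAULT_LLM_IDLE_TIMEOUT_MS;
}

// Raising only the general agent timeout leaves the idle timeout at its default.
const onlyGeneral = resolveLlmIdleTimeoutMs({ agents: { defaults: { timeoutSeconds: 600 } } });
// Setting the llm-specific key is what changes it.
const withIdle = resolveLlmIdleTimeoutMs({ agents: { defaults: { llm: { idleTimeoutSeconds: 180 } } } });
```

Under these assumptions `onlyGeneral` stays at 60000 ms while `withIdle` becomes 180000 ms, which matches the observation that increasing agents.defaults.timeoutSeconds alone did not stop the failures at 60 seconds.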
The Fix
After testing, the working configuration is:
{
  "agents": {
    "defaults": {
      "llm": {
        "idleTimeoutSeconds": 180
      }
    }
  }
}
Testing showed that a cold Gemma call that previously failed around 60 seconds survived past that threshold and eventually responded successfully without immediate failover.
Recommended Permanent Configuration
{
  "agents": {
    "defaults": {
      "timeoutSeconds": 300,
      "llm": {
        "idleTimeoutSeconds": 300
      }
    }
  }
}
The 300-second recommendation reflects how unpredictable local model load times are: a false failover is more disruptive than waiting longer for a genuinely cold model.
📖 Read the full source: r/openclaw