OpenClaw LLM Timeout Fix: Stop Cold Model Loading Failures

Issue: Cold Model Timeouts at 60 Seconds

Users reported that cold-loaded local models in OpenClaw consistently failed after approximately 60 seconds, even with the general agent timeout set much higher. The same failure also occurred with cloud models via Ollama and, occasionally, with OpenAI Codex.

The typical failure pattern:

Models work if already warm
Cold models die at around 60 seconds
Logs mention timeout / embedded failover / status: 408
Fallback model takes over

Misleading Configurations

The source warns that several obvious configuration options are NOT the real fix and can send developers down the wrong path:

agents.defaults.timeoutSeconds
.zshrc exports
LLM_REQUEST_TIMEOUT
Blaming LM Studio / Ollama immediately

Root Cause

The issue stems from OpenClaw having a separate embedded-runner LLM idle timeout for the period before the model emits the first streamed token.
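To see why raising the general agent timeout does not help, here is a minimal sketch that models the two limits as independent timers. It is not OpenClaw code: `classifyCall`, its parameters, and the `Outcome` type are hypothetical names, and the sketch assumes the idle timer only covers the wait for the first token while the general timeout covers the whole call.

```typescript
type Outcome = "ok" | "idle-timeout" | "overall-timeout";

// Hypothetical model, not OpenClaw's implementation: decide which limit,
// if any, fires first for a single LLM call.
function classifyCall(
  firstTokenAtMs: number,   // time until the model streams its first token
  finishedAtMs: number,     // time until the full response completes
  idleTimeoutMs: number,    // stands in for agents.defaults.llm.idleTimeoutSeconds
  overallTimeoutMs: number, // stands in for agents.defaults.timeoutSeconds
): Outcome {
  // A timer only fires if the event it waits for arrives too late.
  const idleFiresAt = firstTokenAtMs > idleTimeoutMs ? idleTimeoutMs : Infinity;
  const overallFiresAt = finishedAtMs > overallTimeoutMs ? overallTimeoutMs : Infinity;

  if (idleFiresAt === Infinity && overallFiresAt === Infinity) return "ok";
  return idleFiresAt <= overallFiresAt ? "idle-timeout" : "overall-timeout";
}

// A cold model that needs 90 s to emit its first token:
const coldDefault = classifyCall(90_000, 120_000, 60_000, 300_000);  // "idle-timeout"
const coldBigAgent = classifyCall(90_000, 120_000, 60_000, 600_000); // still "idle-timeout"
const coldRaisedIdle = classifyCall(90_000, 120_000, 180_000, 300_000); // "ok"
```

Under these assumptions, only the idle limit changes the outcome for a slow first token, which matches the reported symptom of failures near 60 seconds regardless of the general timeout.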

Source trace found in:

src/agents/pi-embedded-runner/run/llm-idle-timeout.ts

Default value:

DEFAULT_LLM_IDLE_TIMEOUT_MS = 60_000

The configuration path resolves from:

cfg?.agents?.defaults?.llm?.idleTimeoutSeconds

So the actual configuration parameter is:

agents.defaults.llm.idleTimeoutSeconds
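As a rough illustration of that lookup, the sketch below resolves the idle timeout from a config object and falls back to the 60-second default. Only `DEFAULT_LLM_IDLE_TIMEOUT_MS` and the `cfg?.agents?.defaults?.llm?.idleTimeoutSeconds` path come from the source trace; the `ConfigSketch` interface, the `resolveLlmIdleTimeoutMs` name, and the validation of the value are assumptions, and the real code in llm-idle-timeout.ts may differ.

```typescript
// Default from the source trace: 60 seconds.
const DEFAULT_LLM_IDLE_TIMEOUT_MS = 60_000;

// Hypothetical config shape; only the fields discussed here are modeled.
interface ConfigSketch {
  agents?: {
    defaults?: {
      timeoutSeconds?: number;
      llm?: { idleTimeoutSeconds?: number };
    };
  };
}

// Hypothetical helper: read the configured value in seconds and convert
// it to milliseconds, or fall back to the default when it is unset or invalid.
function resolveLlmIdleTimeoutMs(cfg?: ConfigSketch): number {
  const seconds = cfg?.agents?.defaults?.llm?.idleTimeoutSeconds;
  if (typeof seconds === "number" && isFinite(seconds) && seconds > 0) {
    return seconds * 1000;
  }
  return DEFAULT_LLM_IDLE_TIMEOUT_MS;
}

// Setting only the general agent timeout leaves the idle timeout at its default.
const onlyAgentTimeout = resolveLlmIdleTimeoutMs({
  agents: { defaults: { timeoutSeconds: 600 } },
});

// Setting the nested llm.idleTimeoutSeconds key is what changes it.
const withIdleTimeout = resolveLlmIdleTimeoutMs({
  agents: { defaults: { llm: { idleTimeoutSeconds: 180 } } },
});
```

In this sketch `onlyAgentTimeout` stays at 60000 while `withIdleTimeout` becomes 180000, which is why `agents.defaults.timeoutSeconds` alone never moves the 60-second cutoff.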

The Fix

After testing, the working configuration is:

{
  "agents": {
    "defaults": {
      "llm": {
        "idleTimeoutSeconds": 180
      }
    }
  }
}

Testing showed that a cold Gemma call that previously failed around 60 seconds survived past that threshold and eventually responded successfully without immediate failover.

Recommended Permanent Configuration

{
  "agents": {
    "defaults": {
      "timeoutSeconds": 300,
      "llm": {
        "idleTimeoutSeconds": 300
      }
    }
  }
}

The 300-second recommendation reflects how unpredictable local model load times are: a false failover is more disruptive than waiting longer for a genuinely cold model.

📖 Read the full source: r/openclaw