Two $0 OpenClaw setups using free cloud models or local Ollama

An OpenClaw user reports running an agent for free for three weeks, handling about 70% of tasks previously paid for with Claude. The setup offers two paths: free cloud models with rate limits or local models via Ollama with zero ongoing costs.
Path 1: Free cloud models (no hardware needed)
This approach requires only an existing OpenClaw installation and free API tiers:
- OpenRouter free tier: Sign up at openrouter.ai with no credit card. Offers 30+ free models including Llama 3.3 70B, Nemotron Ultra 253B (262K context), MiniMax M2.5, and Devstral. Configuration example:
{
  "env": { "OPENROUTER_API_KEY": "sk-or-..." },
  "agents": {
    "defaults": {
      "model": {
        "primary": "openrouter/nvidia/nemotron-ultra-253b:free"
      }
    }
  }
}
For automatic model selection: "primary": "openrouter/openrouter/free"
- Gemini free tier: Google provides 15 requests per minute on Gemini Flash for free. Get an API key from ai.google.dev and run openclaw onboard, selecting Google as the built-in provider.
- Groq: Fast, with a rate-limited free tier suitable for basic agent tasks.
The catch: rate limits. For light to moderate daily use (10-20 interactions), pauses are barely noticeable. For 100+ tasks daily, this won't work.
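One way to stay under a per-minute cap, such as the 15 requests per minute quoted above for Gemini Flash, is to space calls evenly on the client side. The following is a minimal sketch in Python, not something from the original post: `MinIntervalThrottle` is a hypothetical helper, not part of OpenClaw or any provider SDK, and the clock and sleep functions are injectable only so the logic can be checked without actually waiting.

```python
import time


class MinIntervalThrottle:
    """Space calls at least 60 / per_minute seconds apart.

    Hypothetical helper for illustration; not part of OpenClaw.
    """

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute  # e.g. 15 per minute -> 4.0 s apart
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        waited = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            waited = self._next_allowed - now
            self._sleep(waited)
            now = self._next_allowed
        # Schedule the earliest time the following call may start.
        self._next_allowed = now + self.interval
        return waited
```

Calling `throttle.wait()` before each request would add a pause of up to four seconds at 15 requests per minute, which is consistent with the observation that light daily use barely notices the limit.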
Path 2: Local models via Ollama (truly $0, forever)
Ollama became an official OpenClaw provider in March 2026. This setup has no API keys, accounts, rate limits, or data leaving your machine.
Setup steps:
- Install Ollama:
  curl -fsSL https://ollama.com/install.sh | sh
- Pull a model based on your VRAM:
  - 20GB+ VRAM (RTX 3090, 4090, M4 Pro/Max):
    ollama pull qwen3.5:27b
  - 16GB VRAM:
    ollama pull qwen3.5:35b-a3b
  - 8GB VRAM (most laptops):
    ollama pull qwen3.5:9b
- Run openclaw onboard and select Ollama, or use manual setup with export OLLAMA_API_KEY="ollama-local"
Qwen3.5 27B is noted as the current sweet spot for OpenClaw, handling tool calling well for daily agent tasks. The 35b-a3b mixture-of-experts variant runs at 112 tokens/second on an RTX 3090 by activating only 3B parameters at a time.
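The VRAM tiers from the setup steps can be written down as a small lookup. This is only a sketch: `pick_qwen_tag` is a hypothetical name, and mapping in-between sizes (for example 12GB) to the next tier down is an assumption, since the post lists only the three tiers.

```python
def pick_qwen_tag(vram_gb):
    """Return the Ollama model tag suggested for a given amount of VRAM.

    Thresholds mirror the three tiers in the setup steps. Sizes between
    tiers fall back to the smaller model (an assumption of this sketch).
    """
    if vram_gb >= 20:
        return "qwen3.5:27b"
    if vram_gb >= 16:
        return "qwen3.5:35b-a3b"
    if vram_gb >= 8:
        return "qwen3.5:9b"
    raise ValueError("no model is listed for less than 8 GB of VRAM")


if __name__ == "__main__":
    # Print the pull command for a 16 GB card.
    print("ollama pull " + pick_qwen_tag(16))
```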
Manual configuration example:
{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://localhost:11434",
        "apiKey": "ollama-local",
        "api": "ollama",
        "models": [
          {
            "id": "qwen3.5:27b",
            "name": "Qwen3.5 27B",
            "reasoning": false,
            "contextWindow": 131072,
            "maxTokens": 8192
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/qwen3.5:27b"
      }
    }
  }
}
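In the manual configuration, the "primary" value appears to combine the provider key and the model id ("ollama" plus "qwen3.5:27b"). Assuming that reading is right (it is inferred from the examples, not stated in the post), a quick Python check can catch a mismatch between the two places the model is named. `primary_is_defined` is a hypothetical helper, not an OpenClaw function, and it only applies to providers declared manually under "models".

```python
def primary_is_defined(config):
    """True if agents.defaults.model.primary names a declared provider model.

    Assumes the reference has the form "<provider>/<model id>", as in the
    examples in this guide; that format is inferred, not documented here.
    """
    primary = config["agents"]["defaults"]["model"]["primary"]
    # Split on the first slash only, since model ids may contain slashes.
    provider, _, model_id = primary.partition("/")
    providers = config.get("models", {}).get("providers", {})
    models = providers.get(provider, {}).get("models", [])
    return any(m.get("id") == model_id for m in models)
```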
Important debugging notes:
- Use the native Ollama API URL (http://localhost:11434), NOT the OpenAI-compatible one (http://localhost:11434/v1). The /v1 path breaks tool calling, so tool calls come back as raw JSON in plain text.
- Set "reasoning": false in the model configuration.
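Both debugging notes can be checked mechanically before starting the agent. The sketch below is in Python and assumes the config has the same shape as the manual configuration example; `check_ollama_config` is a hypothetical helper, not an OpenClaw command.

```python
import json


def check_ollama_config(config):
    """Return a list of warnings for an OpenClaw-style config dict."""
    warnings = []
    ollama = config.get("models", {}).get("providers", {}).get("ollama", {})

    # Note 1: the native URL has no /v1 suffix.
    base_url = ollama.get("baseUrl", "").rstrip("/")
    if base_url.endswith("/v1"):
        warnings.append("baseUrl uses the OpenAI-compatible /v1 path; use the native URL")

    # Note 2: every declared model should have "reasoning": false.
    for model in ollama.get("models", []):
        if model.get("reasoning") is not False:
            warnings.append(f'{model.get("id", "?")}: set "reasoning": false')
    return warnings


# A deliberately broken config that trips both checks.
example = json.loads("""
{
  "models": {"providers": {"ollama": {
    "baseUrl": "http://localhost:11434/v1",
    "models": [{"id": "qwen3.5:27b", "reasoning": true}]
  }}}
}
""")

if __name__ == "__main__":
    for line in check_ollama_config(example):
        print(line)
```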
📖 Read the full source: r/clawdbot