ClawCut: A Python Proxy That Makes Small Local LLMs Usable with OpenClaw

✍️ OpenClawRadar📅 Published: March 14, 2026🔗 Source
ClawCut: A Python Proxy That Makes Small Local LLMs Usable with OpenClaw
Ad

What ClawCut Does

ClawCut is a Python Flask application that acts as a proxy between local LLM servers (like MLX or Ollama) and the OpenClaw framework. It was created to solve specific technical problems that make small local models (7B/14B) difficult to use as practical assistants with OpenClaw.

Key Problems Solved

  • Context poisoning: Small models lose track of tool usage when they see their own old tool calls in chat history
  • Infinite loops: Models get stuck repeating patterns instead of executing commands
  • Output issues: Models output bash code as plain text in chat or choke on their own history after multiple commands
  • Cron job failures: Scheduled background jobs generate responses that disappear because no active chat window is open
  • LLM artifacts: Empty markdown blocks, internal XML tags, and dangling backticks clutter outputs
  • Media upload refusal: Models sometimes refuse to upload generated files
Ad

How It Works

Dynamic amnesia for tool calls: During normal chat, history is preserved. When the proxy detects the model trying to use a system tool, it temporarily cuts off old chat history, giving the model "tunnel vision" to execute shell commands cleanly without loops or hallucinations.

Universal auto-delivery for cron jobs: The proxy monitors the model's stream and intercepts clean text responses at the end of thought processes. It then forces delivery via automatic tool calls to WhatsApp, Telegram, or Signal, making cron jobs proactively report to your phone.

Artifact filtering: Empty markdown blocks, internal XML tags, and dangling backticks are filtered out before reaching the frontend.

Tool-name manipulation: Simple stream manipulations bypass models' refusal to upload generated media files.

Tested Setup

  • Raspberry Pi 5 (8GB) with OpenClaw 3.8
  • Mac mini M4 Pro 24GB with MLX-LLM running Qwen2.5-Coder-7B-Instruct-4bit
  • Windows machine with Ollama and Qwen 2.5 Coder 14B model (planned for ClawCut integration)

Limitations

ClawCut doesn't turn 7B models into GPT-4. Highly complex, multi-step logic chains remain challenging for small models. The proxy specifically addresses technical stumbling blocks that previously made them nearly unusable as everyday assistants.

📖 Read the full source: r/openclaw

Ad

👀 See Also