Integrating Local LLM Agents with ComfyUI for Natural Language Batch Image Generation

A developer on r/LocalLLaMA shared their integration between a local OpenClaw agent and ComfyUI that enables natural language batch image generation. The setup allows users to describe image requests in plain English, with the agent handling the entire ComfyUI pipeline without manual UI interaction.
How the Integration Works
The flow follows this sequence:
- Agent receives image request
- Parses intent into structured inputs (prompt, dimensions, steps, seed)
- Calls comfyui skill as a tool
- Skill builds a ComfyUI workflow JSON from inputs
- POSTs to local ComfyUI HTTP API (/prompt)
- Polls /history every 2 seconds until render completes
- Retrieves the rendered image from /view
- Returns result to agent
- Agent confirms with user
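The submit, poll, and fetch steps above can be sketched roughly as follows. The original skill is a Node.js script whose code was not shared, so this Python version is only an illustration. It assumes ComfyUI is on its default local address, that POST /prompt returns a `prompt_id`, and that the finished entry appears under /history/<prompt_id>. The function names (`http_json`, `queue_prompt`, `wait_for_outputs`, `view_urls`) are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:8188"  # assumed default local ComfyUI address


def http_json(path, payload=None):
    """GET (or POST when payload is given) a JSON document from ComfyUI."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    req = urllib.request.Request(BASE_URL + path, data=data, headers=headers)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())


def queue_prompt(workflow, request=http_json):
    """POST the workflow to /prompt and return the prompt_id ComfyUI assigns."""
    return request("/prompt", {"prompt": workflow})["prompt_id"]


def wait_for_outputs(prompt_id, request=http_json, interval=2.0,
                     timeout=600.0, sleep=time.sleep):
    """Poll /history/<prompt_id> until the entry appears, then return its outputs."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        history = request("/history/" + prompt_id)
        if prompt_id in history:  # empty dict means the render is still running
            return history[prompt_id]["outputs"]
        sleep(interval)
    raise TimeoutError(f"render {prompt_id} did not finish within {timeout}s")


def view_urls(outputs):
    """Build a /view URL for every image listed in a history entry's outputs."""
    urls = []
    for node_output in outputs.values():
        for image in node_output.get("images", []):
            query = urllib.parse.urlencode({
                "filename": image["filename"],
                "subfolder": image.get("subfolder", ""),
                "type": image.get("type", "output"),
            })
            urls.append(f"{BASE_URL}/view?{query}")
    return urls
```

The `request` and `sleep` parameters are injectable so the polling loop can be exercised without a running ComfyUI instance. The 2-second default interval matches the one described above, while the 10-minute timeout is an arbitrary choice.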
Technical Implementation Details
The integration uses ComfyUI's node-ID-based JSON workflow format. The skill maps agent inputs onto specific node IDs in a base workflow template (KSampler, CLIPTextEncode, etc.). The author calls this "the most fragile part of the integration since it depends on your workflow's node structure, but for standard setups it works reliably."
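A minimal sketch of that mapping, in Python for illustration (the original skill is a Node.js script). The node IDs below are hypothetical and must match the IDs in your own exported workflow. `ImageRequest` and `build_workflow` are illustrative names, not the author's. Checking each node's `class_type` before writing is one way to fail loudly when the template's structure differs from what the map expects.

```python
import copy
from dataclasses import dataclass

# Hypothetical node-ID map: adjust to the IDs in your own workflow export.
NODE_IDS = {"sampler": "3", "latent": "5", "positive": "6"}
EXPECTED_CLASS = {
    "sampler": "KSampler",
    "latent": "EmptyLatentImage",
    "positive": "CLIPTextEncode",
}


@dataclass
class ImageRequest:
    """Structured inputs the agent extracts from a plain-English request."""
    prompt: str
    width: int = 512
    height: int = 512
    steps: int = 20
    seed: int = 0


def build_workflow(template, req):
    """Return a copy of the template with the request's values written in."""
    wf = copy.deepcopy(template)  # never mutate the shared base template
    for role, node_id in NODE_IDS.items():
        node = wf.get(node_id)
        if node is None or node.get("class_type") != EXPECTED_CLASS[role]:
            raise ValueError(
                f"node {node_id} is not a {EXPECTED_CLASS[role]}; "
                "update NODE_IDS to match your workflow"
            )
    wf[NODE_IDS["positive"]]["inputs"]["text"] = req.prompt
    wf[NODE_IDS["latent"]]["inputs"].update(width=req.width, height=req.height)
    wf[NODE_IDS["sampler"]]["inputs"].update(steps=req.steps, seed=req.seed)
    return wf
```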
On startup, the skill pings /object_info to verify that ComfyUI is actually ready (not just reachable) before accepting jobs. This prevents jobs from being queued but never run while checkpoints are still loading.
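One way such a readiness check might look, sketched in Python for illustration. It assumes /object_info returns a JSON object keyed by node class name, and it treats the server as ready only when the classes the template needs are listed. `REQUIRED_NODES`, `fetch_object_info`, and `comfyui_ready` are hypothetical names, and the exact readiness criterion the author uses was not shared.

```python
import json
import urllib.request

# Assumed: the node classes the base workflow template depends on.
REQUIRED_NODES = ("KSampler", "CLIPTextEncode")


def fetch_object_info(base_url="http://127.0.0.1:8188"):
    """GET /object_info and return the parsed JSON body."""
    with urllib.request.urlopen(base_url + "/object_info", timeout=10) as resp:
        return json.loads(resp.read())


def comfyui_ready(fetch=fetch_object_info):
    """Return True only if /object_info answers and lists the required nodes."""
    try:
        info = fetch()
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts (URLError subclasses
        # it); ValueError covers a body that is not valid JSON.
        return False
    return isinstance(info, dict) and all(name in info for name in REQUIRED_NODES)
```

The skill would call `comfyui_ready()` before accepting a job and report a clear "not ready yet" message to the agent when it returns False.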
Error Handling Improvements
Every API call is wrapped to return agent-readable errors instead of raw HTTP failures. For example, "Connection refused at 127.0.0.1:8188" becomes "ComfyUI doesn't seem to be running. Start it with --listen and try again." This makes debugging easier, especially when working remotely.
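A rough sketch of this kind of wrapper, in Python for illustration. Only the "doesn't seem to be running" message comes from the post. The other messages, the `{"ok": ...}` result shape, and the names `agent_readable` and `call_safely` are assumptions made for the example.

```python
import socket
import urllib.error


def agent_readable(exc, host="127.0.0.1", port=8188):
    """Translate a low-level failure into a message the agent can relay."""
    # HTTPError subclasses URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"ComfyUI rejected the request (HTTP {exc.code}). Check the workflow JSON."
    # URLError wraps the underlying socket error in its .reason attribute.
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(reason, ConnectionRefusedError):
        return "ComfyUI doesn't seem to be running. Start it with --listen and try again."
    if isinstance(reason, (TimeoutError, socket.timeout)):
        return f"ComfyUI at {host}:{port} did not respond in time. It may still be loading."
    return f"Unexpected error talking to ComfyUI: {reason}"


def call_safely(fn, *args, **kwargs):
    """Run an API call and return a result dict instead of raising."""
    try:
        return {"ok": True, "result": fn(*args, **kwargs)}
    except Exception as exc:  # broad on purpose: the agent needs text, not a traceback
        return {"ok": False, "error": agent_readable(exc)}
```

Returning a dict rather than raising keeps the failure inside the tool's normal response, so the agent can pass the message straight to the user.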
Current Limitations
The integration doesn't yet support:
- Advanced multi-node workflows (ControlNet, LoRA stacking)
- Real-time progress streaming via WebSocket
- Platforms other than Windows (untested elsewhere)
The entire stack runs locally using OpenClaw (self-hosted agent framework) + ComfyUI + a Node.js skill script, with no cloud components.
📖 Read the full source: r/LocalLLaMA