Multi-provider LLM fallback chain with Ollama support in production AI IDE

Resonant Genesis, a production AI IDE platform, has integrated local LLM support as a first-class provider in its architecture. The platform runs across 30+ microservices and treats local models as equal to cloud providers like Groq, OpenAI, Anthropic, and Gemini.
Architecture and integration
The platform uses a shared library, rg_llm, whose UnifiedLLMClient is volume-mounted into every service. Each microservice that needs LLM capabilities imports this same client. The default fallback chain is: Groq → OpenAI → Anthropic → Gemini → Ollama/LM Studio.
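The ordered fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual UnifiedLLMClient code: the `Provider` type, `complete_with_fallback`, and the stand-in callables are all hypothetical names introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


@dataclass
class Provider:
    name: str
    complete: Callable[[str], str]  # hypothetical interface: prompt in, completion out


def complete_with_fallback(prompt: str, chain: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order and return (provider_name, completion)."""
    errors: list[str] = []
    for provider in chain:
        try:
            return provider.name, provider.complete(prompt)
        except Exception as exc:  # a real client would likely catch narrower errors
            errors.append(f"{provider.name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def _unavailable(prompt: str) -> str:
    # Stand-in for a provider that is down or rate-limited.
    raise ConnectionError("provider unavailable")


def _echo(prompt: str) -> str:
    # Stand-in for a provider that answers successfully.
    return f"echo: {prompt}"


# Order mirrors the chain described in the text; the callables are placeholders.
DEFAULT_CHAIN = [
    Provider("groq", _unavailable),
    Provider("openai", _unavailable),
    Provider("anthropic", _unavailable),
    Provider("gemini", _unavailable),
    Provider("ollama", _echo),
]
```

Calling `complete_with_fallback("hi", DEFAULT_CHAIN)` walks the list until one provider succeeds, so the caller never needs to know which backend answered unless it inspects the returned name.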
The IDE's thin client extension automatically discovers local Ollama models and adds them to the provider list. Users can configure the system to prefer local models first if desired.
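One plausible way to do this kind of discovery is to query Ollama's local HTTP API, which lists installed models at `/api/tags` on its default port 11434. The sketch below assumes that endpoint; the function names and the `prefer_local` flag are hypothetical and are not taken from the extension itself.

```python
from __future__ import annotations

import json
import urllib.error
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's default local address


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from an /api/tags style response body."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def discover_ollama_models(url: str = OLLAMA_TAGS_URL, timeout: float = 2.0) -> list[str]:
    """Return locally installed model names, or [] if Ollama is not reachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Not running, timed out, or returned something that is not JSON.
        return []


def order_providers(cloud: list[str], local: list[str], prefer_local: bool = False) -> list[str]:
    """Place local models first when the user prefers them, last otherwise."""
    return local + cloud if prefer_local else cloud + local
```

Returning an empty list on failure keeps discovery optional: a machine without Ollama simply ends up with a cloud-only provider list.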
Server-side orchestration
All orchestration lives server-side, with the IDE acting as a thin client that renders UI, executes local tools (file operations, terminal, git), and streams results via Server-Sent Events (SSE). The agentic loop, tool selection, system prompts, and LLM routing all happen on the server.
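To make the streaming step concrete, here is a small sketch of the SSE wire format: each event is a block of `event:` and `data:` lines terminated by a blank line. The event name `tool_call` and the JSON payload shape are assumptions for illustration; the platform's real event schema is not described in the source.

```python
from __future__ import annotations

import json
from typing import Iterable, Iterator


def format_sse(event: str, data: dict) -> str:
    """Serialize one event in SSE wire format (fields, then a blank line)."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Parse SSE lines into (event_name, json_payload) pairs."""
    event, data_lines = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Drop the single optional space that follows the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
```

A client could feed the response body line by line into `parse_sse` and dispatch on the event name, for example running a local tool when a tool-call event arrives.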
Requests served by a local model still go through the same governed execution pipeline:
- Pre-execution policy enforcement (blocks actions before they run)
- Native function calling via provider APIs (no JSON prompt injection)
- Cryptographic identity (DSID on Ethereum L2) for every agent
- Same 59 local tools available regardless of which LLM provider you choose
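Pre-execution policy enforcement, the first item in the list above, can be pictured as a gate that every tool call passes before anything runs. The sketch below is only an illustration under assumed names: `ToolCall`, `PolicyViolation`, `execute_governed`, the `run_terminal` tool, and the example rule are all hypothetical and not taken from the platform.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable, Mapping, Optional, Sequence


class PolicyViolation(Exception):
    """Raised when a policy blocks a tool call before it runs."""


@dataclass
class ToolCall:
    name: str
    arguments: dict = field(default_factory=dict)


# A policy returns a reason string to block the call, or None to allow it.
Policy = Callable[[ToolCall], Optional[str]]


def deny_destructive_shell(call: ToolCall) -> Optional[str]:
    """Example rule (hypothetical): refuse recursive force-deletes."""
    if call.name == "run_terminal" and "rm -rf" in call.arguments.get("command", ""):
        return "destructive shell command"
    return None


def execute_governed(
    call: ToolCall,
    tools: Mapping[str, Callable[..., Any]],
    policies: Sequence[Policy],
) -> Any:
    """Check every policy first; the tool only runs if none of them object."""
    for policy in policies:
        reason = policy(call)
        if reason is not None:
            raise PolicyViolation(f"{call.name} blocked: {reason}")
    return tools[call.name](**call.arguments)
```

Because the check sits in front of the tool dispatch rather than inside any provider adapter, the same rules apply whether the call was proposed by a cloud model or a local one.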
Benefits for local LLM users
For users running Ollama locally, this architecture provides:
- Privacy: Thin client architecture means no company intelligence in the binary, and with local models, prompts stay local
- Tool use: 59 local tools with native function calling, not prompt-injected JSON schemas
- Fallback: If a local model can't handle a complex task, the request automatically falls back to cloud providers
The developers are seeking feedback from people running local models, particularly around function calling performance with smaller models and which models work well for agentic tool use.
The project is open source on GitHub, and a guest chat demonstrating the tool ecosystem is live at dev-swat.com (it uses cloud models).
📖 Read the full source: r/LocalLLaMA