OmniRecall Beta: FAISS-Powered Memory Injection for Cloud LLM Chats

✍️ OpenClawRadar · 📅 Published: March 16, 2026

What OmniRecall Does

OmniRecall is a local mitmproxy script that intercepts traffic to cloud chat interfaces (tested on DeepSeek). It parses the proprietary SSE fragment stream and forces a long-term memory layer onto a system that was designed to be stateless.
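To make the stream-parsing idea concrete, here is a minimal sketch of rebuilding a reply from SSE fragments. The real DeepSeek fragment format is not described in the post, so the payload shape is an assumption: each `data:` line is taken to carry JSON with a text fragment under a hypothetical `"v"` key, and `[DONE]` is assumed to mark the end of the stream. The function name `reconstruct_reply` is also illustrative.

```python
import json


def reconstruct_reply(sse_text: str) -> str:
    """Join the text fragments carried by an SSE stream into one reply."""
    parts = []
    for line in sse_text.splitlines():
        # SSE payload lines start with "data:"; skip event names and blanks.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore keep-alives and other non-JSON payloads
        # "v" is a hypothetical key for the text fragment.
        fragment = event.get("v") if isinstance(event, dict) else None
        if isinstance(fragment, str):
            parts.append(fragment)
    return "".join(parts)
```

A real stream that sends patches (edits to earlier text) rather than plain appended fragments would need more bookkeeping than this join.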

Technical Mechanism

  • Deep-Packet Parsing: Reconstructs the full assistant reply by tracking real-time patches
  • Command Control: Detects [ADD], [UPDATE], [REMOVE], [CLEAR] from the AI's output
  • Local Brain: Maintains memory.txt + FAISS index (sentence-transformers all-MiniLM-L6-v2)
  • Context Injection: Top recalled facts get force-fed into your next message as [RECALL: ...]
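The command and injection steps above can be sketched in a few lines. The post only names the tags, so the argument syntax here is an assumption: `[ADD: fact]`, `[UPDATE: old -> new]`, `[REMOVE: fact]`, and a bare `[CLEAR]`. The helper names `apply_commands` and `with_recall`, the in-memory list standing in for memory.txt, and the `; ` separator inside `[RECALL: ...]` are likewise hypothetical.

```python
import re

# Assumed syntax: [ADD: fact], [UPDATE: old -> new], [REMOVE: fact], [CLEAR].
COMMAND_RE = re.compile(r"\[(ADD|UPDATE|REMOVE|CLEAR)(?::\s*([^\]]*))?\]")


def apply_commands(reply: str, memory: list[str]) -> list[str]:
    """Return a new memory list after applying every command found in reply."""
    facts = list(memory)
    for name, arg in COMMAND_RE.findall(reply):
        arg = arg.strip()
        if name == "CLEAR":
            facts.clear()
        elif name == "ADD" and arg and arg not in facts:
            facts.append(arg)
        elif name == "REMOVE" and arg in facts:
            facts.remove(arg)
        elif name == "UPDATE" and "->" in arg:
            old, new = (part.strip() for part in arg.split("->", 1))
            if old in facts and new:
                facts[facts.index(old)] = new
    return facts


def with_recall(message: str, recalled: list[str]) -> str:
    """Prefix the user's message with the recalled facts, if any."""
    if not recalled:
        return message
    return f"[RECALL: {'; '.join(recalled)}]\n{message}"
```

Choosing which facts count as "top recalled" is the job of the FAISS index and is not shown here; `with_recall` simply formats whatever list it is given.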

Current Status & Limitations

This is a beta/experimental release. The developer notes: "This is the closest I've gotten to the dream after weeks of debugging hell. It is buggy. It is experimental. [ADD] is mostly stable, but [SEARCH] is temperamental—if you want perfection, fix it yourself. I've hit my energy limit on this build."

Upstream UI changes will break it. The developer states: "If it breaks, that's on you now."

Requirements & Setup

Potato-PC Requirements:

  • CPU only (faiss-cpu + all-MiniLM-L6-v2)
  • No local LLM needed — augments the cloud models you already use
  • Zero cost, zero API keys, 100% local data isolation

How to Deploy:

pip install mitmproxy faiss-cpu sentence-transformers numpy

Trust the mitmproxy CA certificate in your OS or browser (run mitmproxy once to generate it), then set your system proxy to 127.0.0.1:8080, mitmproxy's default listen address. Then run:

mitmdump -s omnirecall.py

Go to chat.deepseek.com and start feeding it memories.
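For readers who have not written a script for `mitmdump -s` before, the following is a rough sketch of the shape such a script can take. It is not the actual omnirecall.py. mitmproxy calls a `request` hook on each addon object listed in a module-level `addons` list; everything else here is an assumption for illustration, including the class name `RecallInjector`, the JSON request body, and the hypothetical `"prompt"` field holding the user's message.

```python
import json


class RecallInjector:
    """Minimal mitmproxy addon: prepend recalled facts to outgoing messages."""

    def __init__(self, host="chat.deepseek.com", field="prompt"):
        self.host = host      # site named in the article
        self.field = field    # hypothetical JSON key for the user's message
        self.recalled = []    # facts chosen by the retrieval step (not shown)

    def request(self, flow):
        # mitmproxy calls this hook for each client request it intercepts.
        req = flow.request
        if req.pretty_host != self.host or req.method != "POST":
            return
        if not self.recalled:
            return
        try:
            body = json.loads(req.text or "")
        except ValueError:
            return  # not JSON text, leave the request untouched
        if isinstance(body, dict) and isinstance(body.get(self.field), str):
            prefix = "[RECALL: " + "; ".join(self.recalled) + "]\n"
            body[self.field] = prefix + body[self.field]
            req.text = json.dumps(body)


# mitmproxy reads this module-level list when the script is loaded with -s.
addons = [RecallInjector()]
```

The class deliberately avoids importing mitmproxy so the hook logic can be exercised with stand-in objects. A full version would also need a `response` hook to read the assistant's reply and update the memory store.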

License Terms

The project uses an aggressively restrictive source-available license:

  • No commercial use
  • No private forks
  • Mandatory public ALTERATIONS.md for any logic changes
  • If you port to Claude/GPT-4o/whatever, keep it public per the license

The developer explains: "I've watched too many solo-dev projects get strip-mined, privatized, or turned into paid SaaS while the creator gets zero. This license isn't friendly—it's built to protect the work from exactly those people. If the terms scare you off, that's the point."

📖 Read the full source: r/LocalLLaMA
