Aura Research: Local tool compiles documents into AI-navigable wiki with persistent memory

Aura Research is an open-source tool that compiles raw documents into an AI-navigable wiki with persistent memory. The tool runs 100% locally with no data leaving your machine.
How it works
After installing the package from PyPI, the workflow consists of four main commands:
pip install aura-research
research init my-project
# copy docs into raw/
research ingest raw/
research compile
research query "your question"
You drop in a folder of raw documents (PDFs, papers, notes, code; more than 60 formats are supported) and the LLM compiles them into a structured markdown wiki with backlinked articles, concept pages, and a master index. It then compresses everything into a .aura archive optimized for RAG retrieval, which the developer claims is approximately 97% smaller than the raw source data.
Key design decisions
- No embeddings, no vector databases. It uses SimHash and Bloom filters instead, which the developer says adds zero RAM overhead
- Built-in 3-tier Memory OS (facts / episodic / scratch pad) so the LLM doesn't forget important context across sessions
- The wiki is just .md files, browsable in Obsidian, VS Code, or any markdown editor
- Works with any LLM provider (OpenAI, Anthropic, Gemini) or as an agent-native tool inside Claude Code/Gemini CLI where no API key is needed
- Everything runs locally with no data leaving your machine
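For readers unfamiliar with the two techniques named in the first bullet, here is a minimal sketch of each in plain Python. It is not taken from Aura Research's source: the function and class names (tokens, simhash, hamming, BloomFilter), the 64-bit fingerprint size, and the filter parameters are all assumptions chosen for illustration. SimHash produces a fingerprint in which similar texts tend to differ in few bits, and a Bloom filter answers "possibly present" or "definitely absent" for set membership.

```python
import hashlib
import re

FINGERPRINT_BITS = 64  # assumed size for this sketch


def tokens(text: str) -> list[str]:
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def hash64(token: str, salt: bytes = b"") -> int:
    """64-bit hash of a token; the salt selects an independent hash function."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


def simhash(text: str) -> int:
    """Each token votes +1/-1 per bit position; the sign of each sum sets the bit."""
    votes = [0] * FINGERPRINT_BITS
    for tok in tokens(text):
        h = hash64(tok)
        for i in range(FINGERPRINT_BITS):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, vote in enumerate(votes):
        if vote > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class BloomFilter:
    """Set membership with possible false positives but no false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        for i in range(self.num_hashes):
            yield hash64(item, salt=bytes([i])) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


if __name__ == "__main__":
    a = simhash("the wiki is compiled from raw documents")
    b = simhash("the wiki is compiled from the raw documents")
    print("bits differing:", hamming(a, b))

    seen = BloomFilter()
    for tok in tokens("persistent memory for local LLM sessions"):
        seen.add(tok)
    print("memory" in seen)
```

A lookup built this way needs only integer fingerprints and a bit array rather than a separate embedding model or vector index, which is presumably the appeal; how the tool actually combines the two is not described in the post.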
The "no embeddings" approach
The developer deliberately avoided the standard RAG pipeline (chunk → embed → vector search). Instead, the LLM compiles knowledge into a well-structured wiki with an index. When you query, it reads the index, finds the 2-3 relevant articles, and only loads those. The approach assumes that if knowledge is properly organized, the LLM is smart enough to navigate a good file structure without needing a separate embedding model.
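The query flow described above (read the index, pick a few articles, load only those) can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: the index line format, the parse_index and select_articles names, and the file paths are hypothetical, and simple keyword overlap stands in for the LLM's own judgment about which articles are relevant.

```python
import re


def parse_index(index_text: str) -> dict[str, str]:
    """Parse lines like '- [Title](path.md): summary' into {path: 'Title summary'}.

    The line format is an assumption made for this sketch.
    """
    entries = {}
    pattern = r"^- \[(.+?)\]\((.+?)\)(?::\s*(.*))?$"
    for match in re.finditer(pattern, index_text, re.MULTILINE):
        title, path, summary = match.group(1), match.group(2), match.group(3) or ""
        entries[path] = f"{title} {summary}"
    return entries


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_articles(question: str, index: dict[str, str], k: int = 3) -> list[str]:
    """Return up to k article paths whose index entry shares words with the question.

    Keyword overlap is a stand-in here; in the described tool the LLM reads
    the index and chooses the articles itself.
    """
    question_words = words(question)
    scored = []
    for path, description in index.items():
        overlap = len(question_words & words(description))
        if overlap:
            scored.append((overlap, path))
    scored.sort(key=lambda item: (-item[0], item[1]))  # best overlap first, then path
    return [path for _, path in scored[:k]]


if __name__ == "__main__":
    index_text = (
        "- [SimHash](concepts/simhash.md): fingerprints for near-duplicate detection\n"
        "- [Bloom Filters](concepts/bloom.md): probabilistic set membership\n"
        "- [Memory OS](concepts/memory.md): facts, episodic and scratch pad tiers\n"
    )
    index = parse_index(index_text)
    for path in select_articles("how does set membership work in bloom filters", index):
        print("would load:", path)
```

Only the selected files would then be read into the model's context, which is what keeps the prompt small regardless of how large the wiki grows.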
The tool is available on GitHub at https://github.com/Rtalabs-ai/aura-research and can be installed via PyPI with pip install aura-research.
📖 Read the full source: r/LocalLLaMA