Building an Autonomous Research Agent with C# and Local LLMs

Here's a look at a new autonomous research agent built in C# that runs on local LLMs, specifically Ollama with the llama3.1:8b model. Given a topic, the agent generates search queries, runs them through the Brave Search API, fetches and reads the top sources, extracts the relevant findings, and compiles everything into a structured markdown report.
Key Details
- The agent accepts a topic input, for instance, "persistent memory for AI agents".
- It autonomously formulates 5-8 search queries.
- Searches are executed via the Brave Search API, and the top sources are fetched and analyzed.
- The agent reads through 8-12 sources and extracts 5-8 key findings.
- All LLM inference runs locally through Ollama (llama3.1:8b), with no reliance on OpenAI/Anthropic APIs.
- The output is a markdown report complete with citations.
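To make the search step more concrete, here is a minimal sketch of how a C# agent might query the Brave Search API and turn the response into a list of sources. It is not taken from the project: the `Source` record and the `BraveSearch` class are hypothetical names, and the endpoint, the `X-Subscription-Token` header, and the `web.results` response shape are assumed from Brave's public Web Search API documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical type: one search hit the agent may later fetch and read.
public sealed record Source(string Title, string Url, string Description);

public static class BraveSearch
{
    // Assumed endpoint, per Brave's public Web Search API docs.
    private const string Endpoint = "https://api.search.brave.com/res/v1/web/search";

    public static Uri BuildUri(string query, int count) =>
        new Uri($"{Endpoint}?q={Uri.EscapeDataString(query)}&count={count}");

    // Pure parsing step: reads title/url/description from web.results.
    // Returns an empty list when the response has no web results.
    public static List<Source> ParseResults(string json)
    {
        var sources = new List<Source>();
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty("web", out var web) ||
            web.ValueKind != JsonValueKind.Object ||
            !web.TryGetProperty("results", out var results) ||
            results.ValueKind != JsonValueKind.Array)
        {
            return sources;
        }

        foreach (var item in results.EnumerateArray())
        {
            if (item.ValueKind != JsonValueKind.Object) continue;
            var url = GetString(item, "url");
            if (url.Length == 0) continue; // a hit without a URL cannot be cited
            sources.Add(new Source(GetString(item, "title"), url, GetString(item, "description")));
        }
        return sources;
    }

    private static string GetString(JsonElement item, string name) =>
        item.TryGetProperty(name, out var value) && value.ValueKind == JsonValueKind.String
            ? value.GetString() ?? ""
            : "";

    // Network step: sends the query and delegates to ParseResults.
    public static async Task<List<Source>> SearchAsync(
        HttpClient http, string apiKey, string query, int count = 5)
    {
        using var request = new HttpRequestMessage(HttpMethod.Get, BuildUri(query, count));
        request.Headers.Add("X-Subscription-Token", apiKey);
        request.Headers.Add("Accept", "application/json");
        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return ParseResults(await response.Content.ReadAsStringAsync());
    }
}
```

Keeping the parsing separate from the HTTP call means the extraction logic can be exercised against saved responses without spending free-tier quota.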
Performance and Architecture
The current setup runs CPU-only on a Ryzen 5 5500 with 16GB of RAM and takes approximately 15 minutes per research cycle. The developer notes that 3B models, such as llama3.2, are inadequate for tool calling, which makes 8B the practical minimum for reliable performance.
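For readers unfamiliar with tool calling, the sketch below shows roughly what a tool-enabled request to Ollama's `/api/chat` endpoint (by default at `http://localhost:11434`) could look like from C#, and how the tool calls in the reply might be read back. The `web_search` tool, the `ToolCall` record, and the `OllamaChat` class are hypothetical and not from the project; the request and response field names are assumed from Ollama's API documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical type: a tool invocation requested by the model.
public sealed record ToolCall(string Name, Dictionary<string, string> Arguments);

public static class OllamaChat
{
    // Builds the JSON body for POST /api/chat with one tool declared.
    public static string BuildRequest(string model, string userPrompt)
    {
        var payload = new
        {
            model,
            stream = false,
            messages = new[] { new { role = "user", content = userPrompt } },
            tools = new[]
            {
                new
                {
                    type = "function",
                    function = new
                    {
                        name = "web_search", // hypothetical tool name
                        description = "Search the web for a query",
                        parameters = new
                        {
                            type = "object",
                            properties = new { query = new { type = "string" } },
                            required = new[] { "query" }
                        }
                    }
                }
            }
        };
        return JsonSerializer.Serialize(payload);
    }

    // Reads message.tool_calls from a non-streaming chat response.
    // Entries that lack a function name are skipped rather than trusted.
    public static List<ToolCall> ParseToolCalls(string responseJson)
    {
        var calls = new List<ToolCall>();
        using var doc = JsonDocument.Parse(responseJson);
        var root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty("message", out var message) ||
            message.ValueKind != JsonValueKind.Object ||
            !message.TryGetProperty("tool_calls", out var toolCalls) ||
            toolCalls.ValueKind != JsonValueKind.Array)
        {
            return calls;
        }

        foreach (var call in toolCalls.EnumerateArray())
        {
            if (call.ValueKind != JsonValueKind.Object ||
                !call.TryGetProperty("function", out var fn) ||
                fn.ValueKind != JsonValueKind.Object ||
                !fn.TryGetProperty("name", out var name) ||
                name.ValueKind != JsonValueKind.String)
            {
                continue;
            }

            var args = new Dictionary<string, string>();
            if (fn.TryGetProperty("arguments", out var rawArgs) &&
                rawArgs.ValueKind == JsonValueKind.Object)
            {
                foreach (var prop in rawArgs.EnumerateObject())
                {
                    args[prop.Name] = prop.Value.ValueKind == JsonValueKind.String
                        ? prop.Value.GetString() ?? ""
                        : prop.Value.GetRawText();
                }
            }
            calls.Add(new ToolCall(name.GetString() ?? "", args));
        }
        return calls;
    }
}
```

A smaller model that ignores the tool schema typically shows up here as an empty list or as calls with missing arguments, which is the failure mode the developer describes.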
Two challenges stand out. Findings have to be truncated before synthesis, otherwise the model stalls on lengthy contexts. Even 8B models occasionally produce malformed tool calls, which the agent resolves by retrying with an altered prompt. For memory, the agent uses SQLite paired with embeddings, which is enough at personal scale and removes the need for a dedicated vector database.
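Both workarounds are simple to express in code. The following is a minimal sketch under stated assumptions, not the project's implementation: the character budgets, the `{"name": ..., "arguments": {...}}` tool-call shape, the wording of the retry prompt, and the `SynthesisGuards` class name are all illustrative. The `askModel` delegate stands in for whatever function sends a prompt to the local model.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading.Tasks;

public static class SynthesisGuards
{
    // Cuts each finding to maxPerFinding characters (plus an ellipsis marker)
    // and stops adding findings once the total character budget would be exceeded.
    public static List<string> TruncateFindings(
        IEnumerable<string> findings, int maxPerFinding, int maxTotal)
    {
        var kept = new List<string>();
        var used = 0;
        foreach (var finding in findings)
        {
            var text = finding.Trim();
            if (text.Length > maxPerFinding)
            {
                text = text.Substring(0, maxPerFinding).TrimEnd() + "...";
            }
            if (used + text.Length > maxTotal) break;
            kept.Add(text);
            used += text.Length;
        }
        return kept;
    }

    // Assumed tool-call shape: a JSON object with a string "name"
    // and an object "arguments". Anything else counts as malformed.
    public static bool IsValidToolCall(string raw)
    {
        try
        {
            using var doc = JsonDocument.Parse(raw);
            var root = doc.RootElement;
            return root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty("name", out var name)
                && name.ValueKind == JsonValueKind.String
                && root.TryGetProperty("arguments", out var args)
                && args.ValueKind == JsonValueKind.Object;
        }
        catch (JsonException)
        {
            return false; // not JSON at all
        }
    }

    // Retries with an altered prompt when the reply is malformed.
    // Returns null if no attempt produced a valid tool call.
    public static async Task<string?> CallWithRetryAsync(
        Func<string, Task<string>> askModel, string prompt, int maxAttempts = 3)
    {
        var current = prompt;
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            var raw = await askModel(current);
            if (IsValidToolCall(raw)) return raw;
            current = prompt +
                "\n\nYour previous reply was not a valid tool call. " +
                "Reply with ONLY a JSON object containing \"name\" and \"arguments\".";
        }
        return null;
    }
}
```

Budgets are counted in characters here purely for simplicity; a token-based budget would track the model's context window more closely.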
Technology Stack
- C# / .NET 8
- Ollama
- SQLite
- Brave Search API (free tier)
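As an illustration of why SQLite plus embeddings can stand in for a vector database at this scale, here is a rough sketch of the similarity side of such a memory. It assumes embeddings are stored as raw float bytes in a BLOB column and ranked with a brute-force cosine scan. The `MemoryRow` record and the `EmbeddingMemory` class are hypothetical, and the SQLite reads and writes themselves are left out (in .NET they would typically go through a package such as Microsoft.Data.Sqlite).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape: id, remembered text, and the embedding as a BLOB.
public sealed record MemoryRow(long Id, string Text, byte[] EmbeddingBlob);

public static class EmbeddingMemory
{
    // Packs a float vector into bytes suitable for a SQLite BLOB column.
    public static byte[] ToBlob(float[] vector)
    {
        var bytes = new byte[vector.Length * sizeof(float)];
        Buffer.BlockCopy(vector, 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Inverse of ToBlob (same machine byte order assumed).
    public static float[] FromBlob(byte[] blob)
    {
        var vector = new float[blob.Length / sizeof(float)];
        Buffer.BlockCopy(blob, 0, vector, 0, vector.Length * sizeof(float));
        return vector;
    }

    // Cosine similarity; returns 0 when either vector has zero length.
    public static double Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }
        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        return normA == 0 || normB == 0 ? 0 : dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Brute-force scan: score every stored row against the query, keep the best k.
    public static List<(MemoryRow Row, double Score)> TopK(
        IEnumerable<MemoryRow> rows, float[] query, int k) =>
        rows.Select(r => (Row: r, Score: Cosine(FromBlob(r.EmbeddingBlob), query)))
            .OrderByDescending(x => x.Score)
            .Take(k)
            .ToList();
}
```

A linear scan like this grows with the number of stored rows, which is acceptable for a personal knowledge store but is the point at which a dedicated vector index would start to pay off.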
For developers interested in building their own agent, there's a starter kit and an 8-chapter guide available on the project's GitHub repository, provided under the MIT license, along with the full source code: hex-dynamics.
📖 Read the full source: r/LocalLLaMA