McpVanguard: Open-source security proxy for MCP-based AI agents

McpVanguard is an open-source security proxy and firewall designed specifically for local AI agents using the Model Context Protocol (MCP). It addresses security concerns that arise when giving LLMs access to tools like terminals or filesystems.
How it works
The proxy sits between the AI agent and its MCP tools, wrapping existing MCP servers so the current setup does not have to be rewritten. It can run locally as a lightweight proxy or be deployed as a cloud gateway; a Railway template is available to simplify deployment.
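The interception step described above can be pictured with a minimal sketch. It assumes the proxy sees MCP's JSON-RPC 2.0 messages and inspects only `tools/call` requests; `handle_message`, `inspect`, and `forward` are hypothetical names for illustration and are not taken from McpVanguard's code.

```python
import json
from typing import Callable, Optional

# Hypothetical callables: `inspect` returns a block reason or None,
# `forward` relays the message to the wrapped MCP server.
Inspector = Callable[[dict], Optional[str]]
Forwarder = Callable[[dict], dict]

def handle_message(raw: str, inspect: Inspector, forward: Forwarder) -> dict:
    """Inspect one JSON-RPC message from the agent, then block or forward it."""
    msg = json.loads(raw)
    # Only tool invocations are inspected in this sketch; other methods pass through.
    if msg.get("method") == "tools/call":
        reason = inspect(msg.get("params", {}))
        if reason is not None:
            # Answer with a JSON-RPC error instead of contacting the real server.
            return {
                "jsonrpc": "2.0",
                "id": msg.get("id"),
                "error": {"code": -32000, "message": f"blocked: {reason}"},
            }
    return forward(msg)
```

Because the agent still receives a well-formed JSON-RPC reply, a blocked call looks like an ordinary tool error from its point of view, which is one plausible reason a wrapper like this needs no changes on the agent side.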
Security layers
- Rules/signature engine: Contains around 50 YAML signatures that detect common attacks such as reverse shells, SSRF attempts, and other obvious threats. This layer adds approximately 16 ms of latency.
- Semantic scoring layer (optional): When a request looks suspicious but is not clearly malicious, it can be passed to a small LLM (served through Ollama or OpenAI) that assesses its intent.
- Behavioral monitoring: Blocks anomalous patterns, such as an agent attempting to read hundreds of files in a short time.
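To make the first and third layers concrete, here is a rough sketch of signature matching and burst detection. The three patterns, the thresholds, and the names `SIGNATURES`, `match_signature`, and `ReadRateMonitor` are all assumptions made for illustration; the project's actual rules live in YAML files that are not reproduced here.

```python
import re
import time
from collections import deque

# Hypothetical signatures, standing in for the project's YAML rules.
SIGNATURES = {
    "reverse_shell": re.compile(r"bash\s+-i\s+>&\s*/dev/tcp/"),
    "ssrf_metadata": re.compile(r"169\.254\.169\.254"),
    "path_traversal": re.compile(r"\.\./"),
}

def match_signature(text: str):
    """Return the name of the first signature that matches, or None."""
    for name, pattern in SIGNATURES.items():
        if pattern.search(text):
            return name
    return None

class ReadRateMonitor:
    """Sliding-window counter that rejects bursts of file reads."""

    def __init__(self, max_reads: int = 100, window_seconds: float = 10.0):
        self.max_reads = max_reads
        self.window = window_seconds
        self.events = deque()

    def allow(self, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # Discard timestamps that have fallen out of the window.
        while self.events and now - self.events[0] > self.window:
            self.events.popleft()
        if len(self.events) >= self.max_reads:
            return False
        self.events.append(now)
        return True
```

Regex checks like these are cheap, which is consistent with a rules layer running on every request, while a per-agent sliding window catches volume-based anomalies that no single request would reveal.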
Audit capabilities
Every blocked request is recorded in an immutable audit log that's cryptographically signed and stored locally, providing a verifiable record of what was blocked and why.
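The source does not say which signing scheme the audit log uses. As one possible illustration, the sketch below chains entries with HMAC-SHA256 so that editing or removing an earlier entry invalidates everything after it; `append_entry`, `verify_log`, and the in-memory list are hypothetical, and a real tool might use asymmetric signatures and an append-only file instead.

```python
import hashlib
import hmac
import json

def _sign(key: bytes, prev: str, event: dict) -> str:
    # The signature covers the previous signature plus the event, forming a chain.
    payload = json.dumps({"prev": prev, "event": event}, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def append_entry(log: list, key: bytes, event: dict) -> dict:
    """Append a signed entry that is linked to the previous entry's signature."""
    prev = log[-1]["sig"] if log else ""
    entry = {"prev": prev, "event": event, "sig": _sign(key, prev, event)}
    log.append(entry)
    return entry

def verify_log(log: list, key: bytes) -> bool:
    """Recompute the chain and report whether every entry is intact."""
    prev = ""
    for entry in log:
        expected = _sign(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(expected, entry["sig"]):
            return False
        prev = entry["sig"]
    return True
```

With a shared-secret HMAC, anyone holding the key can forge entries, so this design only shows tamper evidence against parties without the key.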
The tool was developed to address specific security concerns with MCP implementations, including prompt injection, path traversal, and accidental directory deletion by AI agents.
📖 Read the full source: r/LocalLLaMA