PromptForest: Local-First Prompt Injection Detection with Uncertainty

PromptForest is a new local-first library for detecting prompt injections and jailbreaks. It targets two problems common in current detectors: slow inference and overconfident predictions. Alongside each verdict it reports a measure of uncertainty, so that a borderline result is not presented as a certain one. The goal is to provide these more nuanced outputs without sacrificing performance.
Key Details
One of the fundamental issues with existing injection detectors is their reliance on large models like Llama 2 8B and Qualifire Sentinel 0.6B. These models are slow, and their overconfidence can produce false positives that undermine trust in production. To address these limitations, PromptForest uses a voting ensemble of three smaller, specialized models:
- Llama Prompt Guard (86M): Offers the best pre-ensemble calibration, measured by Expected Calibration Error (ECE, where lower is better), in its weight class.
- Vijil Dome (ModernBERT): Delivers the highest accuracy per parameter.
- Custom XGBoost: Trained on embeddings for architectural diversity.
The three models are combined by weighted soft voting: their predicted probabilities are averaged, with more accurate models given larger weights. Because soft voting averages probabilities rather than hard labels, the combined score reflects how confident each model is, not just which label it picked.
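Weighted soft voting can be sketched in a few lines. This is a minimal illustration, not PromptForest's actual code: the `Vote` type, the model names, the weights, and the probabilities are all hypothetical, and using the weighted spread of the votes as a disagreement score is an assumption about how an uncertainty measure could be derived.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Vote:
    name: str           # hypothetical model label
    p_malicious: float  # model's probability that the prompt is an injection
    weight: float       # assumed to reflect the model's validation accuracy


def weighted_soft_vote(votes, threshold=0.5):
    """Combine per-model probabilities into one score plus a disagreement value."""
    total = sum(v.weight for v in votes)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean of the probabilities: the soft-voting score.
    p = sum(v.weight * v.p_malicious for v in votes) / total
    # Weighted standard deviation around that mean (assumed uncertainty proxy).
    var = sum(v.weight * (v.p_malicious - p) ** 2 for v in votes) / total
    return {
        "p_malicious": p,
        "is_malicious": p >= threshold,
        "disagreement": sqrt(var),
    }


# Illustrative values only; not measured outputs of the real models.
votes = [
    Vote("prompt_guard", 0.92, 0.88),
    Vote("vijil_dome", 0.85, 0.90),
    Vote("xgboost_embeddings", 0.40, 0.80),
]
print(weighted_soft_vote(votes))
```

A high disagreement value alongside a score near the threshold is the kind of case where a caller might route the prompt to review instead of blocking it outright.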
The reported benchmarks show a mean latency of ~141ms for PromptForest, compared with ~225ms for Qualifire Sentinel v2. Accuracy is lower, at 90% versus Sentinel's 97%, but calibration is better: PromptForest's ECE is 0.070 against Sentinel's 0.096 (lower is better). Throughput is approximately 27 prompts per second on a consumer GPU using the pfranger CLI.
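For readers unfamiliar with ECE, the sketch below shows the standard equal-width binning definition: predictions are grouped by confidence, and the gap between average confidence and observed accuracy in each bin is averaged, weighted by bin size. The function name, the choice of 10 bins, and the sample data are assumptions for illustration; the source does not say which binning the benchmark used.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum over bins of (bin size / n) * |accuracy - mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Made-up example: four predictions at 90% confidence, three of them correct.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```

In the made-up example the model claims 90% confidence but is right 75% of the time, so it is overconfident by about 0.15. A well-calibrated detector keeps this gap small across all bins.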
To try it out, developers can experiment with PromptForest on Google Colab or audit prompts with PFRanger, a tool that runs entirely locally and parallelizes its work to improve speed and throughput.
📖 Read the full source: r/LocalLLaMA