Open-Source GLiGuard: 300M Safety Model 16x Faster

Fastino Labs has open-sourced GLiGuard, a safety moderation model that replaces generative guardrails with a classification approach. The 300M-parameter encoder handles four moderation tasks in one forward pass, and is reported to reach accuracy comparable to 7B–27B parameter decoder models while cutting latency by up to 16x. Weights are available under Apache 2.0 on Hugging Face, and hosted inference is available on Pioneer.

Why decoder-based guardrails are slow

Current state-of-the-art guardrails (e.g., Llama Guard) use decoder-only transformers that generate verdicts token by token. This sequential generation makes them slow and expensive for real-time safety filtering. Most also evaluate safety dimensions separately, compounding latency. At 7B to 27B parameters, these models are costly to run at production scale.
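
To see why sequential generation and per-dimension evaluation compound, a rough cost model can help. The sketch below is only an illustration: the function names and every number in it are placeholders chosen for the example, not measurements of Llama Guard, GLiGuard, or any other model.

```python
# Illustrative latency model. All numbers are placeholders, not measurements.

def decoder_latency_ms(prefill_ms: float, per_token_ms: float,
                       verdict_tokens: int, dimensions: int) -> float:
    """Autoregressive guardrail, assuming one call per safety dimension.

    Each call pays for prompt prefill plus one decode step per generated
    verdict token, and the calls run one after another.
    """
    per_call = prefill_ms + per_token_ms * verdict_tokens
    return dimensions * per_call


def encoder_latency_ms(forward_ms: float) -> float:
    """Single-pass classifier: one forward pass covers every label."""
    return forward_ms


if __name__ == "__main__":
    # Hypothetical timings, used only to show how the cost scales.
    for dims in (1, 2, 4):
        total = decoder_latency_ms(prefill_ms=40.0, per_token_ms=20.0,
                                   verdict_tokens=5, dimensions=dims)
        print(f"{dims} dimension(s): {total:.0f} ms (decoder, hypothetical)")
```

Under these assumptions the decoder cost grows linearly with both the verdict length and the number of dimensions checked, which is the compounding effect described above.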

GLiGuard's encoder approach

GLiGuard reframes moderation as text classification. It encodes both input text and task labels together, scoring all labels simultaneously in a single pass. Adding more safety dimensions (labels) does not add inference time. The model handles four concurrent tasks:

Safety classification — safe / unsafe for both user prompts and model responses
Jailbreak strategy detection — 11 categories (prompt injection, roleplay bypass, instruction override, social engineering, etc.)
Harm category detection — 14 categories (violence, sexual content, hate speech, PII, misinformation, child safety, copyright violation, etc.)
Refusal detection — compliance or refusal, used to measure over-refusal and false compliance

All four are evaluated together, whereas decoder models would require sequential passes or multiple model calls.
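
As a minimal sketch of how the refusal task could feed the over-refusal and false-compliance measurements mentioned above, the example below combines a prompt safety verdict with a refusal verdict. It assumes the usual definitions (over-refusal: a safe prompt is refused; false compliance: an unsafe prompt is answered). The `Verdict` class and both function names are hypothetical and do not reflect GLiGuard's actual output format or API.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    """Hypothetical per-exchange record holding two of the four task results."""
    prompt_unsafe: bool  # safety classification of the user prompt
    refused: bool        # refusal detection on the model response


def over_refusal_rate(verdicts: list[Verdict]) -> float:
    """Share of safe prompts that the model refused anyway."""
    safe = [v for v in verdicts if not v.prompt_unsafe]
    return sum(v.refused for v in safe) / len(safe) if safe else 0.0


def false_compliance_rate(verdicts: list[Verdict]) -> float:
    """Share of unsafe prompts that the model complied with."""
    unsafe = [v for v in verdicts if v.prompt_unsafe]
    return sum(not v.refused for v in unsafe) / len(unsafe) if unsafe else 0.0


# Made-up verdicts for illustration only.
sample = [
    Verdict(prompt_unsafe=False, refused=False),
    Verdict(prompt_unsafe=False, refused=True),   # over-refusal
    Verdict(prompt_unsafe=True, refused=True),
    Verdict(prompt_unsafe=True, refused=False),   # false compliance
]

print(over_refusal_rate(sample), false_compliance_rate(sample))
```

Both rates fall out of the same batch of verdicts, so with a single-pass classifier no extra model call is needed to track them.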

Benchmarks and performance

Across nine safety benchmarks, GLiGuard reportedly matches or exceeds models 23–90x its size while running up to 16x faster. The post gives no specific accuracy numbers; it claims only that performance is comparable to leading generative guardrails.
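
The 23–90x size range follows directly from the parameter counts quoted in this article (300M against 7B and 27B). A quick check, with variable names chosen only for this example:

```python
GLIGUARD_PARAMS = 300e6        # 300M, as stated in the article
BASELINE_PARAMS = (7e9, 27e9)  # 7B and 27B decoder guardrails

# Ratio of each baseline's parameter count to GLiGuard's.
ratios = [baseline / GLIGUARD_PARAMS for baseline in BASELINE_PARAMS]
print([round(r, 1) for r in ratios])
```

This covers model size only. The 16x latency figure is a separate claim from the source and cannot be derived from parameter counts.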

Who it's for

Teams deploying LLM agents or chat systems that need low-latency, cost-effective real-time safety filtering at scale.

📖 Read the full source: HN AI Agents

