KnightClaw: Local Security Extension for OpenClaw Agents

KnightClaw is a security extension designed to protect OpenClaw AI coding agents from adversarial prompts. The tool addresses a specific threat model where a single malicious message in the context window can cause an agent to follow attacker instructions instead of user commands.
Core Features
KnightClaw operates as a drop-in extension with no configuration required, no API keys, and no cloud dependency. It intercepts every message before it reaches the agent.
Detection System
The guard is described as using an 8-layer hybrid detection approach. The source names seven techniques:
- Regex patterns
- Homoglyph detection
- Boundary token analysis
- Perplexity scoring
- Entropy analysis
- Heuristics
- Semantic embeddings (using a local, quantized BGE model)
Block decisions are reported to take microseconds.
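To make the layering concrete, here is a minimal Python sketch of three of the listed techniques (regex patterns, homoglyph detection, and entropy analysis) chained so that the cheapest checks run first. It is not KnightClaw's code: the patterns, the look-alike map, the thresholds, and the names `check_message`, `shannon_entropy`, `INJECTION_PATTERNS`, and `HOMOGLYPHS` are all assumptions made for illustration.

```python
import math
import re
import unicodedata
from collections import Counter

# Hypothetical patterns for illustration; not KnightClaw's actual rule set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"you are now in developer mode", re.IGNORECASE),
]

# Tiny illustrative map of Cyrillic look-alikes to Latin letters.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o",
    "\u0440": "p", "\u0441": "c", "\u0456": "i",
})


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def check_message(text: str, entropy_threshold: float = 5.0,
                  min_entropy_len: int = 40) -> tuple[bool, str]:
    """Return (blocked, reason), running the cheap layers first."""
    raw = unicodedata.normalize("NFKC", text)   # fold compatibility forms
    folded = raw.translate(HOMOGLYPHS)          # map look-alikes to ASCII
    for pattern in INJECTION_PATTERNS:
        if pattern.search(raw):
            return True, "regex"
        if pattern.search(folded):
            # The pattern only matches once look-alikes are folded away.
            return True, "homoglyph"
    # Long, near-uniform character distributions suggest encoded payloads.
    if len(folded) >= min_entropy_len and shannon_entropy(folded) > entropy_threshold:
        return True, "entropy"
    return False, "allow"
```

Ordering the layers by cost means most hostile messages can be rejected before an expensive step, such as an embedding comparison, is needed. That is one plausible reading of how a hybrid detector keeps block latency low.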
Additional Security Measures
- Egress redaction: Strips secrets from outbound responses before they leave the agent
- Hash-chained audit logs: Tamper-evident, append-only logs with a full timeline of every block, allow, and config change
- Velocity circuit breaker: 10 blocks in 60 seconds triggers automatic lockdown with no manual intervention
- Kill switch: One command stops everything:
openclaw knight lockdown on
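Two of the measures above, the hash-chained audit log and the velocity circuit breaker, can be sketched in a few lines of Python. This is an illustrative sketch only: the class names `AuditLog` and `VelocityBreaker`, the choice of SHA-256 over JSON-serialized entries, and the in-memory storage are assumptions, not details taken from KnightClaw. The 10-blocks-in-60-seconds threshold comes from the description above.

```python
import hashlib
import json
import time
from collections import deque
from typing import Optional

GENESIS = "0" * 64  # hypothetical placeholder "previous hash" for the first entry


class AuditLog:
    """Append-only log in which each entry commits to the previous entry's hash."""

    def __init__(self) -> None:
        self.entries: list = []

    @staticmethod
    def _digest(entry: dict) -> str:
        # Hash only the content fields, in a stable key order.
        payload = {k: entry[k] for k in ("ts", "event", "detail", "prev")}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def append(self, event: str, detail: str, ts: Optional[float] = None) -> dict:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        entry = {"ts": time.time() if ts is None else ts,
                 "event": event, "detail": detail, "prev": prev}
        entry["hash"] = self._digest(entry)
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; an edited or reordered entry breaks it."""
        prev = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True


class VelocityBreaker:
    """Locks down once max_blocks blocks fall inside a sliding time window."""

    def __init__(self, max_blocks: int = 10, window_seconds: float = 60.0) -> None:
        self.max_blocks = max_blocks
        self.window = window_seconds
        self.block_times: deque = deque()
        self.locked = False

    def record_block(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        self.block_times.append(now)
        # Discard blocks that have aged out of the window.
        while now - self.block_times[0] > self.window:
            self.block_times.popleft()
        if len(self.block_times) >= self.max_blocks:
            self.locked = True  # stays locked until something resets it
        return self.locked
```

A chain like this detects tampering but does not prevent it: anyone who can rewrite the whole log can also recompute every hash. Detection is only meaningful if the latest hash is also kept somewhere the attacker cannot reach.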
Technical Details
The extension runs entirely locally with zero telemetry and is MIT licensed. The source is available for testing and contribution.
📖 Read the full source: r/openclaw