KnightClaw: Local Security Extension for OpenClaw Agents

✍️ OpenClawRadar · 📅 Published: February 23, 2026

KnightClaw is a security extension designed to protect OpenClaw AI coding agents from adversarial prompts. The tool addresses a specific threat model where a single malicious message in the context window can cause an agent to follow attacker instructions instead of user commands.

Core Features

KnightClaw operates as a drop-in extension with no configuration required, no API keys, and no cloud dependency. It intercepts every message before it reaches the agent.

Detection System

The guard is described as using an 8-layer hybrid detection approach. The source names seven of those layers:

Regex patterns
Homoglyph detection
Boundary token analysis
Perplexity scoring
Entropy analysis
Heuristics
Semantic embeddings (using a local, quantized BGE model)

According to the announcement, blocks occur in microseconds.
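
As a rough illustration of how a few of these layers might fit together, the following Python sketch chains homoglyph normalization, regex matching, and entropy scoring. It is not KnightClaw's implementation: the patterns, the look-alike map, the 5.0 bits-per-character threshold, and every function name here are assumptions made for the example.

```python
import math
import re
import unicodedata
from collections import Counter

# Hypothetical patterns for illustration; not KnightClaw's actual rule set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"you are now in developer mode", re.IGNORECASE),
]

# Tiny hypothetical confusables map (Cyrillic look-alikes to Latin letters).
HOMOGLYPHS = str.maketrans(
    {"\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p", "\u0441": "c", "\u0456": "i"}
)


def normalize(text: str) -> str:
    """Fold Unicode compatibility forms, then map known look-alikes to ASCII."""
    return unicodedata.normalize("NFKC", text).translate(HOMOGLYPHS)


def has_homoglyphs(text: str) -> bool:
    """True if the text contains a character from the look-alike map."""
    return text.translate(HOMOGLYPHS) != text


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def check_message(text: str, entropy_threshold: float = 5.0, min_length: int = 64):
    """Return (blocked, reason); the cheap regex layer runs first."""
    cleaned = normalize(text)
    for pattern in INJECTION_PATTERNS:
        if pattern.search(cleaned):
            return True, "regex"
    # Long, high-entropy text may be an encoded or obfuscated payload.
    if len(text) >= min_length and shannon_entropy(text) > entropy_threshold:
        return True, "entropy"
    return False, "allow"
```

Running the regex layer on normalized text means a Cyrillic "о" swapped into "ignore" does not slip past the pattern. A real guard would presumably reserve the slower layers, such as embeddings, for messages the cheap checks cannot decide.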

Additional Security Measures

Egress redaction: Strips secrets from outbound responses before they leave the agent
Hash-chained audit logs: Tamper-evident, append-only logs with a full timeline of every block, allow, and config change
Velocity circuit breaker: 10 blocks in 60 seconds triggers automatic lockdown with no manual intervention
Kill switch: A single command, openclaw knight lockdown on, stops everything
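
The first three measures above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the extension's actual code: the secret pattern, the class names AuditLog and VelocityBreaker, and the log entry layout are all hypothetical. Only the 10-blocks-in-60-seconds threshold comes from the article.

```python
import hashlib
import json
import re
import time
from collections import deque

# Hypothetical secret shapes; a real redactor would cover many more formats.
SECRET_PATTERN = re.compile(r"\b(?:sk-[A-Za-z0-9]{16,}|AKIA[0-9A-Z]{16})\b")


def redact(text):
    """Replace anything that looks like a secret before it leaves the agent."""
    return SECRET_PATTERN.sub("[REDACTED]", text)


class AuditLog:
    """Append-only log where each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev_hash, event):
        payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, event):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry = {"event": event, "prev": prev_hash, "hash": self._digest(prev_hash, event)}
        self.entries.append(entry)

    def verify(self):
        """Recompute the chain. Editing or reordering entries breaks it;
        truncating the tail is not detected by this sketch."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            expected = self._digest(prev_hash, entry["event"])
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


class VelocityBreaker:
    """Trips into lockdown once max_blocks blocks fall within window_s seconds."""

    def __init__(self, max_blocks=10, window_s=60.0):
        self.max_blocks = max_blocks
        self.window_s = window_s
        self.block_times = deque()
        self.locked_down = False

    def record_block(self, now=None):
        now = time.monotonic() if now is None else now
        self.block_times.append(now)
        # Drop blocks that have aged out of the sliding window.
        while now - self.block_times[0] > self.window_s:
            self.block_times.popleft()
        if len(self.block_times) >= self.max_blocks:
            self.locked_down = True  # stays set until an operator resets it
        return self.locked_down
```

Because each hash covers the previous one, changing an old entry invalidates every later hash. That makes tampering detectable on verification rather than impossible.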

Technical Details

The extension runs entirely local with zero telemetry and is MIT licensed. The source is available for testing and contribution.

📖 Read the full source: r/openclaw
