CC-Canary: Detect Regressions in Claude Code with Local JSONL Analysis

CC-Canary is a drift detection tool for Claude Code, packaged as two installable Agent Skills. It scans the JSONL session logs that Claude Code already writes to ~/.claude/projects/, detects whether the model has been drifting on your own work, and produces a shareable forensic report. No network, no account, no telemetry, no background daemon — runs on data already on your disk. Status: 0.x / pre-alpha.
Installation
Install via npx skills:
npx skills add delta-hq/cc-canary
Or install individual skills:
npx skills add delta-hq/cc-canary --skill cc-canary
npx skills add delta-hq/cc-canary --skill cc-canary-html
Requirements: Python 3.8+ on PATH. Auto-opening the HTML report works on macOS, Linux, and WSL; otherwise the tool falls back to printing the report path.
Usage
From a Claude Code session:
/cc-canary 60d
/cc-canary-html 30d
The window defaults to 60 days; accepts 7d, 14d, 30d, 60d, 90d, 180d.
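As a rough illustration of how a window argument like this could be turned into a cutoff timestamp for filtering sessions, here is a minimal sketch. The helper name `window_cutoff` and the strict validation are assumptions made for this example; only the list of accepted values and the 60-day default come from the article.

```python
from datetime import datetime, timedelta, timezone

# Accepted values and the 60d default are taken from the article.
ALLOWED_WINDOWS = ("7d", "14d", "30d", "60d", "90d", "180d")


def window_cutoff(window="60d", now=None):
    """Return the earliest timestamp inside the window (hypothetical helper)."""
    if window not in ALLOWED_WINDOWS:
        raise ValueError(f"unsupported window: {window!r}")
    days = int(window[:-1])  # strip the trailing "d"
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=days)
```

Sessions whose timestamps fall before the returned cutoff would then be skipped during the scan.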
What You Get
- Verdict — HOLDING / SUSPECTED REGRESSION / CONFIRMED REGRESSION / INCONCLUSIVE
- Headline metrics table — pre vs post comparison with green/yellow/red bands
- Weekly trend bars — cost (USD, verified against ccusage), read:edit ratio, reasoning loops, tokens/turn
- Cross-version comparison — same user, different model versions, controlling for task mix
- Auto-detected inflection date — composite health-score break
- Findings with model-side / user-side / ambiguous classification
- Appendices — hour-of-day thinking depth, word-frequency shift, three-period thinking-visibility transition, per-turn behavior rates
Metrics Tracked
- Read:Edit ratio — file reads per edit; proxy for investigation thoroughness
- Write share of mutations — Write / (Edit + Write); high share = rewriting instead of surgical edits
- Reasoning loops / 1K tool calls — phrases like "let me try again", "oh wait", "actually"
- Frustration rate — rate of frustration words in your prompts
- Thinking redaction rate — fraction of thinking blocks redacted vs visible
- Mean thinking length — reasoning-depth proxy
- API turns per user turn — API calls per user message
- Tokens per user turn — total token volume per user message
Plus appendices for premature stopping, self-admitted errors, shortcut vocabulary, user interrupts, etc.
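To make the first few definitions concrete, the sketch below computes three of these metrics from a flat list of tool names and assistant texts. It is not the tool's actual code: the function names, the input shape, and the choice to count only Edit calls in the read:edit denominator are assumptions, and the phrase list contains only the three examples quoted above.

```python
import re
from collections import Counter

# Only the phrases quoted in the article; the real list is presumably longer.
LOOP_PHRASES = ("let me try again", "oh wait", "actually")
LOOP_RE = re.compile("|".join(re.escape(p) for p in LOOP_PHRASES), re.IGNORECASE)


def ratio(numerator, denominator):
    """Divide, returning None when the denominator is zero."""
    return numerator / denominator if denominator else None


def session_metrics(tool_calls, assistant_texts):
    """tool_calls: list of tool names; assistant_texts: list of message strings."""
    counts = Counter(tool_calls)
    reads, edits, writes = counts["Read"], counts["Edit"], counts["Write"]
    # Count every occurrence of a loop phrase across all assistant messages.
    loops = sum(len(LOOP_RE.findall(text)) for text in assistant_texts)
    return {
        "read_edit_ratio": ratio(reads, edits),          # assumption: Read / Edit
        "write_share": ratio(writes, edits + writes),    # Write / (Edit + Write)
        "loops_per_1k_tool_calls": ratio(1000 * loops, len(tool_calls)),
    }
```

Returning None for an empty denominator keeps sessions with no edits or no tool calls from skewing an average with a misleading zero.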
How It Works
- Scan — Python script (stdlib only) walks ~/.claude/projects/**/*.jsonl, filters by window, excludes subagent sessions.
- Dedupe — Assistant messages deduped on (message.id, requestId) because Claude Code writes the same message into multiple JSONLs when sessions are resumed or branched.
- Aggregate — Per-session metrics: tool-mix, read:edit ratio, reasoning-loop phrases, self-admitted errors, premature stops, interrupts, token usage, cost (current Claude 4.x rates), hour-of-day thinking depth.
- Detect inflection — Composite health score per day; argmax of |before − after| over candidate dates with 0.75σ floor. Falls back to median-timestamp split if no break clears.
- Pre-render report — Script writes markdown/HTML skeleton with every table and bar chart filled in. ~20 narrative slots left for Claude to fill.
- Fill & save — Claude reads skeleton, writes narrative, saves final file. Total runtime: ~2.5s script + 10–20s Claude narrative.
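The dedupe and inflection steps above can be sketched in a few lines of stdlib Python. This is one plausible reading of the description, not the project's implementation: the record layout (`type`, `message.id`, `requestId` as top-level JSON keys), the use of segment means and a population standard deviation, the `min_side` guard, and the middle-day fallback (standing in for the median-timestamp split) are all assumptions.

```python
import json
from statistics import mean, pstdev


def dedupe_assistant(lines):
    """Keep the first assistant record per (message.id, requestId) pair."""
    seen, kept = set(), []
    for line in lines:
        record = json.loads(line)
        if record.get("type") != "assistant":  # assumed field name
            continue
        key = (record.get("message", {}).get("id"), record.get("requestId"))
        if key not in seen:
            seen.add(key)
            kept.append(record)
    return kept


def find_inflection(daily_scores, min_side=3, floor_sigmas=0.75):
    """daily_scores: non-empty list of (day, score) sorted by day.

    Returns (day, cleared_floor). The day is the first day of the
    "after" segment; cleared_floor is False when the fallback was used.
    """
    days = [d for d, _ in daily_scores]
    scores = [s for _, s in daily_scores]
    sigma = pstdev(scores)
    best_gap, best_idx = 0.0, None
    # Try every split that leaves at least min_side days on each side.
    for i in range(min_side, len(scores) - min_side + 1):
        gap = abs(mean(scores[:i]) - mean(scores[i:]))
        if gap > best_gap:
            best_gap, best_idx = gap, i
    if best_idx is not None and sigma > 0 and best_gap >= floor_sigmas * sigma:
        return days[best_idx], True
    # No break cleared the floor: fall back to splitting at the middle day.
    return days[len(days) // 2], False
```

The floor expressed in units of σ keeps ordinary day-to-day noise from being reported as a break, which is presumably why the report also has an INCONCLUSIVE verdict.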
📖 Read the full source: HN AI Agents