Sense: Go SDK for LLM-powered test assertions and structured text extraction

What Sense does
Sense is a Go SDK that leverages Claude for two primary use cases: evaluating non-deterministic output in tests and extracting structured data from unstructured text.
Key features
1. LLM-powered test assertions:
- Write expectations in plain English instead of rigid assertions
- Get structured feedback on failures: what passed, what failed, and why, with evidence and confidence scores
- Example usage:
s.Assert(t, agentOutput).Expect("produces valid Go code").Expect("handles errors idiomatically").Run()
2. Structured text extraction:
- Extract typed structs from unstructured text
- Define a struct and pass a pointer; the schema is generated via reflection
- Schema enforcement happens server-side through Claude's forced tool_use
- Example usage:
var m MountError
s.Extract("device /dev/sdf already mounted with vol-0abc123", &m).Run()
fmt.Println(m.Device) // "/dev/sdf"
- Useful for log parsing, support tickets, and API normalization beyond just testing
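To make the "schema is generated via reflection" step concrete, here is a minimal sketch of how a JSON-Schema-like description could be derived from a pointer to a struct. It is not Sense's implementation: the `schemaFor` and `jsonType` helpers, the `json` struct tags, and the `Volume` field are all assumptions made for illustration (the source only shows `m.Device`).

```go
package main

import (
	"reflect"
	"strings"
)

// MountError is a hypothetical target struct. Only Device appears in the
// source; Volume and the json tags are assumptions for this sketch.
type MountError struct {
	Device string `json:"device"`
	Volume string `json:"volume"`
}

// jsonType maps a Go kind to a JSON Schema type name.
func jsonType(k reflect.Kind) string {
	switch k {
	case reflect.String:
		return "string"
	case reflect.Bool:
		return "boolean"
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		return "integer"
	case reflect.Float32, reflect.Float64:
		return "number"
	case reflect.Slice, reflect.Array:
		return "array"
	default:
		return "object"
	}
}

// schemaFor builds a minimal object schema from a pointer to a struct.
// It returns nil when target is not a pointer to a struct.
func schemaFor(target any) map[string]any {
	t := reflect.TypeOf(target)
	if t == nil || t.Kind() != reflect.Ptr || t.Elem().Kind() != reflect.Struct {
		return nil
	}
	t = t.Elem()

	props := map[string]any{}
	required := []string{}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue // unexported fields cannot be populated from outside
		}
		name := f.Name
		if tag, ok := f.Tag.Lookup("json"); ok {
			tagName := strings.Split(tag, ",")[0]
			if tagName == "-" {
				continue // field explicitly excluded
			}
			if tagName != "" {
				name = tagName
			}
		}
		props[name] = map[string]any{"type": jsonType(f.Type.Kind())}
		required = append(required, name)
	}
	return map[string]any{
		"type":       "object",
		"properties": props,
		"required":   required,
	}
}
```

A schema of this shape is the kind of object that could be supplied as a tool's input schema, with the tool choice forced so the model's reply has to conform to it. Whether Sense builds its schema this way is not stated in the source.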
Additional functionality
- Eval for programmatic results
- Compare for A/B testing
- Batching support with 50% cost savings
- Evaluator and Extractor interfaces for mocking
- Includes 135+ tests
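The value of exposing interfaces for mocking is that code depending on an LLM judgment can be unit-tested without network calls. The sketch below shows the general pattern only. `Judge`, `Verdict`, `fakeJudge`, and `CheckOutput` are hypothetical names invented here; the source does not give the signatures of Sense's Evaluator or Extractor interfaces.

```go
package main

import "errors"

// Verdict is a hypothetical result type, loosely modeled on the
// "passed, why, confidence" feedback described above.
type Verdict struct {
	Passed     bool
	Reason     string
	Confidence float64
}

// Judge is a hypothetical stand-in for an evaluator interface.
// It is NOT Sense's actual Evaluator signature.
type Judge interface {
	Evaluate(output string, expectations []string) (Verdict, error)
}

// fakeJudge returns a canned verdict and counts calls, so tests run
// deterministically and never reach a real model.
type fakeJudge struct {
	verdict Verdict
	calls   int
}

func (f *fakeJudge) Evaluate(output string, expectations []string) (Verdict, error) {
	f.calls++
	return f.verdict, nil
}

// CheckOutput is example application code that depends only on the
// interface, so a real or fake implementation can be injected.
func CheckOutput(j Judge, output string) error {
	v, err := j.Evaluate(output, []string{"produces valid Go code"})
	if err != nil {
		return err
	}
	if !v.Passed {
		return errors.New("expectation failed: " + v.Reason)
	}
	return nil
}
```

In a test, `CheckOutput(&fakeJudge{verdict: Verdict{Passed: true}}, "package main")` exercises the calling code without spending tokens, while production code would pass an implementation backed by the real service.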
Development context
The entire SDK was built using Claude Code, from initial design through implementation, tests, and documentation. The creator is seeking feedback on API design and what would make this useful for developer workflows.
📖 Read the full source: r/ClaudeAI