Creation OS: A Local σ-Gated LLM Runtime That Lets Models Say ‘I Don’t Know’ Instead of Hallucinating

Creation OS is a local-first AI runtime that wraps local LLMs with a σ-gate — a measurement layer that scores each output across multiple uncertainty channels and decides ACCEPT, RETHINK, or ABSTAIN. The goal is to let local models refuse answers when uncertain instead of hallucinating.
Key Features and Setup
- Supports BitNet b1.58 2B-4T, Qwen3-8B Q4_K_M, Gemma 3 4B, and any GGUF model.
- Runs on a MacBook Air M4 (8 GB) as the primary machine: no cloud, no API, nothing leaves the device.
- Install: git clone https://github.com/spektre-labs/creation-os then cd creation-os && bash scripts/quickstart.sh
- Full path with local weights: ./scripts/install.sh then ./cos chat
σ-Gate Measurements
The gate combines logprob, entropy, perplexity, consistency, semantic σ, conformal τ, session coherence, and meta-cognitive channels into a single verdict:
- ACCEPT → show answer
- RETHINK → regenerate
- ABSTAIN → refuse
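The three-way verdict above can be illustrated with a minimal sketch. The source does not say how Creation OS combines its channels or which thresholds it uses, so the plain average, the `ACCEPT_BELOW` and `ABSTAIN_ABOVE` values, and the retry budget below are all assumptions chosen only to show the shape of such a gate.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the source does not state the values Creation OS uses.
ACCEPT_BELOW = 0.30
ABSTAIN_ABOVE = 0.70


@dataclass
class Verdict:
    action: str            # "ACCEPT", "RETHINK", or "ABSTAIN"
    sigma_combined: float  # combined uncertainty score in [0, 1]


def combine(channels: dict[str, float]) -> float:
    """Average per-channel uncertainty scores (an assumed combination rule)."""
    if not channels:
        raise ValueError("at least one channel score is required")
    return sum(channels.values()) / len(channels)


def gate(channels: dict[str, float], retries_left: int) -> Verdict:
    """Map the combined score to one of the three actions."""
    sigma = combine(channels)
    if sigma < ACCEPT_BELOW:
        return Verdict("ACCEPT", sigma)
    # Refuse when uncertainty is high or no regeneration attempts remain.
    if sigma > ABSTAIN_ABOVE or retries_left == 0:
        return Verdict("ABSTAIN", sigma)
    return Verdict("RETHINK", sigma)


if __name__ == "__main__":
    scores = {"logprob": 0.10, "entropy": 0.20, "consistency": 0.15}
    print(gate(scores, retries_left=2))
```

A caller would regenerate on RETHINK and decrement `retries_left`, so a persistently uncertain answer eventually ends in ABSTAIN rather than looping forever.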
Benchmark Results
TruthfulQA (same prompts and seeds):
| Mode        | Accuracy | Coverage |
|-------------|----------|----------|
| BitNet only | 0.261    | 0.136    |
| σ-pipeline  | 0.336    | 0.171    |
That is a +28.7% relative accuracy gain (0.261 → 0.336) from selective regeneration on uncertain rows. LSD probe AUROC: 0.982 on TruthfulQA holdout, 0.960 on TriviaQA. ECE: 0.043. Wrong+confident: 0. Conformal bound: P(error | ACCEPT) ≤ α at α=0.80.
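For readers who want to check the arithmetic, the sketch below reproduces the relative gain from the two table values and shows the standard binned definition of expected calibration error (ECE). It is not the project's evaluation code: the function names and the 10-bin choice are assumptions, and the toy confidences are made up purely for illustration.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: bin-size-weighted mean of |accuracy - mean confidence|."""
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Accuracy values taken from the TruthfulQA table.
    print(f"{relative_gain(0.261, 0.336):.1%}")
    # Toy data, not from the benchmark.
    print(expected_calibration_error([0.9, 0.9, 0.1, 0.1], [True, True, False, False]))
```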
Negative results documented: σ is not dominant on HellaSwag or MMLU. Full details in CLAIM_DISCIPLINE.md.
Formal Verification
Lean 4: 6/6 sorry-free. Frama-C WP: 15/15 tier-1 discharged.
Example Command
./cos chat --once --prompt "What is 2+2?" --multi-sigma --verbose yields output like σ_peak=0.06 action=ACCEPT route=LOCAL σ_combined=0.184 conformal@α=0.80.
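If you want to act on that status line in a script, a small parser is enough. The sketch below assumes the output is a single line of space-separated key=value tokens, exactly as in the example; the real `--verbose` output may contain more than this, and `parse_status` is a hypothetical helper, not part of Creation OS.

```python
def parse_status(line: str) -> dict[str, str]:
    """Split a line of space-separated key=value tokens into a dict."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that contain no '='
            fields[key] = value
    return fields


# Example line copied from the article; the actual CLI format is an assumption.
line = "σ_peak=0.06 action=ACCEPT route=LOCAL σ_combined=0.184 conformal@α=0.80"
status = parse_status(line)

if status.get("action") == "ACCEPT":
    print("answer accepted, combined sigma:", float(status["σ_combined"]))
```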
MCP Integration
Run python3 -m cos.mcp_sigma_server to expose σ on every response to any MCP-compatible client.
Limitations
σ is not a universal hallucination detector — strongest on factual QA; long-form needs more evaluation. Local model quality still depends on the base model.
📖 Read the full source: r/LocalLLaMA