Off Grid Mobile App Adds On-Device AI Tool Use with 3x Speed Improvement

Off Grid, an on-device AI mobile app, has been updated to add tool use capabilities and significant performance improvements. The app now allows AI models to call tools offline without requiring API keys, servers, or cloud functions.
Key Features and Performance
The update introduces automatic tool loops for web search, calculator, date/time functions, and device information access. According to the developer, this bridges the gap between "local toy" and "useful assistant" by enabling 3B parameter models to reason, call tools, and synthesize results directly on your phone.
Performance improvements come from configurable KV cache options. Users can now choose between three KV cache types:
f16q8_0q4_0
With q4_0 cache, models that previously generated 10 tokens/second now reach 30 tokens/second. The app includes a performance nudge feature that suggests faster settings after the first generation.
Model Support and Platform Availability
Off Grid supports GGUF format models, including:
- Qwen 3
- Llama 3.2
- Gemma 3
- Phi-4
- Other GGUF-compatible models
The app is now available on both major app stores without sideloading requirements. It can be installed directly from the App Store and Google Play.
Core Functionality and Philosophy
What hasn't changed in this update:
- MIT licensed and fully open source
- Zero data leaves the device (no analytics, telemetry, or anonymous usage data)
- Offline capabilities including text generation (15-30 tokens/second), image generation (5-10 seconds on NPU), vision AI, voice transcription, and document analysis
The developer states the project is motivated by the belief that "the phone in your pocket should be the most private computer you own — not the most surveilled."
📖 Read the full source: HN AI Agents
👀 See Also

Jobly: Contract Marketplace with AI-First Dispute Resolution and Community Voting
Jobly is a contract marketplace built with Next.js 14, TypeScript, and Supabase, featuring an escrow system with 10% provider bonds on proposals and a dispute pipeline that starts with AI evaluation using Claude, then allows appeals to community stake voting.

Interactive Mind Map Visualizes Claude Tool Ecosystem
A developer created an interactive HTML mind map using D3.js to track features across Claude's Chat, Cowork, and Code tools, including platform availability, pricing differences, and connector compatibility.

Claude Code at Scale: How Agentic Search Avoids RAG Failure Modes in Large Codebases
Claude Code uses agentic file-system traversal instead of embedding-based RAG, eliminating stale index issues. The article details five extension points (CLAUDE.md, hooks, skills, plugins, MCP) and the harness-as-model philosophy for multi-million-line repos.

Fixing Context Bloat in Claude Code Auto-Memory with a Naming Schema and Audit Script
A Claude Code skill enforces a 3-type naming schema, required frontmatter, and a bash audit script to deduplicate memory files and reduce context load.