AI Subroutines: Deterministic Browser Automation with Zero Token Cost

What AI Subroutines Do
AI Subroutines record a browser task once and save it as a callable tool that replays with zero token cost, zero LLM inference delay, and fully deterministic results. The generated script executes inside the webpage itself, not through a proxy, a headless worker, or any other out-of-process mechanism.
Key Architectural Decision
Because the script runs in the webpage's own execution context, its requests automatically carry the page's authentication, CSRF tokens, TLS session, and signed headers. There is no certificate to install, no TLS fingerprint to modify, and no separate auth stack to maintain.
Recording Mechanism
During recording, the extension intercepts network requests using two layers:
- A MAIN-world patch of fetch and XMLHttpRequest, installed before any page script runs
- Chrome's webRequest API, used as a correlated fallback for CORS and service-worker paths
Request bodies of every kind are captured (FormData, Blob, and raw bytes), not just JSON.
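As a rough sketch of the first layer, a fetch patch can wrap the original function, record the method, URL, and kind of body, and then delegate unchanged. The names `installFetchRecorder`, `CapturedRequest`, and `describeBody` are illustrative assumptions and are not taken from the extension; the minimal `FetchLike` type stands in for the browser's real fetch so the sketch also runs outside a page.

```typescript
// Hypothetical sketch: wrap a fetch-like function and record each call.
// All names here are illustrative assumptions, not the extension's real API.

type FetchInit = { method?: string; body?: unknown };
type FetchLike = (url: string, init?: FetchInit) => Promise<unknown>;

interface CapturedRequest {
  url: string;
  method: string;
  bodyKind: string; // "none", "text", "bytes", or "other"
  timestamp: number;
}

// Classify the body without assuming it is JSON.
function describeBody(body: unknown): string {
  if (body === undefined || body === null) return "none";
  if (typeof body === "string") return "text";
  if (body instanceof Uint8Array || body instanceof ArrayBuffer) return "bytes";
  return "other"; // FormData, Blob, streams, and so on in a real browser
}

function installFetchRecorder(
  target: { fetch: FetchLike },
  log: CapturedRequest[],
): () => void {
  const original = target.fetch;
  target.fetch = (url: string, init?: FetchInit) => {
    log.push({
      url,
      method: (init?.method ?? "GET").toUpperCase(),
      bodyKind: describeBody(init?.body),
      timestamp: Date.now(),
    });
    // Delegate to the original so page behavior is unchanged.
    return original.call(target, url, init);
  };
  // Return an uninstall function that restores the original.
  return () => {
    target.fetch = original;
  };
}
```

In a real extension the target would be the page's `window` and the patch would have to be injected before any page script runs; an XMLHttpRequest patch would follow the same wrap-and-delegate shape.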
Network Capture Processing
The system scores each captured request and trims roughly 300 of them down to about 5, using several signals:
- Origin: first-party +20; third-party −15
- Known telemetry hosts (Sentry, Segment, Hotjar, RUM): −80
- Temporal correlation to DOM events: +28 within 800 ms; +16 within 2.5 s
- Method and payload shape: mutating POST/PUT/PATCH/DELETE +35; GET +5; request body present +8
- Response quality: 2xx +12; 4xx and above −25; non-empty body +4
- Volatile operation identifiers (GraphQL queryId, doc_id, operationHash): −18
When a request depends on a volatile GraphQL operation ID, the system falls back to DOM-only actions instead of replaying a call that would break silently on a later run.
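The weights above can be combined into a single additive score. The following is a minimal sketch under stated assumptions: the `RecordedRequest` fields, the `isTelemetryHost` flag, and the `rankTop` helper are hypothetical, the 800 ms and 2.5 s windows are treated as inclusive, and methods the list does not mention score zero for the method signal.

```typescript
// Illustrative scoring sketch using the weights listed above.
// Field and function names are assumptions, not the extension's schema.

interface RecordedRequest {
  url: string;
  method: string;
  firstParty: boolean;
  isTelemetryHost: boolean; // Sentry, Segment, Hotjar, RUM, ...
  hasBody: boolean;
  status: number;
  responseEmpty: boolean;
  msSinceDomEvent: number | null; // null when no DOM event correlates
  payloadText: string; // URL plus body text, searched for volatile IDs
}

const MUTATING_METHODS = ["POST", "PUT", "PATCH", "DELETE"];
const VOLATILE_KEYS = ["queryId", "doc_id", "operationHash"];

function scoreRequest(r: RecordedRequest): number {
  let score = r.firstParty ? 20 : -15;
  if (r.isTelemetryHost) score -= 80;

  // Temporal correlation to the nearest DOM event.
  if (r.msSinceDomEvent !== null) {
    if (r.msSinceDomEvent <= 800) score += 28;
    else if (r.msSinceDomEvent <= 2500) score += 16;
  }

  // Method and payload shape.
  const method = r.method.toUpperCase();
  if (MUTATING_METHODS.indexOf(method) !== -1) score += 35;
  else if (method === "GET") score += 5;
  if (r.hasBody) score += 8;

  // Response quality.
  if (r.status >= 200 && r.status < 300) score += 12;
  else if (r.status >= 400) score -= 25;
  if (!r.responseEmpty) score += 4;

  // Volatile GraphQL operation identifiers.
  if (VOLATILE_KEYS.some((key) => r.payloadText.indexOf(key) !== -1)) score -= 18;

  return score;
}

// Keep the n highest-scoring requests (the article mentions about five).
function rankTop(requests: RecordedRequest[], n: number = 5): RecordedRequest[] {
  return [...requests]
    .sort((a, b) => scoreRequest(b) - scoreRequest(a))
    .slice(0, n);
}
```

Under these assumptions a first-party POST with a body, fired 500 ms after a click and answered with a non-empty 200, scores 20 + 28 + 35 + 8 + 12 + 4 = 107, while a third-party telemetry GET with an empty 204 response scores −15 − 80 + 5 + 12 = −78.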
Generated Code Structure
The generated code combines network calls with DOM actions (click, type, find) in the same function, using an rtrvr.* helper namespace. The five top-ranked requests and the recorded DOM interactions are rendered into a 12,000-character context for the generator.
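One simple way to keep a rendered context inside a fixed character budget is to append lines until the next one would overflow. This is only a sketch of that idea: the source does not say how the context is laid out or truncated, and `buildGeneratorContext`, the section labels, and the drop-the-tail policy are all assumptions.

```typescript
// Hypothetical sketch: render request summaries and DOM actions into a
// context string that never exceeds a character budget.
// The layout and truncation policy are assumptions, not the real format.

const CONTEXT_BUDGET = 12000;

function buildGeneratorContext(
  requestSummaries: string[],
  domActions: string[],
  budget: number = CONTEXT_BUDGET,
): string {
  const lines = ["REQUESTS", ...requestSummaries, "DOM ACTIONS", ...domActions];
  let out = "";
  for (const line of lines) {
    const next = out.length === 0 ? line : out + "\n" + line;
    if (next.length > budget) break; // drop everything past the budget
    out = next;
  }
  return out;
}
```

Putting the ranked requests first means that, under this policy, the highest-value signals survive when the budget is tight.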
Usage Pattern
Point an AI agent at a 500-row spreadsheet: a single LLM call assigns the parameters, and 500 Subroutine runs are kicked off.
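The pattern can be pictured as one column-to-parameter mapping, produced once, followed by a purely mechanical loop over the rows. In the sketch below the mapping is hard-coded where the single LLM call's output would go; `bindParams`, `runForAllRows`, and the `subroutine` callback are hypothetical names, and running every row concurrently is an assumption rather than documented behavior.

```typescript
// Hypothetical sketch of the fan-out: one mapping, many deterministic runs.
// Names and the concurrency strategy are assumptions for illustration.

type Row = Record<string, string>;
type Params = Record<string, string>;
type ParamMapping = Record<string, string>; // parameter name -> column name

// Build one parameter set from a row using the shared mapping.
function bindParams(row: Row, mapping: ParamMapping): Params {
  const params: Params = {};
  for (const param of Object.keys(mapping)) {
    const column = mapping[param] as string;
    const value = row[column];
    if (value === undefined) {
      throw new Error('Row is missing column "' + column + '"');
    }
    params[param] = value;
  }
  return params;
}

// Invoke the recorded subroutine once per row; no LLM call happens here.
function runForAllRows<T>(
  rows: Row[],
  mapping: ParamMapping,
  subroutine: (params: Params) => Promise<T>,
): Promise<T[]> {
  return Promise.all(rows.map((row) => subroutine(bindParams(row, mapping))));
}
```

For example, a mapping of `{ recipient: "Handle", message: "Note" }` turns the row `{ Handle: "@ada", Note: "hello" }` into `{ recipient: "@ada", message: "hello" }` before the subroutine runs.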
Key Use Cases
- Record sending an Instagram DM once, then reuse the routine to send DMs at zero token cost
- Create a routine that fetches the latest products in a site catalog, then call it to retrieve thousands of products via direct GraphQL queries
- Set up a routine that files EHR forms from parameters, with the AI inferring those parameters from the current page context
- Reuse routines daily to sync outbound messages from LinkedIn, Slack, and Gmail to a CRM through an MCP server
Why This Matters
The fundamental problem with browser agents on repetitive tasks is that they go through the inference loop on every run, which is unnecessary. Recording the task once and having the LLM generate a script that uses every available interaction method (direct API calls, DOM interactions, third-party tools, APIs, and MCP servers) yields deterministic, cost-effective automation.
📖 Read the full source: HN LLM Tools