Developer shares token cost challenge with Claude-built ERP system

The problem: Single-file architecture doesn't scale with AI assistants
A developer running a small freight forwarding business built a complete ERP system using Claude. The system grew to over 3,000 lines of code in a single HTML file containing all modules: dashboard, shipment tracking, cash flow, driver logs, and customer records.
The core issue: every change, however small, requires loading the entire 3,000+ line file into Claude's context window, which consumes approximately 60,000-80,000 tokens per message. For a solo operator, that makes routine edits both expensive and inefficient.
The root cause is architectural: a single-file monolith forces Claude to re-read all 3,000+ lines of mixed HTML, CSS, and JavaScript on every request, even when the change touches only one small function.
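To make the scale concrete, here is a rough back-of-the-envelope sketch in Python. The line count and token range come from the post; the 400-line module size is a hypothetical figure chosen only for illustration, and the calculation assumes token load scales linearly with the number of lines sent, which ignores conversation history and other per-message overhead.

```python
# Figures reported in the post; everything else here is illustrative.
FILE_LINES = 3_000
TOKENS_PER_MESSAGE = (60_000, 80_000)  # low and high estimates from the post


def tokens_per_line(total_tokens: int, total_lines: int) -> float:
    """Average tokens each source line contributes to one message."""
    return total_tokens / total_lines


def estimated_tokens(module_lines: int, per_line: float) -> int:
    """Rough token load if only `module_lines` lines are sent (linear assumption)."""
    return round(module_lines * per_line)


# Hypothetical module size, not taken from the post.
MODULE_LINES = 400

low, high = (tokens_per_line(t, FILE_LINES) for t in TOKENS_PER_MESSAGE)
print(f"Implied tokens per line: {low:.1f} to {high:.1f}")
print(
    f"A {MODULE_LINES}-line module: roughly "
    f"{estimated_tokens(MODULE_LINES, low):,} to "
    f"{estimated_tokens(MODULE_LINES, high):,} tokens per message"
)
```

Under these assumptions the reported range works out to roughly 20 to 27 tokens per line, so sending a single module instead of the whole file would cut the per-message load in proportion to the module's size.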
Potential solutions under consideration
The developer is evaluating two approaches:
- Split the file into modules: separate JavaScript files per feature, so that only the code relevant to a given change is loaded in each session
- Migrate to Firebase: this was already on their roadmap, and the developer expects it would push the project toward a more modular architecture
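As a possible first step toward the module split, the sketch below pulls the inline `<script>` and `<style>` blocks out of a single HTML file using Python's standard `html.parser`. It is a minimal illustration, not the developer's method: the class and function names and the sample markup are hypothetical, and deciding which extracted code belongs to which feature module would still be manual work.

```python
from html.parser import HTMLParser


class InlineBlockExtractor(HTMLParser):
    """Collects the text of inline <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # list of (tag, content) pairs
        self._current = None  # tag currently being captured, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # Skip external scripts (<script src="...">); they are already separate files.
        if tag in ("script", "style") and "src" not in dict(attrs):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.blocks.append((tag, "".join(self._buffer).strip()))
            self._current = None


def split_inline_blocks(html_text):
    """Return (tag, content) pairs for each inline script or style block."""
    parser = InlineBlockExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.blocks


# Hypothetical stand-in for the real single-file ERP page.
sample = """<html><head><style>body { margin: 0; }</style></head>
<body><script src="vendor.js"></script>
<script>function trackShipment(id) { return id; }</script></body></html>"""

for tag, content in split_inline_blocks(sample):
    print(tag, len(content), "characters")
```

Each extracted block could then be written to its own file and divided further by feature (dashboard, shipment tracking, cash flow, and so on), so that a session only needs the file it is actually changing.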
They're asking the community for advice on managing large codebases with Claude or other LLMs, specifically how to structure projects to keep token costs reasonable.
📖 Read the full source: r/ClaudeAI