Loading Every MCP Server on Every Prompt Quietly Destroys Token Budget

A post on r/ClaudeAI reports a subtle but costly issue: when multiple MCP servers are configured, every prompt loads all of them by default, even trivial queries. The user had 5–6 servers and didn't notice until checking token usage—prompts were burning tokens on loading irrelevant server definitions every single time.
Key Details
- Every prompt loaded the full set of MCP servers (5–6 servers).
- Even simple prompts (e.g., "What time is it?") triggered all server definitions.
- Solution: a custom routing layer that selects only the servers relevant to the prompt.
- Result: token usage dropped significantly, and prompt response times improved.
- The OP admitted they "cannot believe they let it go on that long without checking."
Technical Context
MCP (Model Context Protocol) servers are tools that extend Claude's capabilities (e.g., file system access, database queries, web scraping). The default behavior in many setups—including forked clients and manual configs—is to send the entire list of server definitions with each message. This means tools for DB access, file I/O, web browsing, etc. are all dumped into the context window before the actual user input is processed.
A routing layer can inspect the user's message (or system prompt) and conditionally include only the MCP servers whose descriptions or tools match the intent. For example, a prompt mentioning a file path would activate file tools; a question about stock prices would load only the finance server. This avoids the token overhead of irrelevant server metadata.
Who This Is For
Developers running Claude with multiple MCP servers, especially in automated pipelines or custom frontends where token efficiency matters.
📖 Read the full source: r/ClaudeAI
👀 See Also

Claude Code /insights command provides debugging and autonomous task tips
A Reddit user shares two practical techniques for using Claude Code's /insights command: asking for at least three potential root causes when debugging bugs, and using comprehensive task specifications with --dangerously-skip-permissions for autonomous runs.

Run Claude Code in VSCode/Cursor Integrated Terminal for Better Workflow
Running Claude Code in the VSCode or Cursor integrated terminal instead of an external terminal provides immediate access to git diff panels and debuggers without switching windows, with no configuration required.

Don't Just Paste the AI — Write Your Own Take
A direct plea to developers: stop copying AI chatbot answers verbatim. Use AI as a drafting partner, then rewrite the reply in your own words.

How Claude Project Instructions Are Injected — And Why Changing Them Mid-Conversation Breaks History
Project Instructions and User Preferences are loaded into the system prompt at conversation start, not re-injected every turn. Changing them mid-conversation causes Claude to overwrite its memory of past instructions, leading to false recollections.