7 MCP Gateway Bugs: Session Leaks, Dead SSE, and OAuth in Gateway Mode

After the happy path demos, a Reddit user hit seven specific bugs when putting an MCP gateway between real clients and servers. The fixes were not prompt engineering — they were explicit session boundaries, per-tool timeouts, idempotency, structured action logs, gateway-level traces, and tests against concurrent tool calls. The result was a large reduction in parallel tool wall time, but the bigger win was knowing where failure lived.
The seven bugs that actually mattered
- Session state leaking across clients — shared state between sessions caused data contamination.
- SSE connections dying silently — no error surfaced when a server-sent event connection dropped.
- OAuth flows working in local tests but breaking in gateway mode — redirect URIs or token validation failed behind the proxy.
- Discovery probes returning stale server metadata — cached capabilities didn't reflect server updates.
- SQLite writes blocking parallel tool calls — database locks serialized concurrent requests.
- Retry logic duplicating tool side effects — retries re-executed mutations like writes or API calls.
- Tool latency hiding inside the gateway instead of the model call — monitoring attributed time to the wrong layer.
The fix: boring infra, not better prompts
The author's approach to each bug:
- Explicit session boundaries — separate state per client, no shared objects.
- Per-tool timeout policy — individual timeouts to prevent one slow tool holding up others.
- Idempotency where possible — deduplication keys or transactional behavior to make retries safe.
- Structured action logs — detailed, parseable logs of every gateway action for debugging.
- Gateway-level traces — distributed tracing to attribute latency correctly across layers.
- Tests against concurrent tool calls — integration tests that fire parallel requests to surface race conditions.
These are specific, practical patterns for anyone running an MCP gateway in production. The post's key insight: the hard problems are state isolation, silent failures, and observability — not model prompts.
📖 Read the full source: r/ClaudeAI
👀 See Also

Cost-Effective OpenClaw Automation: Using LLMs Only When Needed
A developer shares a practical approach to using OpenClaw for deterministic tasks without constant LLM calls, creating Python scripts for cron jobs and only invoking the LLM when errors require analysis and fixes.

How to Set Up an AI-Powered Morning Briefing

Fixing Claude's Time Hallucinations in Claude Code with Hooks
A user discovered that Claude Code lacks real-time clock access, causing it to incorrectly suggest actions like 'get some rest' at inappropriate times. The fix involves adding a one-line hook to ~/.claude/settings.json that injects the current time into Claude's context on every message.

Custom PostToolUse Hook for On-Demand CLAUDE.md Loading Outside Project Tree
A developer shares a custom PostToolUse hook solution that enables Claude Code to read CLAUDE.md files from directories outside the current project tree on-demand, addressing limitations in the built-in loading behavior.