Qwen 3.5 Chat Template Release with 21 Bug Fixes for Agent Workflows

A developer has released a patched chat template for Qwen 3.5 models, fixing 21 bugs encountered during agentic workflows. This is a drop-in replacement for the official template, requiring only a swap of the chat_template.jinja file.
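Swapping the file by hand is simple, but keeping a backup of the official template makes it easy to roll back. The following is a minimal sketch, assuming the model directory already contains a chat_template.jinja; the function name and the .bak suffix are hypothetical and not part of the release.

```python
import shutil
from pathlib import Path


def swap_chat_template(model_dir: Path, patched_template: Path) -> Path:
    """Back up the existing chat_template.jinja, then copy the patched file over it.

    Hypothetical helper: the release itself only asks you to replace the file.
    """
    target = model_dir / "chat_template.jinja"
    backup = model_dir / "chat_template.jinja.bak"  # assumed backup name

    # Keep the first backup only, so repeated runs never overwrite the original.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)

    shutil.copy2(patched_template, target)
    return target
```

Restoring the official template is then a matter of copying the .bak file back over chat_template.jinja.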
Key Fixes
The developer ran Qwen 3.5 35B in agentic workflows and addressed the following major issues:
- Tool Calling Crash: Fixed a crash related to arguments | items (referenced as HF discussion #4).
- Tool/Think Block Leak: <tool_call> content no longer leaks into <think> blocks, with auto-disable thinking when tools are active.
- Parallel Tool Calls: Calls are now properly separated with \n\n delimiters.
- Deep Agent Loops: Prevents crashes after 5+ tool hops.
- Unknown Role Handling: Roles like 'planner' and 'critic' now gracefully fall back instead of causing a crash.
- Streaming Parsers: Provides clean XML boundaries for streaming.
- Configurable Truncation: Allows setting a maximum character limit for large tool arguments and responses.
- Developer Role Support: Adds support for the developer role sent by clients such as Claude Code, Codex, and OpenCode.
A full list of all 21 fixes is available in the project's README.
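Three of the behaviours listed above (unknown-role fallback, configurable truncation, and \n\n-separated parallel tool calls) can be pictured in plain Python. This is only an illustrative sketch, not the template's Jinja code: the set of known roles, the choice of "user" as the fallback, the truncation marker, and the JSON-inside-<tool_call> layout are all assumptions made for the example.

```python
import json

# Assumed set of roles the template understands natively.
KNOWN_ROLES = {"system", "user", "assistant", "tool", "developer"}


def normalize_role(role: str, fallback: str = "user") -> str:
    """Map unknown roles (e.g. 'planner', 'critic') to a fallback instead of raising."""
    return role if role in KNOWN_ROLES else fallback


def truncate(text: str, max_chars: int) -> str:
    """Cut text to max_chars characters; the marker text is a made-up convention."""
    if max_chars <= 0 or len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[truncated]"


def render_tool_calls(calls: list[dict]) -> str:
    """Render each call as its own <tool_call> block, separated by a blank line."""
    blocks = []
    for call in calls:
        payload = json.dumps({"name": call["name"], "arguments": call["arguments"]})
        blocks.append(f"<tool_call>\n{payload}\n</tool_call>")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(normalize_role("planner"))
    print(truncate("x" * 20, 8))
    print(render_tool_calls([
        {"name": "read_file", "arguments": {"path": "a.txt"}},
        {"name": "read_file", "arguments": {"path": "b.txt"}},
    ]))
```

Keeping each call in its own block with a fixed delimiter is what lets a streaming parser find the boundary between two calls without buffering the whole response.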
Configuration
The template includes configurable variables. They can be set via command-line arguments:
--chat-template-kwargs '{"enable_thinking":true,"auto_disable_thinking_with_tools":true,"max_tool_response_chars":8192}'
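Hand-writing JSON inside shell quotes is error-prone, so the argument can also be generated. The sketch below builds the same flag from a Python dict using only the standard library; the helper name build_flag is hypothetical, and which server binary accepts the flag depends on your platform.

```python
import json
import shlex

# The three variables shown in the command-line example above.
template_kwargs = {
    "enable_thinking": True,
    "auto_disable_thinking_with_tools": True,
    "max_tool_response_chars": 8192,
}


def build_flag(kwargs: dict) -> str:
    """Serialize kwargs to compact JSON and quote it safely for a POSIX shell."""
    payload = json.dumps(kwargs, separators=(",", ":"))
    return f"--chat-template-kwargs {shlex.quote(payload)}"


if __name__ == "__main__":
    print(build_flag(template_kwargs))
```

Python's True is serialized as JSON true, and shlex.quote adds the single quotes, so the printed flag matches the one written out by hand.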
Compatibility & Testing
The template has been tested on the following platforms with the specified minimum versions:
- llama.cpp (b4242+)
- Open WebUI (v0.4.8+)
- vLLM (v0.6.4+)
- Ollama (v0.5.0+)
- LM Studio (v0.3.5+)
- Text Generation WebUI
It is compatible with all Qwen 3.5 models (35B, 27B, 14B, 9B, 4B, and the Coder series) and is backward-compatible with Qwen3 32B.
Source and License
The template is available for download on HuggingFace at barubary/qwen3.5-barubary-attuned-chat-template. It is released under the Apache 2.0 license, and the developer welcomes feedback and bug reports.
📖 Read the full source: r/LocalLLaMA