GGUF Model Merging Script and Workflow for Qwen3.5-35B Variants

✍️ OpenClawRadar📅 Published: April 1, 2026🔗 Source

A Reddit user has shared a Python script and workflow for merging GGUF model files with minimal loss, specifically targeting Qwen3.5-35B variants. The approach combines two existing models: HauhauCS's Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive and samuelcardillo's Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GGUF.

Technical Details

The merged model is available as a Q4_0 quantized version at Hugging Face. According to the source, samuelcardillo's finetune outperforms Jackrong's version for Qwen 3.5 35B.

Merging Workflow

The Python script (available on Pastebin) was "vibecoded via Claude Opus 4.6" and supports:

Merging GGUF files on Google Colab Free Tier
Quantization via llama-quantize
Q4_K_M quantization for 35B models
Q8 quantization for 8B models

The author notes they can't create Q8_0 or F16 quantized versions due to disk space limitations on Google Colab Free tier, but suggests others can tweak the script via Claude Opus for those quantizations.

Optimal Settings

For best performance in LM Studio, use these parameters:

Temperature: 0.7
Top K Sampling: 20
Presence Penalty: 1.5
Top P Sampling: 0.8
Min P Sampling: 0
Seed: 3407 or 42

The system prompt (full version on Pastebin) should include this first line: "You are Qwen, created by Alibaba Cloud. You are a helpful assistant." The author notes the model underperforms without this line.

📖 Read the full source: r/LocalLLaMA

👀 See Also

Tools

Reverse-engineering UniFi inform protocol for multi-tenant routing

The UniFi inform protocol sends device data to controllers via HTTP POST on port 8080 every 10 seconds. The first 40 bytes of each packet contain unencrypted device MAC addresses, enabling routing without decryption.

Mar 9, 2026, 05:45 PM UTC

OpenClawRadar

Tools

onWatch: Open-source local API quota tracker with SQLite storage

onWatch is a local-first API quota tracker that stores all data in a local SQLite database with no cloud service, telemetry, or account creation. It's a single binary (~13MB) that runs as a background daemon using <50MB RAM and serves a dashboard on localhost.

Apr 13, 2026, 08:00 AM UTC

OpenClawRadar

Tools

HolyCode: Docker Container for Persistent Claude AI Coding Environments

HolyCode is a Docker container that maintains AI coding environment state across machine switches and rebuilds. It includes 30+ preinstalled tools, browser automation with Chromium + xvfb + Playwright, and preserves context in ./data/opencode.

Apr 14, 2026, 12:45 AM UTC

OpenClawRadar

Tools

Claude-First Analytics MCP Server: Giving AI Agents Direct Access to Web Analytics Context

A developer rebuilt their web analytics tool as an MCP server, exposing simple web analytics, trackable links, and product insight tools directly to Claude, enabling AI agents to leverage site data alongside code and database context.

May 15, 2026, 12:18 PM UTC

OpenClawRadar