Kimi K2.6 vs Claude Opus 4.7: Hands-On Test with a Minetest Bounty Board Mod

A hands-on comparison of two models on an unusual coding task: building a Minetest/Luanti bounty board game mod with a TypeScript backend, then extending it with Google Sheets logging through Composio. Both models received the same prompts. Details are from the source post.
Setup
- Claude Opus 4.7: via Claude Code
- Kimi K2.6: via OpenCode on OpenRouter
- Task: player joins world, runs /bounty, gets task, completes it, gets reward, backend records completion. Second test: log completions to Google Sheets via Composio.
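To make the task flow above concrete, here is a minimal in-memory sketch of the state a backend for it might track. It is not code from either model's output: the type names, function names, and round-robin assignment are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical model of the /bounty flow; none of these names come from the source post.
type Bounty = { id: string; description: string; reward: number };
type Completion = { player: string; bountyId: string; reward: number };

interface BoardState {
  bounties: Bounty[];
  active: Map<string, Bounty>; // player name -> currently assigned bounty
  completions: Completion[];
}

function createBoard(bounties: Bounty[]): BoardState {
  if (bounties.length === 0) throw new Error("need at least one bounty");
  return { bounties, active: new Map<string, Bounty>(), completions: [] };
}

// Player runs /bounty: return the existing task, or assign the next one round-robin.
function assignBounty(board: BoardState, player: string): Bounty {
  const existing = board.active.get(player);
  if (existing) return existing;
  const next = board.bounties[board.completions.length % board.bounties.length]!;
  board.active.set(player, next);
  return next;
}

// Player finishes the task: record the completion and clear the assignment.
function completeBounty(board: BoardState, player: string): Completion | null {
  const bounty = board.active.get(player);
  if (!bounty) return null; // nothing assigned, nothing to record
  board.active.delete(player);
  const completion: Completion = { player, bountyId: bounty.id, reward: bounty.reward };
  board.completions.push(completion);
  return completion;
}

// Total rewards per player, highest first.
function leaderboard(board: BoardState): Array<{ player: string; total: number }> {
  const totals = new Map<string, number>();
  for (const c of board.completions) {
    totals.set(c.player, (totals.get(c.player) ?? 0) + c.reward);
  }
  return Array.from(totals, ([player, total]) => ({ player, total }))
    .sort((a, b) => b.total - a.total);
}
```

In a real setup the Lua mod would call the backend over HTTP at the assign and complete steps; this sketch only covers the bookkeeping.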
Pricing
- Opus 4.7: $5/M input, $25/M output
- Kimi K2.6: $0.95/M input, $4/M output (cached input $0.16/M)
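The listed rates can be turned into a rough per-run estimate. The sketch below is only arithmetic on the prices above; the interface names and the sample workload are made up for illustration. The post gives no cached-input rate for Opus 4.7, so the code bills cached tokens at the full input rate in that case, which is an assumption and likely an overestimate. For that reason it does not try to reproduce the reported run totals.

```typescript
// Prices in USD per million tokens, taken from the list above.
interface Rates { inputPerM: number; outputPerM: number; cachedInputPerM?: number }
interface Usage { inputTokens: number; outputTokens: number; cachedInputTokens?: number }

const OPUS_4_7: Rates = { inputPerM: 5, outputPerM: 25 };
const KIMI_K2_6: Rates = { inputPerM: 0.95, outputPerM: 4, cachedInputPerM: 0.16 };

function estimateCostUsd(rates: Rates, usage: Usage): number {
  // Assumption: when no cached rate is listed, cached tokens cost the full input rate.
  const cachedRate = rates.cachedInputPerM ?? rates.inputPerM;
  const cachedTokens = usage.cachedInputTokens ?? 0;
  const total =
    usage.inputTokens * rates.inputPerM +
    cachedTokens * cachedRate +
    usage.outputTokens * rates.outputPerM;
  return total / 1_000_000;
}

// Hypothetical workload: 1M fresh input tokens and 100k output tokens.
const workload: Usage = { inputTokens: 1_000_000, outputTokens: 100_000 };
console.log("Opus 4.7:", estimateCostUsd(OPUS_4_7, workload).toFixed(2));
console.log("Kimi K2.6:", estimateCostUsd(KIMI_K2_6, workload).toFixed(2));
```

On uncached tokens alone, the list prices put Opus at roughly 5× Kimi for input and about 6× for output.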
Test 1: Local Bounty Board
Opus 4.7: Clean MVP. Express/Zod/Vitest backend, Lua mod, /bounty flow, rewards, leaderboard, tests passed. Stats:
- Cost: ~$3.59
- Time: 12min API, 23min wall
- Code: +1,688 / -0
- Output tokens: 54.8k
- Cache read: 2.8M
Kimi K2.6: Got the local board working too, but the result was messier. Stats:
- Cost: ~$0.39
- Time: ~9min 27sec
- Code: +4,671 / -0 (about 2.8× Opus's 1,688 lines)
The annoying part was the Minetest config. Kimi wrote secure.http_mods = bountykimi in the global config, but created a world-level config with a different mod name, so the HTTP API was not enabled for the mod that was actually running. It took the tester 30+ minutes to debug.
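A mismatch like this can be caught with a small check before launching the server. The sketch below assumes the world-level config is a world.mt file that enables mods through load_mod_<name> entries; the post does not show the actual files, and the mod name bounty_board is hypothetical. All function names are illustrative.

```typescript
// Parse "key = value" lines, skipping blank lines and "#" comments.
function parseConf(text: string): Map<string, string> {
  const out = new Map<string, string>();
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    out.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return out;
}

// Mod names listed in secure.http_mods (comma-separated) in the global config.
function httpAllowlist(globalConf: string): string[] {
  const raw = parseConf(globalConf).get("secure.http_mods") ?? "";
  return raw.split(",").map((s) => s.trim()).filter((s) => s !== "");
}

// Mods enabled in the world config via load_mod_<name> entries (assumed format).
function enabledMods(worldConf: string): string[] {
  const mods: string[] = [];
  parseConf(worldConf).forEach((value, key) => {
    if (key.startsWith("load_mod_") && value !== "false") {
      mods.push(key.slice("load_mod_".length));
    }
  });
  return mods;
}

// Allowlisted names that match no enabled mod: a likely naming mismatch.
function staleHttpMods(globalConf: string, worldConf: string): string[] {
  const enabled = new Set(enabledMods(worldConf));
  return httpAllowlist(globalConf).filter((name) => !enabled.has(name));
}

const globalConf = "secure.http_mods = bountykimi";
const worldConf = "load_mod_bounty_board = true"; // hypothetical mod name
console.log(staleHttpMods(globalConf, worldConf)); // flags "bountykimi"
```

The check only compares names; it does not confirm that the mod itself requests the HTTP API correctly.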
Test 2: Composio + Google Sheets
Opus 4.7: Got Google Sheets sync working. After some back-and-forth on tsx watch and env loading, backend could complete a bounty and append to Sheets. Stats:
- Cost: $16.03
- Time: 28min API, 1hr 17min wall
- Code: +1,848 / -507
- Cache read: 22.3M
- Output: 123.3k tokens
Kimi K2.6: Failed. Stuck on dev server issues, tests, build problems. Never wired the Composio integration into a working state. After ~25 min and 135k+ tokens, tester stopped. Cost: ~$5.03.
Takeaway
- Best local MVP: Opus, but Kimi is way better value
- Best real integration: Opus by a lot
- Cleaner code: Opus
- Cheaper experiment model: Kimi
The tests suggest Kimi K2.6 is a reasonable pick for cheaper local coding tasks: a working Lua+TypeScript mod for $0.39 is not bad. But once the task involved external tools, config issues, and real integration work, Opus 4.7 was clearly ahead.
Full breakdown with commits, screenshots, demos, and costs at the source link.
📖 Read the full source: r/ClaudeAI