Testing MiniMax M2.7 via API on Three Real ML and Coding Workflows

Andrey Lukyanenko put MiniMax M2.7 through three realistic ML and coding workflows via the API, using Claude Code as the harness. The goal: see how M2.7 performs in agentic loops compared to Claude Opus 4.7.
Setup
The setup wraps the MiniMax API in a claude-mm shell function that points Claude Code at M2.7:
claude-mm () {
  # Environment overrides apply only to this claude invocation.
  ANTHROPIC_BASE_URL="https://api.minimax.io/anthropic" \
  ANTHROPIC_AUTH_TOKEN="$MINIMAX_API_KEY" \
  ANTHROPIC_MODEL="MiniMax-M2.7" \
  ANTHROPIC_DEFAULT_SONNET_MODEL="MiniMax-M2.7" \
  ANTHROPIC_DEFAULT_OPUS_MODEL="MiniMax-M2.7" \
  ANTHROPIC_DEFAULT_HAIKU_MODEL="MiniMax-M2.7" \
  ANTHROPIC_SMALL_FAST_MODEL="MiniMax-M2.7" \
  API_TIMEOUT_MS="3000000" \
  CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC="1" \
  claude "$@"
}
He ran on MiniMax’s Plus tier ($40/month), where the context window and per-day throughput were sufficient for multi-step agentic work.
Workflow 1: Refactoring a PyTorch Project
The task was to update dependencies and code quality in the pytorch_tempest repo (Hydra + PyTorch Lightning). Changes included:
- Updated CI versions and pre-commit hooks.
- Replaced black + flake8 with ruff for linting and formatting.
- Enabled fsdp_sharding_strategy in the Lightning trainer config.
- Refreshed documentation.
- Added uv for environment management.
- Switched to modern Python typing (list[X] over List[X], X | None over Optional[X]).
- Removed duplicate code paths.
The approach was step-by-step: Lukyanenko gave explicit requirements, reviewed each change, and provided feedback when the diff went off scope. M2.7 fit this well because it stayed within narrow prompts and allowed line-level review. CI failures were fixed iteratively with the agent’s help.
Workflow 2: Obsidian Vault Notes
For writing and auditing ML reference notes in Obsidian, Lukyanenko tuned prompts specifically for M2.7. He started by asking both M2.7 and Opus 4.7 to generate notes from the same prompt, then had M2.7 read both outputs and propose an improved prompt for itself. The resulting prompt (condensed) was:
Fill one broken-link stub in the DSWoK vault: research the topic, draft the note in DSWoK voice, run draft-critic-mm, save to the right folder.
Steps: read style guide, pick a stub, grep for cross-references, choose destination folder, draft, then critique.
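The "pick a stub" and "grep for cross-references" steps can be approximated outside the agent with a short script. This is a minimal sketch, not the author's tooling: it assumes notes are .md files, that links use Obsidian's [[Target]], [[Target|alias]] and [[Target#heading]] forms, and that a stub is any link target with no matching note file. The function name find_stubs is hypothetical, and embeds of non-note files (such as images) would also be reported as missing.

```python
import re
from collections import Counter
from pathlib import Path

# Captures the target of [[Target]], [[Target|alias]] and [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def find_stubs(vault: Path) -> Counter:
    """Count wikilinks whose target has no matching .md file in the vault."""
    notes = list(vault.rglob("*.md"))
    # Compare by file stem, case-insensitively.
    existing = {p.stem.lower() for p in notes}
    missing = Counter()
    for note in notes:
        text = note.read_text(encoding="utf-8")
        for target in WIKILINK.findall(text):
            # Drop any folder prefix such as "ml/Note".
            name = target.strip().split("/")[-1]
            if name and name.lower() not in existing:
                missing[name] += 1
    return missing
```

Calling find_stubs(Path("path/to/vault")).most_common(5) would list the most-referenced missing notes, which is one reasonable way to choose which stub to fill first.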
Key Findings
Across all three runs, M2.7 was useful when constraints were explicit and output format was concrete. It struggled when important context was left implicit, though Opus 4.7 sometimes had the same gaps. For open-ended cases, a human review pass is still recommended. The author notes that model quality and harness design are hard to separate: a stronger model may infer missing constraints, while a better harness makes them explicit.
📖 Read the full source: HN AI Agents