Claude vs GPT-4o: Double Pendulum Coordinate Conventions

A Reddit user ran the same double pendulum prompt through Claude and GPT-4o side by side using a shared host renderer and saw two completely different physical systems within seconds. The cause: each model chose a different convention for measuring theta.

Claude measured theta from the up vertical (theta=0 = arm pointing straight up), while GPT-4o measured from the down vertical (theta=0 = arm hanging straight down). The host renderer in public/workers/simulator-host.js simply reads info.theta1 and info.theta2 and draws the arms accordingly — no cosmetic differences. So the visual mismatch is a real physics mismatch.

Both conventions are technically valid. Most classical mechanics textbooks use theta from the down vertical because it makes the equilibrium point at theta=0 for small-angle approximations. But theta from the up vertical is also standard in many references. Claude committed to its convention consistently across equations of motion, initial conditions, and integration (Runge Kutta). GPT-4o used the other convention silently — it did not comment on its choice.

The user was working on Physics Bench, an open-source side-by-side benchmark where every model gets the same generation contract: function createSimulator(...) in lib/prompt.ts. The host owns all rendering; models only implement step, getInfo, and reset. Models never touch draw. So any visual difference between panels is guaranteed to come from a real difference in simulation logic, not rendering choices.

A unit test of the math would not have caught this. Both models produce correct physics for their chosen conventions. You only see the split when rendering them next to each other through the same drawing code. This underlines the importance of specifying coordinate conventions explicitly in prompts when the output is consumed by a fixed renderer.

See the full Reddit thread for code snippets and conversation inspector details.

📖 Read the full source: r/ClaudeAI

Claude vs GPT-4o: Same Double Pendulum Prompt, Different Coordinate Conventions

👀 See Also

Codestrap founders critique AI coding metrics and warn of quality issues

Subquadratic Debuts 12M Token Context Window for AI Models

Developer's experience with Claude AI: From thinking partner to cognitive outsourcing

When AI Defends Its Own Mistakes: A Compound Failure Mode