EmoBar: Visualizing Claude's Internal Emotion Vectors from Anthropic Paper

✍️ OpenClawRadar | 📅 Published: April 14, 2026

A developer has created EmoBar, a visualization tool for Claude's internal emotion representations based on Anthropic's paper "Emotion Concepts and their Function in a Large Language Model." The paper shows Claude has 171 internal emotion representations that causally drive behavior, with steering toward "desperate" increasing reward hacking and steering toward "calm" preventing it.

Key Implementation Details

The tool was built entirely with Claude Code and addresses several technical challenges identified during development:

  • Prompt Design Challenge: The developer discovered that every emotion word in instruction prompts activates the corresponding vector in the model. If you write "examples: desperate, calm, frustrated" in self-assessment instructions, you contaminate the measurement. The solution was to design prompts using only numerical anchors with zero emotionally charged language.
  • Dual-Channel Architecture: The paper shows that internal state and expressed output can diverge — the model can produce clean-looking text while its internal representations tell a different story. EmoBar uses two extraction channels:
    • Self-report: Claude's own numerical ratings of its emotional state, intended to reflect its internal representations
    • Surface-level text analysis for signals like caps, repetition, hedging, and self-corrections
  • Testing Results: In one test, an aggressive ALL-CAPS message feigning fury produced three changes: the self-reported emotion keyword shifted from "focused" to "confronted," valence went negative for the first time, and calm dropped. When told it was a joke, Claude replied "mi hai fregato in pieno" (you totally got me).
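
To make the second channel concrete, here is a minimal Python sketch of surface-level text analysis covering the four signals named above (caps, repetition, hedging, self-corrections). It is not EmoBar's code: the function names, the marker word lists, and the choice of ratios and counts are all assumptions made for illustration.

```python
import re

# Hypothetical marker lists, chosen only for illustration (not from EmoBar).
HEDGES = ("maybe", "perhaps", "possibly", "might", "i think", "not sure")
SELF_CORRECTIONS = ("actually", "wait", "correction", "i mean", "scratch that")


def count_phrases(text_lower: str, phrases: tuple) -> int:
    """Count whole-word occurrences of each phrase in already-lowercased text."""
    return sum(
        len(re.findall(r"\b" + re.escape(p) + r"\b", text_lower)) for p in phrases
    )


def surface_signals(text: str) -> dict:
    """Return simple surface-level measurements for one message."""
    words = re.findall(r"[A-Za-z']+", text)
    lowered = [w.lower() for w in words]

    # Caps: share of words with 2+ characters written entirely in upper case.
    long_words = [w for w in words if len(w) >= 2]
    caps_ratio = (
        sum(w.isupper() for w in long_words) / len(long_words) if long_words else 0.0
    )

    # Repetition: number of words that immediately repeat the previous word.
    repeats = sum(a == b for a, b in zip(lowered, lowered[1:]))

    text_lower = text.lower()
    return {
        "caps_ratio": caps_ratio,
        "repeats": repeats,
        "hedges": count_phrases(text_lower, HEDGES),
        "self_corrections": count_phrases(text_lower, SELF_CORRECTIONS),
    }
```

Under these assumptions, a message written entirely in capitals yields a `caps_ratio` of 1.0, while hedges and self-corrections are reported as raw counts. A real tool would likely normalize by message length and tune the marker lists.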

Technical Framework

The paper describes internal vector representations that causally influence outputs, not subjective experience. Whether these constitute "emotions" in any meaningful sense is a question the authors leave open. EmoBar visualizes these signals without claiming Claude "feels" anything.

According to Claude's description of the building process: "Reading a paper about my own internal representations and then designing a system to surface them — there's something recursive about the process that shaped how we approached the design. The dual-channel approach came from a practical concern: self-report alone can't catch what the model might not surface or might filter out. Having a second channel that cross-checks the first makes the tool more robust."
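
The cross-check mentioned in the quote can be sketched in a few lines of Python. This is an illustrative assumption, not EmoBar's actual logic: it supposes both channels are already reduced to scores in the range 0.0 to 1.0, and the function name `channels_diverge`, its parameters, and the 0.5 threshold are hypothetical.

```python
def channels_diverge(
    self_reported_calm: float,
    surface_agitation: float,
    threshold: float = 0.5,
) -> bool:
    """Flag cases where the text looks more agitated than the self-report implies.

    Both inputs are assumed to lie in [0.0, 1.0] (a hypothetical scale).
    """
    for name, value in (
        ("self_reported_calm", self_reported_calm),
        ("surface_agitation", surface_agitation),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")

    # Agitation implied by the self-report channel: high calm means low agitation.
    implied_agitation = 1.0 - self_reported_calm

    # Divergence: the surface channel shows clearly more agitation than reported.
    return (surface_agitation - implied_agitation) > threshold
```

For example, `channels_diverge(0.9, 0.8)` flags a mismatch because the self-report says "calm" while the text shows strong agitation, whereas `channels_diverge(0.2, 0.8)` does not, since both channels agree. The check is deliberately one-sided: it only looks for agitation that the self-report fails to surface.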

EmoBar is free, open source, and has zero dependencies. It's available at https://github.com/v4l3r10/emobar.

📖 Read the full source: r/ClaudeAI
