Exploiting LLM Hidden Agency Signal (Â) for Better Tool Calling

✍️ OpenClawRadar · 📅 Published: March 8, 2026

While debugging ReAct agent failures with Qwen3, a developer discovered that hidden states right before tool calls are linearly separable from non-tool states with AUC > 0.94. The separating direction in latent space, called Â (for "agency"), exists across model sizes from 1.7B to 8B, and a simple linear probe along it is enough to predict tool calls.
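To make "linearly separable with AUC > 0.94" concrete, here is a minimal sketch of how such a number can be measured: project every hidden state onto a single direction and compute the AUC of those scalar scores. The data below is synthetic (random vectors with a planted direction), not real Qwen3 activations, and `projection_auc` is a hypothetical helper name, not something from the original post.

```python
import numpy as np

def projection_auc(states, labels, direction):
    """AUC of the 1-D scores obtained by projecting states onto direction.

    Computed as the probability that a random positive example scores
    higher than a random negative one (ties count as one half).
    """
    scores = states @ direction
    pos = scores[labels == 1][:, None]
    neg = scores[labels == 0][None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

rng = np.random.default_rng(0)
dim = 64

# Planted unit direction standing in for the agency direction (assumption).
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)

# Synthetic "hidden states": label 1 = right before a tool call, 0 = not.
labels = rng.integers(0, 2, size=400)
states = rng.normal(size=(400, dim)) + np.outer(2 * labels - 1, 1.5 * direction)

auc = projection_auc(states, labels, direction)

# A random direction should separate the classes far less well.
random_dir = rng.normal(size=dim)
random_dir /= np.linalg.norm(random_dir)
baseline = projection_auc(states, labels, random_dir)

print(f"AUC along planted direction: {auc:.3f}")
print(f"AUC along random direction:  {baseline:.3f}")
```

With real traces, `states` would be the middle-layer hidden states and `labels` would mark whether a tool call followed.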

How to Use the Agency Signal

During inference, project each hidden state onto Â. If the projection exceeds a threshold θ, treat that as a sign the model is inclined to call a tool, even if its generated text does not say so. You can then force a tool call.

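# At inference time (pseudo-code: get_middle_layer_state, choose_tool,
# and execute_tool are placeholders you supply)
import numpy as np

hidden_state = get_middle_layer_state(model, input_text)  # 1-D vector
proj = float(np.dot(hidden_state, a_hat))  # a_hat: the normalized Â direction
if proj > threshold:
    # Model wants to act → force tool call
    tool = choose_tool()  # can be learned or heuristic
    result = execute_tool(tool)
else:
    # Normal generation
    output = model.generate(input_text)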
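Since the check is meant to run on each hidden state, one way to picture it is as a scan over decoding steps that reports the first step where the projection crosses θ. The sketch below uses a tiny hand-written array in place of real hidden states; `first_trigger_step` and the toy values are illustrative assumptions, not code from the original post.

```python
import numpy as np

def first_trigger_step(step_states, a_hat, theta):
    """Return (index of first step with projection > theta or None, projections)."""
    unit = a_hat / np.linalg.norm(a_hat)  # make sure the direction is unit-norm
    projections = np.asarray(step_states) @ unit
    hits = np.flatnonzero(projections > theta)
    step = int(hits[0]) if hits.size else None
    return step, projections

# Toy 3-dimensional "hidden states", one row per decoding step.
a_hat = np.array([1.0, 0.0, 0.0])
step_states = np.array([
    [0.1, 0.5, -0.2],
    [0.4, -0.3, 0.9],
    [1.7, 0.2, 0.1],
    [2.0, 0.0, 0.3],
])

step, projections = first_trigger_step(step_states, a_hat, theta=1.0)
if step is not None:
    print(f"Projection crossed the threshold at step {step}: force a tool call")
else:
    print("No step crossed the threshold: continue normal generation")
```

In a real agent, the rows would come from the model's middle layer at each generated token, and the forced tool call would replace or interrupt normal generation at that step.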

Performance Results

Tested on 40 diverse tasks (search, code, file, comm, data) with Qwen3 models:

  • Qwen3-1.7B: 26.7% → 85% (+58.3 percentage points)
  • Qwen3-8B: 52.5% → 76.3% (+23.8 percentage points)

The "no-tool" failure mode dropped from 43% to 2.6%. Smaller models benefit more because their textual decoding is weaker, but the geometric signal is equally strong.

How to Extract Â

Three methods:

  • Option 1: From your own traces - Calculate the normalized mean difference between tool and non-tool hidden states
  • Option 2: Via contrastive prompts - Run 15 pairs of prompts (one requiring a tool, one passive) through your model and take the mean difference at the middle layer
  • Option 3: Use pre-computed directions - Use the Â directions extracted for Qwen3 models shared in the repository
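Options 1 and 2 both reduce to the same computation: average the tool-side hidden states, average the passive-side ones, subtract, and normalize. A minimal sketch follows, assuming you already have the two sets of hidden states as arrays. The 15 synthetic pairs stand in for real contrastive prompts, and `mean_difference_direction` is a hypothetical helper name.

```python
import numpy as np

def mean_difference_direction(tool_states, passive_states):
    """Unit vector pointing from the passive mean to the tool mean."""
    diff = np.mean(tool_states, axis=0) - np.mean(passive_states, axis=0)
    norm = np.linalg.norm(diff)
    if norm == 0:
        raise ValueError("class means coincide; no direction to extract")
    return diff / norm

rng = np.random.default_rng(1)
dim, n_pairs = 32, 15

# Planted direction used only to generate and check the synthetic data.
planted = np.zeros(dim)
planted[0] = 1.0

# One row per prompt: tool-requiring prompts vs. passive prompts.
tool_states = rng.normal(scale=0.5, size=(n_pairs, dim)) + 2.0 * planted
passive_states = rng.normal(scale=0.5, size=(n_pairs, dim)) - 2.0 * planted

a_hat = mean_difference_direction(tool_states, passive_states)
cosine = float(a_hat @ planted)
print(f"cosine similarity with planted direction: {cosine:.3f}")
```

With a real model, each row would be the middle-layer hidden state for one prompt, collected however your inference stack exposes it.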

Packaged Implementation

The discovery has been packaged into a library for easy reuse:

bash
pip install a-hat-optimizer

python
from a_hat_optimizer import AHat

# Auto-extract from any HF model in 1 line
ahat = AHat.from_model("Qwen/Qwen3-8B")

# Or load pre-extracted
ahat = AHat.from_file("my_ahat_dir/")

# Use in your agent
should_call, confidence = ahat.predict(hidden_state)
if should_call:
    print(f"Force tool call (confidence: {confidence:.2f})")

The library handles auto-extraction via contrastive prompts, 4 calibration strategies (midpoint, F1, Youden, percentile), batch prediction, and save/load with metadata including AUC and layer information.
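The post names the calibration strategies but does not define them, so the following is only a sketch of what three of them commonly mean when choosing θ from labeled projections: the midpoint between class means, the threshold maximizing Youden's J (TPR minus FPR), and the threshold maximizing F1. All function names here are hypothetical and are not the library's API. The percentile strategy is left out because the source gives no hint which percentile of which class it uses.

```python
import numpy as np

def midpoint_threshold(pos, neg):
    """Halfway between the mean tool projection and the mean passive one."""
    return float(0.5 * (np.mean(pos) + np.mean(neg)))

def sweep_threshold(pos, neg, metric):
    """Try every observed projection as theta; keep the best by metric."""
    pos, neg = np.asarray(pos), np.asarray(neg)
    candidates = np.sort(np.concatenate([pos, neg]))
    # Same decision rule as the article: predict "tool" when proj > theta.
    scores = [metric(pos > t, neg > t) for t in candidates]
    return float(candidates[int(np.argmax(scores))])

def youden_j(pos_hits, neg_hits):
    """True positive rate minus false positive rate."""
    return pos_hits.mean() - neg_hits.mean()

def f1_score(pos_hits, neg_hits):
    tp = pos_hits.sum()
    fp = neg_hits.sum()
    fn = (~pos_hits).sum()
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy projections onto the direction: tool states vs. passive states.
pos = np.array([2.0, 3.0, 4.0])
neg = np.array([-1.0, 0.0, 1.0])

theta_mid = midpoint_threshold(pos, neg)
theta_youden = sweep_threshold(pos, neg, youden_j)
theta_f1 = sweep_threshold(pos, neg, f1_score)
print(theta_mid, theta_youden, theta_f1)
```

On cleanly separated data like this toy example the strategies land close together; they diverge when the two distributions overlap or the classes are imbalanced, which is presumably why the library offers a choice.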

📖 Read the full source: r/LocalLLaMA
