What It Is
Tool use in language agents is the ability to invoke external functions — web search, code execution, calculators, APIs — to retrieve information or take actions that can’t be derived from model weights alone. The model decides when to call a tool, what arguments to pass, and how to incorporate the result.
Why It Matters
Without tools, LLMs are bounded by training data (stale knowledge) and reasoning errors (no ground truth check). With tools, agents can fetch real-time information, execute and verify code, and handle tasks that require state changes in external systems. Tool use is the difference between a language model and an agent.
How It Works
The standard pattern: model produces a structured tool call (function name + arguments) → execution environment runs the tool → result appended to context → model continues. This repeats until the model produces a final answer or a stop signal.
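The loop above can be sketched in a few lines. This is a toy, assuming a hypothetical `model` callable that returns either a tool call or a final answer; the dict format and the tool registry are illustrative, not any specific provider's API:

```python
def run_tool(name, args):
    # Hypothetical tool registry: name -> callable.
    # eval() is only acceptable here because this is a toy calculator.
    tools = {"calculator": lambda expr: str(eval(expr))}
    return tools[name](**args)

def agent_loop(model, messages, max_steps=8):
    for _ in range(max_steps):
        reply = model(messages)          # structured output: tool call or final answer
        if reply["type"] == "final":     # stop signal: model is done
            return reply["content"]
        # Tool call: execute it, append the result to context, continue.
        result = run_tool(reply["name"], reply["args"])
        messages.append({"role": "tool", "name": reply["name"], "content": result})
    return None  # step budget exhausted
```

The `max_steps` cap matters in practice: without it, a model that keeps emitting tool calls loops forever.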
The calibration problem: Naive models call tools reflexively — every question triggers a search, every calculation gets code-executed. This is expensive and slow. The useful capability is selective tool use: calling only when the query genuinely requires it.
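To make the contrast concrete: a reflexive policy calls a tool on every query, while a selective policy gates on whether the query actually needs external information. The keyword gate below is a deliberately crude stand-in for the model's own judgment, just to illustrate the metric (tool call rate) being optimized:

```python
def needs_tool(query):
    # Selective policy (toy): call a tool only when the query demands
    # fresh data or exact computation. Real agents learn this judgment;
    # these trigger strings are purely illustrative.
    triggers = ("latest", "current", "compute", "exact")
    return any(t in query.lower() for t in triggers)

queries = [
    "What is the capital of France?",   # answerable from weights: no call
    "What is the latest CPI reading?",  # stale-knowledge risk: call
]
call_rate = sum(needs_tool(q) for q in queries) / len(queries)  # 0.5, vs 1.0 reflexive
```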
Training for selectivity: Reinforcement learning with a combined accuracy and efficiency reward. HDPO (Metis) showed that summing these rewards before advantage normalization is broken: the small efficiency signal gets swamped by the much larger accuracy signal. The fix is to compute advantages for each reward stream independently and combine them at the loss level.
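The normalization failure can be shown numerically. Assuming GRPO-style group-relative advantages (z-normalization within a rollout group); the reward values and the 0.5 efficiency weight are illustrative, not taken from the paper:

```python
from statistics import mean, pstdev

def normalize(xs):
    # Group-relative advantage: z-score within the rollout group.
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / (s + 1e-8) for x in xs]

acc = [1.0, 1.0, 0.0, 1.0]      # accuracy reward per rollout (large scale)
eff = [0.01, 0.03, 0.02, 0.00]  # efficiency reward per rollout (tiny scale)

# Broken: sum rewards first, then normalize. The efficiency term barely
# moves the result because its scale is dwarfed by accuracy.
adv_mixed = normalize([a + e for a, e in zip(acc, eff)])

# Fix: normalize each stream independently, combine at the loss level.
# Now rollouts with identical accuracy still get distinct advantages.
adv_fixed = [a + 0.5 * e for a, e in zip(normalize(acc), normalize(eff))]
```

Under the broken scheme, rollouts 0 and 1 (same accuracy, different efficiency) receive nearly identical advantages; under the fix, the efficiency difference survives normalization and produces a usable gradient signal.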
Key Sources
- metis-hdpo-meta-cognitive-tool-use — HDPO trains agents to reduce tool call rate from 98% to 2% while improving accuracy
Related Concepts
Open Questions
- When is tool abstention the right choice vs. tool invocation?
- How to train for tool quality (not just count)?
- Multi-tool coordination: search → code → search chains