What It Is
Tool use in language agents is the ability to invoke external functions — web search, code execution, calculators, APIs — to retrieve information or take actions that can’t be derived from model weights alone. The model decides when to call a tool, what arguments to pass, and how to incorporate the result.
Why It Matters
Without tools, LLMs are bounded by training data (stale knowledge) and reasoning errors (no ground truth check). With tools, agents can fetch real-time information, execute and verify code, and handle tasks that require state changes in external systems. Tool use is the difference between a language model and an agent.
How It Works
The standard pattern: model produces a structured tool call (function name + arguments) → execution environment runs the tool → result appended to context → model continues. This repeats until the model produces a final answer or a stop signal.
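The loop above can be sketched in a few lines. This is a toy, assuming a hypothetical `model` callable that returns either a tool call or a final answer; the dict format and the tool registry are illustrative, not any specific provider's API:

```python
def run_tool(name, args):
    # Hypothetical tool registry: name -> callable.
    # eval() is only acceptable here because this is a toy calculator.
    tools = {"calculator": lambda expr: str(eval(expr))}
    return tools[name](**args)

def agent_loop(model, messages, max_steps=8):
    for _ in range(max_steps):
        reply = model(messages)          # structured output: tool call or final answer
        if reply["type"] == "final":     # stop signal: model is done
            return reply["content"]
        # Tool call: execute it, append the result to context, continue.
        result = run_tool(reply["name"], reply["args"])
        messages.append({"role": "tool", "name": reply["name"], "content": result})
    return None  # step budget exhausted
```

The `max_steps` cap matters in practice: without it, a model that keeps emitting tool calls loops forever.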
The calibration problem: Naive models call tools reflexively — every question triggers a search, every calculation gets code-executed. This is expensive and slow. The useful capability is selective tool use: calling only when the query genuinely requires it.
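To make the contrast concrete: a reflexive policy calls a tool on every query, while a selective policy gates on whether the query actually needs external information. The keyword gate below is a deliberately crude stand-in for the model's own judgment, just to illustrate the metric (tool call rate) being optimized:

```python
def needs_tool(query):
    # Selective policy (toy): call a tool only when the query demands
    # fresh data or exact computation. Real agents learn this judgment;
    # these trigger strings are purely illustrative.
    triggers = ("latest", "current", "compute", "exact")
    return any(t in query.lower() for t in triggers)

queries = [
    "What is the capital of France?",   # answerable from weights: no call
    "What is the latest CPI reading?",  # stale-knowledge risk: call
]
call_rate = sum(needs_tool(q) for q in queries) / len(queries)  # 0.5, vs 1.0 reflexive
```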
Training for selectivity: Reinforcement learning with a combined accuracy and efficiency reward. HDPO (Metis) showed that summing these rewards before advantage normalization is broken: the small efficiency signal gets swamped by the much larger accuracy signal. The fix is to compute advantages for each reward stream independently and combine them at the loss level.
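The normalization failure can be shown numerically. Assuming GRPO-style group-relative advantages (z-normalization within a rollout group); the reward values and the 0.5 efficiency weight are illustrative, not taken from the paper:

```python
from statistics import mean, pstdev

def normalize(xs):
    # Group-relative advantage: z-score within the rollout group.
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / (s + 1e-8) for x in xs]

acc = [1.0, 1.0, 0.0, 1.0]      # accuracy reward per rollout (large scale)
eff = [0.01, 0.03, 0.02, 0.00]  # efficiency reward per rollout (tiny scale)

# Broken: sum rewards first, then normalize. The efficiency term barely
# moves the result because its scale is dwarfed by accuracy.
adv_mixed = normalize([a + e for a, e in zip(acc, eff)])

# Fix: normalize each stream independently, combine at the loss level.
# Now rollouts with identical accuracy still get distinct advantages.
adv_fixed = [a + 0.5 * e for a, e in zip(normalize(acc), normalize(eff))]
```

Under the broken scheme, rollouts 0 and 1 (same accuracy, different efficiency) receive nearly identical advantages; under the fix, the efficiency difference survives normalization and produces a usable gradient signal.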
Key Sources
- metis-hdpo-meta-cognitive-tool-use — HDPO trains agents to reduce tool call rate from 98% to 2% while improving accuracy
Related Concepts
Open Questions
- When is tool abstention the right choice vs. tool invocation?
- How to train for tool quality (not just count)?
- Multi-tool coordination: search → code → search chains