What It Is

Instruction following is a model’s ability to interpret a natural language prompt and produce a response that fulfills the intent — rather than just continuing the text statistically. A base pretrained model predicts the next token; an instruction-following model answers the question.

Why It Matters

Without instruction following, large language models are nearly unusable as assistants. A model trained only on next-token prediction will “complete” a prompt like “Summarize this article:” by appending more article text, not a summary. Teaching the model to recognize and fulfill intent is what transforms a language model into a usable product.

How It Works

Instruction following is learned primarily through supervised fine-tuning (SFT) on (prompt, response) pairs. The model sees many examples of “here is an instruction, here is what a good response looks like” and learns the pattern. Notably, LIMA (Zhou et al., 2023) showed that as few as 1,000 high-quality, diverse examples can be sufficient — the model generalizes to unseen task types because the underlying knowledge was already present from pretraining.
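The mechanics of SFT on (prompt, response) pairs can be sketched in a few lines. The sketch below is illustrative, not any real library’s API: the prompt/response template and whitespace “tokenizer” are simplified stand-ins. The essential idea it shows is that the pair is rendered into one token sequence, and the training loss is masked so only the response tokens are supervised.

```python
# Toy sketch of how a (prompt, response) pair becomes one SFT training
# example. The template and tokenizer here are simplified stand-ins
# (assumptions for illustration), not a real framework's interface.

def to_sft_example(prompt, response, tokenize=str.split):
    """Render a pair into tokens plus a per-token loss mask (1 = supervised)."""
    prompt_tokens = tokenize(f"### Instruction: {prompt} ### Response:")
    response_tokens = tokenize(response) + ["<eos>"]
    tokens = prompt_tokens + response_tokens
    # Supervise only the response tokens: the model should learn to
    # *answer* the instruction, not to reproduce it.
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask

tokens, mask = to_sft_example(
    "Summarize this article:", "The article argues that ..."
)
```

In a real training loop the mask is applied to the per-token cross-entropy loss (e.g. by setting masked positions to an ignore index), so gradient signal comes only from the response.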

The key ingredients for strong instruction following:

  • Response format diversity: examples that vary in format (lists, paragraphs, code, step-by-step) teach the model to match format to task type
  • Task type diversity: examples spanning QA, summarization, creative writing, analysis, etc.
  • Quality over quantity: curated examples teach better style than large volumes of mediocre ones
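One practical consequence of quality-over-quantity is that a curated SFT set is small enough to audit. A rough sketch of such an audit, assuming each example has been hand-tagged with a task type and a response format (the tags and examples below are illustrative, not from any real dataset):

```python
# Toy coverage audit for a small, curated SFT set: count how many
# distinct task types and response formats are represented.
# The tags and examples are hypothetical, for illustration only.
from collections import Counter

examples = [
    {"task": "qa",               "format": "paragraph"},
    {"task": "summarization",    "format": "paragraph"},
    {"task": "creative_writing", "format": "paragraph"},
    {"task": "analysis",         "format": "list"},
    {"task": "coding",           "format": "code"},
]

def coverage(examples, key):
    """Count examples per value of `key`, to spot gaps in diversity."""
    return Counter(e[key] for e in examples)

task_counts = coverage(examples, "task")
format_counts = coverage(examples, "format")
```

A skewed histogram here (e.g. nearly all paragraph-format QA) signals the dataset will teach a narrow response style, regardless of its size.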

Key Sources