What It Is
Instruction following is a model’s ability to interpret a natural language prompt and produce a response that fulfills the intent — rather than just continuing the text statistically. A base pretrained model predicts the next token; an instruction-following model answers the question.
Why It Matters
Without instruction following, large language models are nearly unusable as assistants. A model trained only on next-token prediction will “complete” a prompt like “Summarize this article:” by appending more article text, not a summary. Teaching the model to recognize and fulfill intent is what transforms a language model into a usable product.
How It Works
Instruction following is learned primarily through supervised fine-tuning (SFT) on (prompt, response) pairs. The model sees many examples of “here is an instruction, here is what a good response looks like” and learns the pattern. Critically, LIMA (Zhou et al., 2023) showed that as few as 1,000 high-quality, diverse examples can be sufficient — the model generalizes to unseen task types because the underlying knowledge was already present from pretraining; SFT mainly teaches the format and behavior of responding.
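A minimal sketch of how SFT data is typically prepared (function name and token IDs are hypothetical, and no real tokenizer is used): the prompt and response are concatenated into one sequence, but the training loss is usually computed only on the response tokens, so the prompt positions are masked out of the labels with the conventional ignore index -100.

```python
# Hypothetical SFT example builder: mask prompt tokens so the loss
# is computed only on the response.
IGNORE_INDEX = -100  # conventional "ignore this position" label value

def build_sft_example(prompt_ids, response_ids):
    """Concatenate prompt and response; mask prompt positions in the labels."""
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return input_ids, labels

# Toy IDs standing in for a tokenized (prompt, response) pair.
inp, lab = build_sft_example([5, 8, 2], [9, 4])
print(inp)  # [5, 8, 2, 9, 4]
print(lab)  # [-100, -100, -100, 9, 4]
```

The masking is what makes this instruction tuning rather than plain language modeling: the model is never penalized for how the prompt reads, only for how it responds.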
The key ingredients for strong instruction following:
- Response format diversity: examples that vary in format (lists, paragraphs, code, step-by-step) teach the model to match format to task type
- Task type diversity: examples spanning QA, summarization, creative writing, analysis, etc.
- Quality over quantity: a small set of carefully curated examples teaches response style better than a large volume of mediocre ones
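The task-diversity ingredient above can be operationalized with a simple balancing pass over the training pool. A rough sketch (the function and field names are hypothetical, not from any particular pipeline): cap each task type's contribution so no single task dominates the mix.

```python
import random
from collections import defaultdict

def balance_by_task(examples, per_task, seed=0):
    """Downsample a pool of (task_type, example) pairs so each task type
    contributes at most `per_task` examples -- a crude diversity filter."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for task, ex in examples:
        buckets[task].append(ex)
    balanced = []
    for task, exs in sorted(buckets.items()):
        rng.shuffle(exs)  # avoid keeping only the first-listed examples
        balanced.extend((task, ex) for ex in exs[:per_task])
    return balanced

# A pool heavily skewed toward QA gets capped; scarce tasks keep everything.
pool = [("qa", f"q{i}") for i in range(50)] + \
       [("summarization", f"s{i}") for i in range(3)]
subset = balance_by_task(pool, per_task=5)
```

Real curation pipelines also weigh quality scores and deduplicate, but even this simple cap prevents one dominant task type from drowning out the format diversity the list above calls for.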
Key Sources
- training-language-models-to-follow-instructions-with-human-feedback
- lima-less-is-more-for-alignment
- llava-visual-instruction-tuning