What It Is

Emergent behaviors are capabilities that appear in large models but are absent in smaller ones — and cannot be predicted by extrapolating from smaller scale. Performance is flat (near-random) for many orders of magnitude of compute, then suddenly jumps at a threshold.

Why It Matters

It means scaling is not always smooth. Some capabilities you can’t buy incrementally — you either have them or you don’t. This makes capability prediction hard, and is central to AI safety debates about whether dangerous capabilities could appear suddenly at scale.

Examples

  • 3-digit arithmetic: absent below ~13B params, sharp jump above
  • Chain-of-thought effectiveness: hurts below ~68B, helps above
  • Instruction following: hurts below ~8B when fine-tuned, helps above
  • Multilingual translation: absent in small models, appears at scale

The Controversy

Some researchers argue emergence is a measurement artifact: if you use a finer-grained metric, the improvement is gradual. Others argue the phase transitions are real. The debate is unresolved.

Key Sources