What It Is

A power law is a relationship of the form $y = a x^k$, where the prefactor $a$ and the exponent $k$ are constants. Taking logarithms gives $\log y = \log a + k \log x$, so on a log-log plot a power law appears as a straight line with slope $k$. Power laws appear throughout physics, biology, economics, and, as Kaplan et al. showed, language model performance.
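The straight-line property can be checked numerically. The following is a minimal sketch: the constants `a` and `k` and the helper names `power_law` and `loglog_slope` are illustrative assumptions, not taken from any dataset or paper.

```python
import math

def power_law(x, a, k):
    """Evaluate y = a * x**k."""
    return a * x ** k

def loglog_slope(x1, y1, x2, y2):
    """Slope of the line through two points when both axes are logarithmic."""
    return (math.log(y2) - math.log(y1)) / (math.log(x2) - math.log(x1))

# Illustrative constants (assumed for this sketch only).
a, k = 3.0, -0.5
xs = [1.0, 10.0, 100.0, 1000.0]
ys = [power_law(x, a, k) for x in xs]

# Each pair of neighbouring points yields the same log-log slope, equal to k.
slopes = [
    loglog_slope(xs[i], ys[i], xs[i + 1], ys[i + 1])
    for i in range(len(xs) - 1)
]
print(slopes)
```

The slope is the same for every pair of points regardless of how far apart they are, which is what "straight line on a log-log plot" means in practice.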

Why It Matters

Power laws are predictive across orders of magnitude. Once you fit the line at small scale, you can extend it to read off predicted values at large scale, as long as the same trend continues to hold. This is what makes scaling laws practically useful: small experiments become predictors of large-scale behavior.
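The fit-then-extrapolate workflow might look like the sketch below. It fits a line in log-log space by ordinary least squares and evaluates it well outside the fitted range. The data is synthetic and noise-free, generated from assumed constants (`true_a`, `true_k`), so the extrapolation is exact here; real measurements are noisy and the prediction carries corresponding uncertainty.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares line fit in log-log space; returns (a, k) for y = a * x**k."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    # Slope of the regression line is the power-law exponent.
    k = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum(
        (u - mx) ** 2 for u in lx
    )
    # Intercept of the regression line is log(a).
    a = math.exp(my - k * mx)
    return a, k

def predict(x, a, k):
    """Evaluate the fitted power law at x."""
    return a * x ** k

# Synthetic "small-scale" measurements from an assumed power law.
true_a, true_k = 12.0, -0.08
small_xs = [1e5, 3e5, 1e6, 3e6, 1e7]
small_ys = [true_a * x ** true_k for x in small_xs]

a_hat, k_hat = fit_power_law(small_xs, small_ys)

# Extrapolate two orders of magnitude beyond the largest fitted point.
large_x = 1e9
predicted = predict(large_x, a_hat, k_hat)
actual = true_a * large_x ** true_k
print(k_hat, predicted, actual)
```

Fitting in log space rather than on the raw values keeps the small-scale and large-scale points on equal footing, since each contributes by its relative rather than absolute error.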

How It Works

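In the context of language model scaling, Kaplan et al. fit test loss $L$ as a power law in the number of non-embedding parameters $N$:

$$L(N) = \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad \alpha_N \approx 0.076$$

The exponent $-\alpha_N$ is the slope on the log-log plot. Doubling $N$ multiplies $L$ by $2^{-\alpha_N} \approx 0.95$, a roughly 5% reduction in loss per doubling. The trend holds across several orders of magnitude in parameter count in the Kaplan et al. experiments.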
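The per-doubling arithmetic can be worked through in a few lines. This sketch assumes the model-size fit reported by Kaplan et al., $L(N) = (N_c/N)^{\alpha_N}$ with $\alpha_N \approx 0.076$ and $N_c \approx 8.8 \times 10^{13}$; the function names are illustrative.

```python
# Assumed constants from the Kaplan et al. model-size fit (approximate values).
ALPHA_N = 0.076
N_C = 8.8e13

def loss(n_params, alpha=ALPHA_N, n_c=N_C):
    """Predicted loss L(N) = (N_c / N) ** alpha under the assumed fit."""
    return (n_c / n_params) ** alpha

def doubling_factor(alpha=ALPHA_N):
    """Factor by which loss is multiplied when N doubles: 2 ** (-alpha)."""
    return 2.0 ** (-alpha)

factor = doubling_factor()
reduction_pct = (1.0 - factor) * 100.0

# The ratio is independent of the starting size, as expected for a power law.
ratio = loss(2e8) / loss(1e8)

# Number of doublings needed to halve the loss: solve 2 ** (-alpha * d) = 1/2.
doublings_to_halve = 1.0 / ALPHA_N

print(f"factor per doubling: {factor:.4f}")
print(f"reduction per doubling: {reduction_pct:.2f}%")
print(f"doublings to halve loss: {doublings_to_halve:.1f}")
```

Under these assumed constants, halving the loss takes about 13 doublings of $N$, which shows how shallow a slope of this size is even though the improvement is steady.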

The practical consequence: smooth, predictable improvement means you do not plateau unexpectedly when scaling. When the trend does break, you have hit an emergent capability threshold that the power law was not measuring.

Key Sources