What It Is
A foundation model is a model trained at scale on broad data that can be adapted to many downstream tasks (via fine-tuning, prompting, or zero-shot transfer) without retraining from scratch for each task.
Why It Matters
Before foundation models, each task required its own dataset, training loop, and architecture. Foundation models collapse that pipeline: one model, trained once, serves as the backbone for dozens of applications. GPT and BERT are foundation models for text; CLIP and SAM play the same role for vision.
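The "one backbone, many applications" pattern can be sketched as follows. This is a toy illustration, not a real model: `embed` is a hypothetical stand-in for a frozen pretrained encoder, and the two heads are tiny task-specific classifiers with made-up fixed weights (in practice they would be learned).

```python
# Hypothetical sketch: one frozen backbone, two task-specific heads.
# `embed` stands in for a pretrained encoder; the heads are tiny
# task layers built on top of its shared representation.

def embed(text: str) -> list[float]:
    """Toy stand-in for a frozen foundation-model encoder."""
    # Crude features: length, vowel count, digit count.
    return [
        len(text) / 10.0,
        sum(c in "aeiou" for c in text.lower()) / 10.0,
        sum(c.isdigit() for c in text) / 10.0,
    ]

def sentiment_head(vec: list[float]) -> str:
    # Illustrative fixed weights; a real head would be trained.
    score = 1.5 * vec[1] - 0.2 * vec[0]
    return "positive" if score > 0 else "negative"

def spam_head(vec: list[float]) -> str:
    score = 2.0 * vec[2] - 0.1 * vec[0]
    return "spam" if score > 0.5 else "ham"

text = "Call 555 0199 now"
features = embed(text)            # shared backbone, computed once
label_a = sentiment_head(features)
label_b = spam_head(features)
```

The point of the structure: the expensive part (the backbone) is shared and frozen, so each new application only adds a cheap head on top of the same representation.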
How It Works
The key properties:
- Scale: large enough that general capabilities emerge from the training data itself
- Breadth: trained on diverse data so the learned representations transfer across domains
- Promptability: the interface allows steering the model toward a specific task at inference time, without weight updates
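Promptability in particular can be sketched with a toy example. The function below is a stand-in for a real model's text-completion API (its canned answers are invented for illustration); the thing to notice is that the same callable serves three different tasks, with only the prompt changing and no weight updates.

```python
# Hypothetical sketch: one shared model, many tasks via prompting.
# `fake_foundation_model` is a stand-in for a real LLM completion API.

def fake_foundation_model(prompt: str) -> str:
    """Toy stand-in that 'completes' a prompt with canned answers."""
    canned = {
        "Translate to French: cat": "chat",
        "Sentiment of 'I loved it':": "positive",
        "Summarize: The meeting was moved to Friday at noon.":
            "Meeting moved to Friday noon.",
    }
    return canned.get(prompt, "<completion>")

# Translation, classification, summarization: same model, same weights.
tasks = [
    "Translate to French: cat",
    "Sentiment of 'I loved it':",
    "Summarize: The meeting was moved to Friday at noon.",
]
outputs = [fake_foundation_model(p) for p in tasks]
```

This is the contrast with the pre-foundation-model workflow: task switching happens at inference time through the input, not through separate trained models.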
The “foundation” metaphor is architectural: many applications are built on top of one shared model, rather than each application building its own base.