What It Is

A compressed, lower-dimensional representation of data learned by an encoder — the internal coordinate system a model uses to represent inputs. Points close together in latent space correspond to semantically similar inputs.

Why It Matters

Operating in latent space rather than raw pixel/token space is the key efficiency trick behind Stable Diffusion (Latent Diffusion Models). A 512×512 RGB image has 512×512×3 = 786,432 values; its latent representation might be 64×64×4 = 16,384 values — 48× fewer. Running diffusion in this compressed space slashes compute while preserving semantic content.
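The arithmetic behind that ratio can be checked directly. A minimal sketch, assuming Stable Diffusion v1's 8× spatial downsampling and 4 latent channels (the function name and parameters here are illustrative):

```python
# Compare raw pixel count vs. latent size for an RGB image, assuming
# an 8x-downsampling VAE with 4 latent channels (as in Stable Diffusion v1).
def compression_ratio(height, width, channels=3, downsample=8, latent_channels=4):
    pixel_values = height * width * channels
    latent_values = (height // downsample) * (width // downsample) * latent_channels
    return pixel_values, latent_values, pixel_values / latent_values

pixels, latents, ratio = compression_ratio(512, 512)
print(pixels, latents, ratio)  # 786432 16384 48.0
```

The ratio grows with image size: at 1024×1024 the same VAE yields a 128×128×4 latent, still 48× smaller, so the savings scale with resolution.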

How It Works

An encoder (typically a VAE) maps a high-dimensional input x to a lower-dimensional latent z = E(x). A decoder reconstructs the original: x̂ = D(z). Generative models like diffusion models learn to operate in this latent space — adding and removing noise from z rather than from x directly. The decoder is only needed at inference time to convert latent samples back to pixels.
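The encode → diffuse-in-latent-space → decode flow can be sketched with toy linear maps standing in for a trained VAE (a real system learns E and D as neural networks; every name and dimension below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a trained VAE: a random linear "encoder" E and its
# pseudo-inverse as the "decoder" D. Real systems learn these as networks.
input_dim, latent_dim = 768, 16
E = rng.standard_normal((latent_dim, input_dim)) / np.sqrt(input_dim)
D = np.linalg.pinv(E)  # D approximately inverts E on the latent side

x = rng.standard_normal(input_dim)      # high-dimensional input (e.g. pixels)
z = E @ x                               # encode: z = E(x)

# Diffusion-style forward step applied in latent space, not pixel space:
alpha = 0.9
z_noisy = np.sqrt(alpha) * z + np.sqrt(1 - alpha) * rng.standard_normal(latent_dim)

# A trained denoiser would recover z from z_noisy; here we just decode z.
x_hat = D @ z                           # decode only at the end: x_hat = D(z)
print(z.shape, x_hat.shape)             # (16,) (768,)
```

Note that with a random linear encoder, x̂ is only a projection of x; a trained VAE decoder produces a perceptually faithful reconstruction instead. The key structural point survives, though: the noising step touches only the 16-dimensional z, never the 768-dimensional x.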

Key Sources