What It Is

Self-consistency is a decoding strategy that samples multiple outputs for the same prompt, then selects the final answer by majority vote across those samples. Rather than trusting a single generation, it treats the model as a stochastic process and aggregates over many draws.

Why It Matters

A single chain-of-thought generation can follow an incorrect reasoning path. Sampling multiple paths and taking the most common final answer reduces the chance of committing to an idiosyncratic error, because one bad path is outvoted when most other paths agree. It requires no additional training or labeled data; the cost is extra inference compute, which grows with the number of samples.

How It Works

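Generate N outputs from the same model on the same input, using a non-zero temperature so the outputs differ. Extract the final answer from each output. Return the most common answer. For open-ended tasks, where answers rarely match exactly, “most common” can be approximated by clustering semantically similar outputs and choosing the largest cluster. The diversity comes from the stochastic nature of sampling: each pass takes a slightly different reasoning path. Correct paths tend to converge on the same answer, while incorrect paths tend to scatter across different wrong answers, so the correct answer usually collects the most votes.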
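
The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. The names `extract_answer` and `self_consistency` are hypothetical, the `sample` callable stands in for whatever model call returns one sampled completion at non-zero temperature, and the "Answer: <value>" output format is an assumption made so the final answer can be parsed with a regular expression.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(output: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent.

    Assumes each completion ends with a line like 'Answer: 14'.
    """
    matches = re.findall(r"Answer:\s*(.+)", output)
    return matches[-1].strip() if matches else None


def self_consistency(
    sample: Callable[[str], str], prompt: str, n: int = 10
) -> Optional[str]:
    """Sample n completions and return the most common final answer.

    `sample` is a hypothetical stand-in for one stochastic model call.
    """
    answers = [extract_answer(sample(prompt)) for _ in range(n)]
    # Drop completions where no final answer could be parsed.
    answers = [a for a in answers if a is not None]
    if not answers:
        return None
    # Majority vote; ties go to the answer that appeared first.
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Canned completions replace a real model so the sketch runs offline.
    canned = iter([
        "3 + 4 = 7, then 7 * 2 = 14. Answer: 14",
        "Double 3 is 6, double 4 is 8, total 14. Answer: 14",
        "3 * 2 = 6, plus 4 is 10. Answer: 10",
    ])
    print(self_consistency(lambda _prompt: next(canned), "What is (3 + 4) * 2?", n=3))
```

With the canned completions, two of the three paths agree on 14, so the single faulty path is outvoted. Exact string matching only works for short, canonical answers such as numbers or multiple-choice labels; for open-ended outputs, the vote would need to run over clusters of similar answers instead.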

Key Sources