What It Is

A neural network architecture split into two parts: an encoder that processes the input sequence into rich representations, and a decoder that generates the output sequence by attending to those representations one token at a time.
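The two-part split can be sketched as a generation loop: encode once, then call the decoder repeatedly, feeding it the encoder states plus everything generated so far. The sketch below is purely structural, not a real model; the function names (`toy_encoder`, `toy_decoder_step`) and the trivial "copy the input" decoding policy are illustrative assumptions, standing in for learned attention layers.

```python
def toy_encoder(input_tokens):
    # Stand-in for the encoder: produce one representation per input
    # token (here, just the token paired with its position).
    return [(tok, pos) for pos, tok in enumerate(input_tokens)]

def toy_decoder_step(encoder_states, generated_so_far):
    # Stand-in for one decoder step: given the encoder states and the
    # output so far, emit the next token. This toy policy simply copies
    # the input, then stops; a real decoder would attend and predict.
    idx = len(generated_so_far)
    if idx < len(encoder_states):
        return encoder_states[idx][0]
    return "<eos>"

def generate(input_tokens):
    # Encode once; decode one token at a time until end-of-sequence.
    states = toy_encoder(input_tokens)
    out = []
    while True:
        nxt = toy_decoder_step(states, out)
        if nxt == "<eos>":
            break
        out.append(nxt)
    return out
```

The point of the skeleton is the asymmetry: the encoder runs once over the whole input, while the decoder is invoked repeatedly, one output token per step.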

Why It Matters

Encoder-decoder architectures are the natural fit for sequence-to-sequence tasks — translation, summarization, question answering — where the input and output may differ in length and structure. The encoder can see the full input bidirectionally; the decoder generates autoregressively.

How It Works

The encoder runs self-attention over the full input sequence, producing a set of context-enriched token representations. The decoder runs causal self-attention over its own outputs so far, then cross-attends to the encoder’s output to decide what to generate next. This cross-attention mechanism is how the decoder “reads” the input while generating the output.
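Cross-attention itself is just scaled dot-product attention where the query comes from a decoder position and the keys and values come from the encoder output. A minimal single-query sketch in plain Python (vector sizes and the example inputs are assumptions for illustration; real implementations batch this with matrix operations and learned projections):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross_attention(query, enc_keys, enc_values):
    # One decoder position "reading" the encoder output:
    # 1. score the query against every encoder key (scaled dot product),
    # 2. softmax the scores into attention weights,
    # 3. return the weighted sum of the encoder values.
    d = len(query)
    scores = [dot(query, k) / math.sqrt(d) for k in enc_keys]
    weights = softmax(scores)
    dim = len(enc_values[0])
    return [sum(w * v[i] for w, v in zip(weights, enc_values))
            for i in range(dim)]
```

For example, a query closely aligned with the second encoder key yields weights concentrated on the second encoder value, so the decoder effectively "looks at" that input position while producing its next token.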

Key Sources