ML Wiki

Tag: transformers

11 items with this tag.

May 02, 2026
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Apr 20, 2026
GQA: Grouped-Query Attention — How Modern LLMs Got 5x Faster Without Losing Quality
Apr 13, 2026
Bidirectional Context
Apr 13, 2026
Fine-tuning
Apr 13, 2026
Masked Language Model
Apr 13, 2026
Pre-training
Apr 13, 2026
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Apr 11, 2026
Classification Token (CLS Token)
Apr 11, 2026
Patch Embeddings
Apr 11, 2026
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Apr 06, 2026
RoPE: Enhanced Transformer with Rotary Position Embedding