A memory palace for frontier ML. Every page is written so you can re-derive the idea from first principles, six months from now, without re-reading the source.
120 concepts · 98 sources · 16 entities
How to use this site. Skim Overview for scope. For a structured way in, follow a Learning Path. To track what’s still unknown, read Open Threads. To see what’s queued, read Reading Queue. The graph view (top-right) is the recall map — pages cluster by neighborhood, not by directory.
Anchor Pages — if you only remember twelve
If these twelve pages stay sharp, the rest can be reconstructed. Each links into a dense neighborhood; together they cover architecture, training, alignment, inference, scale, and reasoning.
- Attention — the routing primitive that replaced recurrence.
- Transformer — the block that scales.
- Scaling Laws — the map: loss as a power law in compute, data, params.
- Pre-training — next-token prediction at scale; what makes a base model.
- SFT — turning a base model into a follower of instructions.
- RLHF — a reward model fit to human preferences, then a policy optimized against it with PPO.
- DPO — RLHF without RL: the policy is the reward model.
- In-Context Learning — why prompting works at all.
- Chain-of-Thought — exposing reasoning to the model itself.
- KV Cache — why inference is fast and why it costs memory.
- MoE — sparse activation; parameters without proportional compute.
- Emergent Abilities — and the active debate over whether the phase change is real.
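As a quick recall aid for the KV Cache anchor, the memory cost can be rebuilt from a single product: two tensors (keys and values) per layer, each holding one vector per KV head per cached position. The sketch below is only an illustration. The function name `kv_cache_bytes` and the model dimensions are assumptions chosen for round numbers, not figures from any page on this site.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Bytes needed to cache keys and values for every layer and position."""
    # Leading factor of 2: one tensor for keys, one for values.
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical model: 32 layers, 32 KV heads of dim 128, fp16, 4096 tokens.
full = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)

# Same hypothetical model with grouped-query attention sharing 8 KV heads.
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=4096)

print(full / 2**30, gqa / 2**30)  # cache size in GiB: 2.0 0.5
```

The cost grows linearly in sequence length and batch size, which is why long contexts and large batches are the cases where the cache dominates memory. Cutting the number of KV heads, as Grouped-Query Attention does, shrinks it by the same factor.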
Recently Ingested
- LLaVA-1.5: Improved Baselines with Visual Instruction Tuning — 2026-05-09
- CodeAct: Executable Code Actions Elicit Better LLM Agents — 2026-05-09
- SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery — 2026-05-09
- t2vec: Deep Representation Learning for Trajectory Similarity Computation — 2026-05-09
- Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks — 2026-05-09
- ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction — 2026-05-09
Open Threads
Forward-looking research questions tracked across sources. Each thread accumulates evidence and updates its current best understanding as new work lands.
- Does DPO scale reliably past 70B?
- When does long context actually fail?
- Are emergent abilities real or a metric artifact?
- What’s the right alignment stack post-RLHF?
- Where does RL-on-verifiable-rewards stop generalizing?
Browse all threads.
Learning Paths
Sequences, not browsing. Each path is ordered so each step makes the next legible.
- How LLMs Are Trained: From Scratch to RLHF
- Making LLMs Fast: The Inference Efficiency Stack
- From Pixels to Understanding: Vision-Language Models
- From Prompting to Agency: Reasoning and Tool-Using LLMs
- How Diffusion Works: From DDPM to Latent Diffusion
Concepts by Theme
Architecture. Transformer · Attention · FlashAttention · Mamba · Positional Encoding · Vision Transformer · Early Fusion · Patch Embeddings · Encoder-Decoder · Cross-Attention · Sliding-Window Attention · Grouped-Query Attention · Residual Connections
Training. Pre-training · Scaling Laws · Compute-Optimal Training · Grokking · Distillation · SFT · RLHF · DPO · PPO · GRPO · Alignment · Constitutional AI · Contrastive Learning · Emergent Abilities · Phase Transition · Reward Model · LoRA · Fine-Tuning
Inference. KV Cache · Speculative Decoding · Quantization · Continuous Batching · Inference Efficiency · Memory Efficiency · Sampling · Dynamic Computation
Reasoning & Agents. In-Context Learning · Chain-of-Thought · Self-Consistency · Self-Critique · Reasoning RL · Tool Use · Instruction Following · RAG
Multimodal & Vision. Multimodal Embeddings · Multimodal Instruction Tuning · Vision-Language Models · Visual Grounding · Open-Vocab Segmentation · Promptable Segmentation · Diffusion Models · Video Generation · Zero-Shot Transfer · Latent Space · VAE
Foundations. Optimization · SGD · Momentum · Adaptive LR · Bias Correction · Batch Norm · Vanishing Gradients · Inductive Bias · Tokenization · Subword Units · Vocabulary · Temperature · Uncertainty
Browse all concepts.
Entities
Labs. OpenAI · Anthropic · Google DeepMind · Google Brain · Google Research · FAIR · Microsoft Research · DeepSeek · TII UAE · UC Berkeley Sky Lab · Stanford Hazy Research
Authors. Ashish Vaswani · Noam Shazeer · Jason Wei · Tri Dao · Albert Gu
Browse all entities.
Wiki Health
- Stub backlog: recall-audit tracks pages still on the old What It Is / Why It Matters / How It Works schema. Upgrading them to recall-first is the standing chore.
- Validate: python3 check_broken_links.py before commit.
- Refresh: python3 update_index.py regenerates stats and Recently Ingested.