Pages still on the old skeleton (## What It Is / ## Why It Matters / ## How It Works for concepts; ## Summary / ## Key Claims / ## Methods for sources) and not yet rewritten to the recall-first schema (see CLAUDE.md at the repo root).
Stub pages are usable but unmemorable — they describe the topic without forcing the reader through the Problem → Insight → Mechanism → Walkthrough → What’s Clever arc that locks ideas into recall. Upgrading them is the standing chore.
Counts as of 2026-04-28: 75 concept stubs · 15 source stubs.
Priority 1 — Anchor pages still on the old schema
These are linked from the homepage as “anchor pages” — the twelve concepts that, if remembered, let everything else be reconstructed. An anchor page on the old schema is a hole in the recall structure. Fix these first.
- scaling-laws — anchor: the map for the whole field.
- pre-training — anchor: what makes a base model.
- sft — anchor: turning a base model into an instruction-follower.
- rlhf — anchor: preferences over a reward model.
- mixture-of-experts — anchor: sparse activation at scale.
- emergent-abilities — anchor: the active debate.
- transformer — anchor: the block that scales.
That’s seven of the twelve anchors still on the old schema. Each one rewritten to recall-first is a disproportionate gain.
Priority 2 — High-traffic stubs frequently linked from sources
Concept stubs that source pages reach for repeatedly. Upgrading these compounds: every source page citing them gets sharper without the source page itself being rewritten.
- alignment
- long-context
- reward-model
- reinforcement-learning
- policy-gradient
- ppo
- grpo
- constitutional-ai
- diffusion-models
- positional-encoding
- tokenization
- distillation
- grokking
- phase-transition
- rag
- inference-efficiency
- contrastive-learning
- self-consistency
- chain-of-thought (present in tier-1 list — verify before editing)
Priority 3 — Remaining concept stubs
Less central but still on the old schema. Promote opportunistically when a relevant source is ingested.
- ai-feedback
- batch-normalization
- bidirectional-context
- classification-token
- code-generation
- compression
- compute-optimal-training
- continuous-batching
- cross-attention
- data-augmentation
- data-quality
- denoising
- dynamic-computation
- early-fusion
- emergent-behavior
- encoder-decoder
- ensemble-methods
- fine-tuning
- foundation-models
- gqa
- harmlessness
- inductive-bias
- instruction-following
- latent-space
- masked-language-model
- memory-efficiency
- multimodal-embeddings
- multimodal-instruction-tuning
- open-vocabulary-segmentation
- patch-embeddings
- power-laws
- promptable-segmentation
- reasoning-rl
- residual-connections
- sampling
- self-critique
- self-supervised-learning
- sliding-window-attention
- subword-units
- temperature-scaling
- tool-use-agents
- transfer-learning
- uncertainty-estimation
- vae
- vanishing-gradients
- video-generation
- vision-language-models
- vision-transformer
- visual-grounding
- vocabulary
- zero-shot-transfer
Source pages on the old skeleton
Source pages can stay on the lighter Summary / Key Claims / Methods skeleton if the source is supporting material, but key sources benefit from the longer recall-first source schema (with ## The core idea / ## Walkthrough / ## What's clever / ## So what). Examples to copy from: attention-is-all-you-need, direct-preference-optimization-your-language-model-is-secretly-a-reward-model (note: DPO source uses an older variant; could be tightened).
Candidates:
- training-language-models-to-follow-instructions-with-human-feedback — InstructGPT. Foundational; deserves a full recall-first treatment.
- scaling-laws-neural-language-models — Foundational.
- training-compute-optimal-large-language-models — Chinchilla. Foundational.
- language-models-are-few-shot-learners — GPT-3.
- an-image-is-worth-16x16-words — ViT.
- clip-learning-transferable-visual-models — CLIP.
- bert-pre-training-of-deep-bidirectional-transformers — BERT.
- llama-open-efficient-foundation-language-models — LLaMA 1.
- grokking-generalization-beyond-overfitting
- mixtral-of-experts
- flash-attention-2
- metis-hdpo-meta-cognitive-tool-use
- numina-counting-text-to-video
- neural-computers
How to chip away
When ingesting a source, look at which stub concept pages it cites. If the source is rich on that concept, rewrite the stub to recall-first using this source as a primary input — note the upgrade in the commit message (upgrade: <concept-slug> to recall-first).
A reasonable cadence: one anchor-page upgrade per ingest until Priority 1 is empty.