ML Wiki

Tag: efficiency

8 items with this tag.

  • Apr 27, 2026

    Long Context

    • concept
    • architecture
    • attention
    • efficiency
  • Apr 24, 2026

    Dynamic Computation

    • concept
    • architecture
    • efficiency
    • inference
  • Apr 22, 2026

    Model Compression

    • concept
    • efficiency
    • inference
    • distillation
  • Apr 17, 2026

    Making LLMs Fast — The Inference Efficiency Stack

    • learning-path
    • inference
    • efficiency
  • Apr 16, 2026

    Mixture of Experts (MoE)

    • concept
    • architecture
    • scaling
    • efficiency
    • sparse
  • Apr 16, 2026

    Switch Transformers: Scaling to Trillion Parameter Models with Sparse MoE

    • source
    • mixture-of-experts
    • moe
    • scaling
    • efficiency
    • sparse
  • Apr 10, 2026

    LLaMA: Open and Efficient Foundation Language Models

    • source
    • llama
    • open-weights
    • pretraining
    • efficiency
    • foundation-models
  • Apr 04, 2026

    Distillation (Knowledge Distillation)

    • concept
    • training
    • efficiency