ML Wiki

Tag: pretraining

5 items with this tag.

  • Apr 10, 2026

    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    • source
    • bert
    • pretraining
    • bidirectional
    • nlp
    • fine-tuning
    • masked-lm
  • Apr 10, 2026

    Language Models are Few-Shot Learners

    • source
    • gpt-3
    • scaling
    • in-context-learning
    • few-shot
    • pretraining
  • Apr 10, 2026

    LLaMA: Open and Efficient Foundation Language Models

    • source
    • llama
    • open-weights
    • pretraining
    • efficiency
    • foundation-models
  • Apr 10, 2026

    Scaling Laws for Neural Language Models

    • source
    • scaling
    • compute
    • scaling-laws
    • pretraining
    • language-models
  • Apr 10, 2026

    Training Compute-Optimal Large Language Models

    • source
    • chinchilla
    • scaling
    • compute
    • scaling-laws
    • pretraining