ML Wiki

Tag: rlhf

2 items with this tag.

  • Apr 10, 2026

    Reward Model

    Tags: concept, alignment, training, rlhf

  • Apr 10, 2026

    Training language models to follow instructions with human feedback (InstructGPT)

    Tags: source, alignment, rlhf, llm, fine-tuning, safety