ML Wiki

Tag: dpo

2 items with this tag.

  • Apr 22, 2026

    KTO: Model Alignment as Prospect Theoretic Optimization

    • source
    • alignment
    • rlhf
    • dpo
    • reward-model
    • training
  • Apr 17, 2026

    KTO: Model Alignment as Prospect Theoretic Optimization

    • source
    • alignment
    • dpo
    • rlhf
    • preference-learning