ML Wiki

Tag: reward-model

2 items with this tag.

Apr 22, 2026 - KTO: Model Alignment as Prospect Theoretic Optimization
Apr 17, 2026 - Learning to Summarize from Human Feedback