ML Wiki
Tag: safety
1 item with this tag.
Apr 10, 2026
Training language models to follow instructions with human feedback (InstructGPT)
source
alignment
rlhf
llm
fine-tuning
safety