Contention Mechanisms
Preference Modeling
The process of training a reward model on pairwise response comparisons so that it learns to score responses the way human raters prefer. It is an essential step in RLHF (reinforcement learning from human feedback).
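As a rough illustration of how a reward model can learn from a pairwise comparison, here is a minimal sketch of the Bradley-Terry style loss that is commonly used for this purpose. The entry itself does not name a specific loss, so this choice is an assumption, and the function names `preference_probability` and `pairwise_preference_loss` are hypothetical. The inputs are the scalar scores a reward model would assign to the preferred ("chosen") and the other ("rejected") response.

```python
import math


def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Modelled probability that the chosen response is preferred.

    Assumes a Bradley-Terry model: sigmoid(reward_chosen - reward_rejected).
    """
    margin = reward_chosen - reward_rejected
    # Branch on the sign so exp() never receives a large positive argument.
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    e = math.exp(margin)
    return e / (1.0 + e)


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the observed preference.

    Equals -log(sigmoid(margin)), computed as a numerically stable softplus(-margin).
    """
    margin = reward_chosen - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # The loss shrinks as the chosen response is scored further above the rejected one.
    for chosen, rejected in [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]:
        loss = pairwise_preference_loss(chosen, rejected)
        prob = preference_probability(chosen, rejected)
        print(f"chosen={chosen} rejected={rejected} p={prob:.3f} loss={loss:.3f}")
```

Minimizing this loss over many labelled pairs pushes the model to score preferred responses higher than rejected ones. In practice the scores would come from a neural network and the loss would be averaged over a batch, which this sketch leaves out.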