$f$-Divergence Regularized RLHF: Two Tales of Sampling and Unified Analyses
Abstract
arXiv:2605.06977v1 (announce type: cross). Reinforcement Learning from Human Feedback (RLHF) has become a cornerstone technique for post-training large language models. While most existing approaches rely on reverse KL regularization, recent empirical studies have begun exploring alterna…