Active teacher selection for reward learning
Abstract
arXiv:2310.15288v3 (announce type: replace)
Reward learning techniques enable machine learning systems to learn objectives from human feedback. A core limitation of these systems is their assumption that all feedback comes from a single human teacher, despite gathering feedback from large a…