Best Arm Identification in Generalized Linear Bandits via Hybrid Feedback
Abstract
arXiv:2605.05745v1. We study fixed-confidence best arm identification in generalized linear bandits under a hybrid feedback model: at each round, the learner may query either (i) absolute reward feedback from a single arm or (ii) relative (dueling) feedback from an arm p…
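
To make the two query types concrete, here is a minimal simulation sketch of a hybrid feedback environment. It is not the paper's algorithm or its exact model. The logistic link, the parameter vector `theta`, the arm features `arms`, and the choice to model a duel as a Bernoulli draw on the feature difference are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def sigmoid(z):
    """Logistic link function (assumed here; the source does not name a link)."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def absolute_feedback(theta, x, rng):
    """Query one arm: Bernoulli reward with mean sigmoid(theta . x)."""
    return 1 if rng.random() < sigmoid(dot(theta, x)) else 0


def dueling_feedback(theta, x_i, x_j, rng):
    """Query a pair: returns 1 if arm i is preferred over arm j.

    Assumed model: preference probability is sigmoid(theta . (x_i - x_j)).
    """
    diff = [a - b for a, b in zip(x_i, x_j)]
    return 1 if rng.random() < sigmoid(dot(theta, diff)) else 0


# Hypothetical instance: three arms in two dimensions.
rng = random.Random(0)
theta = [1.0, -0.5]
arms = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# With a monotone link, the best arm maximizes the linear score theta . x.
best_arm = max(range(len(arms)), key=lambda k: dot(theta, arms[k]))

reward = absolute_feedback(theta, arms[0], rng)               # query type (i)
preference = dueling_feedback(theta, arms[0], arms[1], rng)   # query type (ii)
```

In this sketch both query types carry information about the same `theta`, which is why a learner could plausibly mix them when trying to identify `best_arm`.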