Revisiting Adam for Streaming Reinforcement Learning
概要
arXiv:2605.06764v1 Announce Type: cross Abstract: Learning from a sequence of interactions, as soon as observations are perceived and acted upon, without explicitly storing them, holds the promise of simpler, more efficient and adaptive algorithms. For over a decade, however, deep reinforcement lea…