AdaGamma: State-Dependent Discounting for Temporal Adaptation in Reinforcement Learning
Abstract
arXiv:2605.06149v1 (announce type: cross)
The discount factor in reinforcement learning controls both the effective planning horizon and the strength of bootstrapping, yet most deep RL methods use a single fixed value across all states. While state-dependent discounting is conceptually appe…
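To make the two roles of the discount factor concrete, the following is a minimal sketch and not the AdaGamma method itself, which the truncated abstract does not describe. It uses the standard rule of thumb that a fixed discount gamma corresponds to an effective horizon of about 1 / (1 - gamma), and a one-step TD target in which gamma scales the bootstrapped value. The function `hypothetical_gamma`, its lookup table, and the choice to evaluate the discount at the current state are all assumptions made purely for illustration.

```python
def effective_horizon(gamma: float) -> float:
    """Rule-of-thumb planning horizon implied by a fixed discount: 1 / (1 - gamma)."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    return 1.0 / (1.0 - gamma)


def td_target(reward: float, next_value: float, gamma: float, done: bool) -> float:
    """One-step TD target; gamma weights how much the bootstrapped value counts."""
    if done:
        # No bootstrapping past a terminal transition.
        return reward
    return reward + gamma * next_value


def hypothetical_gamma(state: int) -> float:
    """Toy state-dependent discount (an assumption, not the paper's rule)."""
    table = {0: 0.99, 1: 0.5}
    return table.get(state, 0.9)


if __name__ == "__main__":
    # Fixed discount: the same gamma for every state.
    print(effective_horizon(0.99))
    print(td_target(reward=1.0, next_value=10.0, gamma=0.99, done=False))

    # State-dependent discount: state 1 bootstraps far less than state 0.
    for s in (0, 1):
        g = hypothetical_gamma(s)
        print(s, g, td_target(reward=0.0, next_value=10.0, gamma=g, done=False))
```

With a single fixed value, every state shares one horizon and one bootstrapping weight; swapping in a per-state lookup changes both at once for that state, which is the coupling the abstract points to.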