Beyond Autoregressive RTG: Conditioning via Injection Outside Sequential Modeling in Decision Transformer
Abstract
arXiv:2605.06104v1 (announce type: cross)
Decision Transformer (DT) formulates offline reinforcement learning as autoregressive sequence modeling, achieving promising results by predicting actions from a sequence of Return-to-Go (RTG), state, and action tokens. However, RTG is a scalar that…
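For readers unfamiliar with the baseline setup, the following is a minimal sketch of how RTG values and the interleaved (RTG, state, action) token order are commonly built for a standard Decision Transformer input. It assumes the usual undiscounted definition, where the RTG at step t is the sum of rewards from t to the end of the trajectory. The function names `returns_to_go` and `interleave_tokens` are hypothetical and chosen for illustration; this is not the method proposed in the paper.

```python
def returns_to_go(rewards):
    """Undiscounted return-to-go: rtg[t] = rewards[t] + rewards[t+1] + ..."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step adds its own reward to the future sum.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def interleave_tokens(rtg, states, actions):
    """Arrange one trajectory as (R_0, s_0, a_0, R_1, s_1, a_1, ...)."""
    if not (len(rtg) == len(states) == len(actions)):
        raise ValueError("rtg, states, and actions must have equal length")
    sequence = []
    for r, s, a in zip(rtg, states, actions):
        sequence.append(("rtg", r))
        sequence.append(("state", s))
        sequence.append(("action", a))
    return sequence


if __name__ == "__main__":
    rewards = [1.0, 2.0, 3.0]
    rtg = returns_to_go(rewards)
    tokens = interleave_tokens(rtg, ["s0", "s1", "s2"], ["a0", "a1", "a2"])
    print(rtg)
    print(tokens[:3])
```

In this layout the scalar RTG occupies one token slot per timestep, alongside the state and action tokens, which is the arrangement the abstract refers to as autoregressive RTG.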