Learning Visual Feature-Based World Models via Residual Latent Action
Abstract
arXiv:2605.07079v1 (cross-listed). World models predict future transitions from observations and actions. Existing work focuses predominantly on image generation. Visual feature-based world models, by contrast, predict future visual features instead of raw video pixels, off…