arXiv cs.AI, by the Synapse Flow editorial team

Decoupled Guidance Diffusion for Adaptive Offline Safe Reinforcement Learning

Summary

arXiv:2605.02777v2 (announce type: replace-cross). Offline safe reinforcement learning often requires policies to adapt at deployment time to safety budgets that vary across episodes or change within a single episode. While diffusion-based planners enable flexible trajectory generation, exis…
