arXiv cs.AI by Synapse Flow Editorial Team

MICA: Multi-granularity Intertemporal Credit Assignment for Long-Horizon Emotional Support Dialogue

Summary

arXiv:2603.06194v2 Announce Type: replace-cross Abstract: Reinforcement learning (RL) for large language models (LLMs) has shown strong performance in single-turn tasks, but extending it to multi-turn interaction remains challenging due to sparse rewards and poor per-turn credit assignment. In emot…
