AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning
Abstract
arXiv:2605.00425v2

Reinforcement learning (RL) has substantially improved the ability of large language model (LLM) agents to interact with environments and solve multi-turn tasks. However, effective agentic RL remains challenging: sparse outcome-only rewards provid…