Experience Sharing in Mutual Reinforcement Learning for Heterogeneous Language Models
Abstract
arXiv:2605.07244v1 (announce type: cross)
We introduce Mutual Reinforcement Learning, a framework for concurrent RL post-training in which heterogeneous LLM policies exchange typed experience while keeping separate parameters, objectives, and tokenizers. The framework combines a Shared Expe…