arXiv cs.AI, by the Synapse Flow editorial team

EvoLM: Self-Evolving Language Models through Co-Evolved Discriminative Rubrics

Summary

arXiv:2605.03871v1 (announce type: new). Language models encode substantial evaluative knowledge from pretraining, yet current post-training methods rely on external supervision (human annotations, proprietary models, or scalar reward models) to produce reward signals. Each imposes a ceiling…
