UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
Abstract
arXiv:2605.06597v1 Announce Type: cross Abstract: Self-distillation (SD) offers a promising path for adapting large language models (LLMs) without relying on stronger external teachers. However, SD in autoregressive LLMs remains challenging because self-generated trajectories are free-form, correct…