arXiv cs.AI by Synapse Flow Editorial Team

Finite-Size Gradient Transport in Large Language Model Pretraining: From Cascade Size to Intensive Transport Efficiency

Abstract

arXiv:2605.02968v1 (cross-listed). We introduce a finite-size gradient-transport framework for real language-model training, based on five observables $(D, z, \beta, \delta, v_{\mathrm{rel}})$ that separate cascade size, duration, absolute transport, and intensive transport efficiency. W…
