UniPool: A Globally Shared Expert Pool for Mixture-of-Experts
Abstract
arXiv:2605.06665v1

Modern Mixture-of-Experts (MoE) architectures allocate expert capacity through a rigid per-layer rule: each transformer layer owns a separate expert set. This convention couples depth scaling with linear expert-parameter growth and assumes that ever…
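The abstract is truncated here, but the title conveys the core contrast: instead of each layer owning its own expert set, all layers route into one globally shared pool, so adding depth no longer multiplies the expert parameter count. Below is a minimal PyTorch sketch of that idea under stated assumptions; the class names, top-k routing scheme, and per-layer routers are illustrative guesses, not the paper's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Expert(nn.Module):
    """A standard two-layer feed-forward expert."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x):
        return self.net(x)


class SharedPoolMoELayer(nn.Module):
    """One layer's MoE block routing into an expert pool owned
    *outside* the layer (hypothetical sketch). Each layer keeps its
    own router, so routing stays layer-specific while the expert
    parameters are shared across depth."""
    def __init__(self, d_model: int, pool: nn.ModuleList, top_k: int = 2):
        super().__init__()
        self.pool = pool                       # shared pool, not per-layer
        self.router = nn.Linear(d_model, len(pool))
        self.top_k = top_k

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.router(x)                # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the top-k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.pool)):
                mask = idx[:, k] == e          # tokens whose k-th pick is e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * self.pool[e](x[mask])
        return out


# One global pool reused by every layer: the expert parameter count is
# fixed by the pool size, independent of network depth.
d_model, d_hidden, n_experts, n_layers = 64, 256, 8, 4
pool = nn.ModuleList(Expert(d_model, d_hidden) for _ in range(n_experts))
layers = [SharedPoolMoELayer(d_model, pool) for _ in range(n_layers)]

x = torch.randn(10, d_model)
for layer in layers:
    x = x + layer(x)                           # residual around each MoE block
```

In the conventional per-layer design, the pool would be constructed inside each layer, giving `n_layers * n_experts` expert MLPs; here the same `n_experts` MLPs serve every depth, which is the decoupling the abstract points to.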