Leviathan: Decoupling Input and Output Representations in Language Models
Abstract
arXiv:2601.22040v2. Modern language models use a single matrix for both input embedding and output projection. This couples two distinct objectives: token representation and discrimination over a vocabulary. This work introduces Leviathan, a Transformer architecture…
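
The coupling the abstract describes is commonly called weight tying. As a minimal sketch of that general idea only (the abstract is truncated before it says how Leviathan decouples the two roles, so nothing here represents the paper's method), the toy code below contrasts a tied setup, where one matrix serves as both the input lookup table and the output projection, with a generic untied baseline that uses two separate matrices. All names and sizes (`VOCAB`, `DIM`, `make_matrix`, `embed`, `logits`) are hypothetical and chosen for illustration.

```python
import random

def make_matrix(rows, cols, seed):
    """Build a small random matrix as a list of rows (toy initialisation)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]

def embed(table, token_id):
    """Input role: look up the vector (row) for a token id."""
    return table[token_id]

def logits(hidden, out_matrix):
    """Output role: score every vocabulary entry with a dot product."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in out_matrix]

VOCAB, DIM = 8, 4  # hypothetical toy sizes

# Tied: the same matrix object is used for lookup and for scoring,
# so any update for one objective also changes the other.
shared = make_matrix(VOCAB, DIM, seed=0)
tied_in, tied_out = shared, shared

# Untied (generic baseline, NOT the Leviathan design): two independent matrices.
untied_in = make_matrix(VOCAB, DIM, seed=1)
untied_out = make_matrix(VOCAB, DIM, seed=2)

# Feed the embedding straight to the output layer (no Transformer in between).
tied_logits = logits(embed(tied_in, 3), tied_out)
untied_logits = logits(embed(untied_in, 3), untied_out)

# Vocabulary-related parameter counts for each setup.
tied_params = VOCAB * DIM
untied_params = 2 * VOCAB * DIM
```

In the tied case a gradient step that improves token discrimination at the output also moves the input representations, which is the coupling the abstract points to; the untied baseline removes that at the cost of doubling the vocabulary parameters.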