Caracal: Causal Architecture via Spectral Mixing
Abstract
arXiv:2605.00292v2. The scalability of Large Language Models to long sequences is hindered by the quadratic cost of attention and the limitations of positional encodings. To address these issues, we introduce Caracal, a novel architecture that replaces attention with …