arXiv cs.AI, by the Synapse Flow editorial team

Cascade Token Selection for Transformer Attention Acceleration

Summary

arXiv:2605.03110v1. Announce Type: cross.

Abstract: A method is presented for reducing the cost of representative token selection in transformer attention layers by exploiting the coherence of the representative set across depth. Activation Decorrelation Attention (ADA) selects $r \ll T$ representati…

