Learning Discrete Autoregressive Priors with Wasserstein Gradient Flow
概要
arXiv:2605.06148v1 Announce Type: cross Abstract: Discrete image tokenizers are commonly trained in two stages: first for reconstruction, and then with a prior model fitted to the frozen token sequences. This decoupling leaves the tokenizer unaware of the model that will later generate its tokens. …