Visual Text Compression as Measure Transport
Abstract
arXiv:2605.06708v1 (announce type: cross)

Visual text compression (VTC) promises efficient long-context processing by rendering text into an image and re-encoding it with a vision-language model, often producing $3$--$20\times$ fewer decoder tokens than subword tokenization. Yet token savin…