Skip-It? Theoretical Conditions for Layer Skipping in Vision-Language Models
Abstract
arXiv:2509.25584v2
Vision-language models achieve incredible performance across a wide range of tasks, but their large size makes inference costly. Recent work has shown that multimodal processing contains significant redundancies, making it possible to skip certain…