Understanding Performance Collapse in Layer-Pruned Large Language Models via Decision Representation Transitions
Abstract
arXiv:2605.07271v1 Announce Type: cross

Layer pruning efficiently reduces Large Language Model (LLM) computational costs but often triggers sudden performance collapse. Existing representation-based analyses struggle to explain this mechanism. We propose studying pruning through decision …