Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models
Abstract
arXiv:2602.07026v2
Despite the success of multimodal contrastive learning in aligning visual and linguistic representations, a persistent geometric anomaly, the Modality Gap, remains: embeddings of distinct modalities expressing identical semantics occupy syst…