Attractor Geometry of Transformer Memory: From Conflict Arbitration to Confident Hallucination
Abstract
arXiv:2605.05686v1. Language models draw on two knowledge sources: facts baked into weights (parametric memory, PM) and information in context (working memory, WM). We study two mechanistically distinct failure modes: conflict, when PM and WM disagree and interfere; and …