Grounding Multi-Hop Reasoning in Structural Causal Models via Group Relative Policy Optimization
Abstract
arXiv:2605.01482v2
Multi-Hop Fact Verification (MHFV) requires complex reasoning across disparate pieces of evidence, a significant challenge for Large Language Models (LLMs), which often suffer from hallucinations and fractured logical chains. Existing methods, whil…