Motion-o: Trajectory-Grounded Video Reasoning
Summary
arXiv:2603.18856v2. Abstract: Recent video reasoning models increasingly produce spatio-temporal evidence chains that localize objects at specific timestamps. While these traces improve interpretability by grounding where and when evidence appears, they oft…