Response-G1: Explicit Scene Graph Modeling for Proactive Streaming Video Understanding
Summary
arXiv:2605.07575v1, announce type: cross. Abstract: Proactive streaming video understanding requires Video-LLMs to decide when to respond as a video unfolds, a task where existing methods often fall short due to their implicit, query-agnostic modeling of visual evidence. We introduce Response-G1, a n…