Learn where to Click from Yourself: On-Policy Self-Distillation for GUI Grounding
Abstract
arXiv:2605.00642v2 (Announce Type: replace)
Graphical User Interface (GUI) grounding maps natural language instructions to the visual coordinates of target elements and serves as a core capability for autonomous GUI agents. Recent reinforcement learning methods (e.g., GRPO) have achieved st…