MIST: Multimodal Interactive Speech-based Tool-calling Conversational Assistants for Smart Homes
Abstract
arXiv:2605.06897v1 (announce type: cross)
The rise of Internet of Things (IoT) devices in the physical world necessitates voice-based interfaces capable of handling complex user experiences. While modern Large Language Models (LLMs) already demonstrate strong tool-usage capabilities, modeli…