arXiv cs.AI by Synapse Flow Editorial Team

TCM-Serve: Modality-aware Scheduling for Multimodal Large Language Model Inference

Overview

arXiv:2603.26498v2 (announce type: replace-cross)

Multimodal Large Language Models (MLLMs) power platforms like ChatGPT, Gemini, and Copilot, enabling richer interactions with text, images, and videos. These heterogeneous workloads introduce additional inference stages, such as vision prepr…
