arXiv cs.AI by the Synapse Flow editorial team

BitCal-TTS: Bit-Calibrated Test-Time Scaling for Quantized Reasoning Models

Abstract

arXiv:2605.05561v1 (announce type: new)

Post-training quantization makes large reasoning models practical under tight memory and latency budgets, but it can distort the online signals that drive adaptive test-time compute allocation. Under a fixed cap on the number of newly generated tokens…

