Measuring Black-Box Confidence via Reasoning Trajectories: Geometry, Coverage, and Verbalization
Abstract
arXiv:2605.06308v1. Reliable confidence estimation enables safe deployment of chain-of-thought (CoT) reasoning through text-only APIs. Yet the dominant black-box baseline, self-consistency over K samples, has a cost that grows linearly in K and ignores the geometry of the reasoning trace. We pr…
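
For orientation, the self-consistency baseline named above can be sketched as follows. This is a minimal illustration of the general technique, not the paper's code: the function name `self_consistency_confidence` and the simple lowercase/strip normalization are assumptions made here. It takes K final answers that were already sampled (each one a separate API call, which is where the linear cost comes from) and uses the agreement rate of the majority answer as the confidence score.

```python
from collections import Counter
from typing import Sequence, Tuple


def self_consistency_confidence(answers: Sequence[str]) -> Tuple[str, float]:
    """Return the majority answer and the fraction of samples agreeing with it.

    `answers` holds the final answers extracted from K independently sampled
    reasoning traces. Only the final answers are compared; the traces
    themselves are not inspected.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Hypothetical normalization: trim whitespace and ignore case.
    counts = Counter(a.strip().lower() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


if __name__ == "__main__":
    # Four sampled answers, three of which agree.
    print(self_consistency_confidence(["42", "42", "41", "42"]))
```

Because the score depends only on the final answers, two runs with very different reasoning paths but the same answer count as full agreement.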