Adaptive auditing of AI systems with anytime-valid guarantees
概要
arXiv:2605.07002v1 Announce Type: new Abstract: A major bottleneck in characterizing the failure modes of generative AI systems is the cost and time of annotation and evaluation. Consequently, adaptive testing paradigms have gained popularity, where one opportunistically decides which cases and how…