Beyond Retrieval: A Multitask Benchmark and Model for Code Search
Abstract
arXiv:2605.04615v2
Code search has usually been evaluated as first-stage retrieval, even though production systems rely on broader pipelines with reranking and developer-style queries. Existing benchmarks also suffer from data contamination, label noise, and d…