BioAgent Bench: An AI Agent Evaluation Suite for Bioinformatics
Summary
arXiv:2601.21800v3
This paper introduces BioAgent Bench, a benchmark dataset and evaluation suite designed to measure the performance and robustness of AI agents on common bioinformatics tasks. The benchmark contains curated end-to-end tasks (e.g., RNA-seq, va…