arXiv cs.AI by Synapse Flow Editorial Team

BioAgent Bench: An AI Agent Evaluation Suite for Bioinformatics

Summary

arXiv:2601.21800v3. This paper introduces BioAgent Bench, a benchmark dataset and an evaluation suite designed for measuring the performance and robustness of AI agents in common bioinformatics tasks. The benchmark contains curated end-to-end tasks (e.g., RNA-seq, va…
