arXiv cs.AI by the Synapse Flow Editorial Team

Vibe Code Bench: Evaluating AI Models on End-to-End Web Application Development

Abstract

arXiv:2603.04601v2. Code generation has emerged as one of AI's highest-impact use cases, yet existing benchmarks measure isolated tasks rather than the complete "zero-to-one" process of building a working application from scratch. We introduce Vibe Code Bench, …
