Emory University - Organization

New Benchmark Reveals Top AI Models Still Struggle With Visual Math Reasoning

Recent tests show that GPT-4 Vision achieved only 49.9% accuracy on multimodal math problems, falling short of human performance. Researchers from Microsoft and Sahara AI argue that current models lack the reasoning capabilities required for artificial general intelligence (AGI). These results highlight significant hurdles on the path toward AGI despite billions in investment.

La Era, March 18, 2026