AI Surpasses Humans in Key Tasks: The Rising Benchmarking Challenges in 2025

by cnr_staff August 1, 2025

Artificial Intelligence (AI) is no longer a futuristic concept—it’s here, and it’s outperforming humans in key tasks. From hyper-realistic image generation to autonomous vehicles, AI’s rapid advancements are reshaping industries and raising critical benchmarking challenges. How do we measure AI’s capabilities when it’s evolving faster than our benchmarks?

AI Surpasses Humans: The Benchmarking Dilemma

According to Russell Wald of Stanford’s Institute for Human-Centered Artificial Intelligence (HAI), AI had outperformed humans in most task categories as of 2024. Where humans still lead, the gap is narrowing swiftly, which makes AI models harder to evaluate. For example, Midjourney’s text-to-image generator evolved from cartoonish renderings in 2022 to hyper-realistic portraits by 2024.

AI Advancements: From Medical Devices to Autonomous Rides

223 AI-enabled medical devices were approved by the U.S. FDA in 2023, up from just six in 2015.
Waymo provides over 150,000 autonomous rides weekly in San Francisco.
Baidu’s Apollo Go robotaxi is expanding across multiple Chinese cities.

AI Industry Dominance: The Cost of Innovation

Industry players now dominate AI model development, with 90% of notable models in 2024 coming from corporations. Training costs have climbed steeply: Google spent an estimated $930 to train the original Transformer model in 2017, while Gemini Ultra cost about $200 million in 2024. Even so, open-weight models are catching up, narrowing the performance gap with closed models from 8% to 1.7% in benchmark tests.

Global AI Trends: China vs. The U.S.

China is closing the gap with the U.S. in AI performance, thanks to its focus on open-source environments and talent investment. Public opinion on AI is also more favorable in non-Western nations, with 83% of respondents in China viewing AI favorably, compared with 39% in the U.S.

The Future of AI: Challenges and Opportunities

While AI’s progress is exciting, the lack of standardized benchmarks for safety and responsibility remains a critical issue. Stanford HAI’s AI Index continues to monitor these developments, offering insights into AI’s global expansion.

Frequently Asked Questions (FAQs)

1. How is AI surpassing humans in key tasks?

AI has outperformed humans in most task categories, such as image generation, medical diagnostics, and autonomous driving, due to rapid advancements in machine learning and hardware efficiency.

2. What are the benchmarking challenges with AI?

As AI evolves faster than the benchmarks designed to measure it, researchers struggle to define and measure its capabilities accurately, especially in applications such as hyper-realistic text-to-image generation.

3. Which countries are leading in AI development?

The U.S. currently leads, but China is closing the gap due to its focus on open-source environments and talent investment.

4. How has AI adoption grown in businesses?

78% of organizations now use AI in at least one business function, up from 55% in 2023, driven by falling inference costs and improved energy efficiency.

5. What are the costs of training advanced AI models?

Training costs have skyrocketed, from an estimated $930 for Google’s original Transformer model in 2017 to about $200 million for Gemini Ultra in 2024, putting the training of frontier models out of reach for academia and smaller players.