Stanford report: AI surpasses humans on several fronts, but costs are soaring

Artificial intelligence made major strides in 2023 across technical benchmarks, research output, and commercial investment, according to a new report from Stanford University's Institute for Human-Centered AI. However, the technology still faces key limitations and growing concerns about its risks and societal impact.

The AI Index 2024 annual report, a comprehensive look at global AI progress, finds that AI systems exceeded human performance on additional benchmarks in areas like image classification, visual reasoning, and English understanding. However, they continue to trail humans on more complex tasks like advanced mathematics, commonsense reasoning, and planning.

“AI has surpassed human performance on several benchmarks,” the report says. “Yet it trails behind on more complex tasks like competition-level mathematics, visual commonsense reasoning, and planning.”

Surge in AI research and escalating costs

The report details an explosion of new AI research and development in 2023, with industry players leading the charge. Private companies produced 51 notable machine learning (ML) models last year, compared to only 15 from academia. Collaborations between industry and academia yielded an additional 21 high-profile models.

Costs to train cutting-edge AI systems skyrocketed, with OpenAI’s GPT-4 language model using an estimated $78 million worth of computing power. Google’s even larger Gemini Ultra model cost a staggering $191 million to train, according to estimates in the report.

“Frontier models get way more expensive,” the authors explain. “According to AI Index estimates, the training costs of state-of-the-art AI models have reached unprecedented levels.”

The number of notable machine learning models produced by industry far outpaced those from academia in 2023, according to data from Epoch analyzed in Stanford University’s AI Index 2024 report. Private sector labs released 51 state-of-the-art models last year, more than triple the 15 that originated from academic institutions. Credit: Stanford HAI

Geographic dominance in AI production

The United States dominated other countries in producing leading AI models, with 61 notable systems originating from U.S. institutions in 2023. China and the European Union trailed with 15 and 21 respectively. 

Investment trends painted a mixed picture. While overall private AI investment declined for a second year, funding for “generative AI” — systems that can produce text, images and other media — nearly octupled to $25.2 billion. Companies like OpenAI, Anthropic and Stability AI closed massive funding rounds.

“Generative AI investment skyrockets,” the report notes. “Despite a decline in overall AI private investment last year, funding for generative AI surged, nearly octupling from 2022 to reach $25.2 billion.”

The United States continues to dominate the global landscape in artificial intelligence development, according to an analysis by Epoch featured in Stanford University’s 2024 AI Index report. In 2023, American institutions produced 61 state-of-the-art machine learning models, dwarfing China’s output of 15 models and far outpacing other nations such as France, Germany, Canada and the United Kingdom. Credit: Stanford HAI

Need for standardized AI testing

As AI rapidly advances, the report finds a troubling lack of standardized testing of systems for responsibility, safety and security. Leading developers like OpenAI and Google primarily evaluate their models on different benchmarks, making comparisons difficult.

“Robust and standardized evaluations for LLM [large language model] responsibility are seriously lacking,” according to the AI Index analysis. “This practice complicates efforts to systematically compare the risks and limitations of top AI models.”

Emerging risks and public concern

The authors point to emerging risks, including the spread of political deepfakes which are “easy to generate and difficult to detect.” They also highlight new research revealing complex vulnerabilities in how language models can be manipulated to produce harmful outputs.

Public opinion data in the report shows growing anxiety about AI. The share of people globally who think AI will “dramatically” affect their lives in the next three to five years rose from 60% to 66%. More than half now express nervousness about AI products and services.

Americans are growing more apprehensive about the expanding role of artificial intelligence in their daily lives, according to survey data from Pew Research featured in Stanford University’s 2024 AI Index report. The share of Americans who say they are more concerned than excited about the increasing use of AI has jumped sharply from 37% in 2021 to 52% in 2023, while the proportion who are more excited than concerned has fallen to 36%. Credit: Stanford HAI

“People across the globe are more cognizant of AI’s potential impact — and more nervous,” the report states. In America, Pew data cited in the report suggests that 52% of Americans feel more concerned than excited about AI, up from 37% in 2021.

As AI becomes more powerful and pervasive, the AI Index aims to provide an objective look at the state of the technology to inform policymakers, business leaders, and the general public. With AI at an inflection point, rigorous data will be crucial to navigate the opportunities and challenges ahead.