AI continues to improve – at least according to benchmarks. But the promised benefits have largely yet to materialize while models are increasing in size and becoming more computationally demanding, and greenhouse gas emissions from AI training continue to rise.

These are some of the takeaways from the AI Index Report 2025 [PDF], a lengthy and in-depth publication from Stanford University’s Institute for Human-Centered AI (HAI) that covers development, investment, adoption, governance and even global attitudes towards artificial intelligence, giving a snapshot of the current state of play.

In terms of performance, the researchers state that AI models are increasingly mastering new and challenging benchmarks designed to test their capabilities, including MMMU, GPQA, and SWE-bench. With the latter, which measures success in solving actual coding problems from GitHub, AI systems managed just 4.4 percent in 2023, but this jumped to 71.7 percent last year.

According to HAI, last year’s AI Index highlighted that many models had already surpassed human performance on a range of tasks, with only a few exceptions, such as competition-level mathematics and visual commonsense reasoning. This trend largely continued over the past year, with models closing performance gaps and matching or exceeding humans on even more demanding benchmarks.

If you think that sounds depressing, the report also stresses that complex reasoning is still out of reach for AI models. Even with mechanisms such as chain-of-thought reasoning to boost their performance, large language models (LLMs) cannot reliably solve problems for which a solution can be found using logical reasoning, which still makes them unsuitable for many applications.

However, HAI highlights the enormous level of investment still being pumped into the sector, with global corporate AI investment reaching $252.3 billion in 2024, up 26 percent for the year. Most of this is in the US, which hit $109.1 billion, nearly 12 times higher than China’s $9.3 billion and 24 times the UK’s $4.5 billion, it says.
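For readers who want to check the multiples, the arithmetic can be sketched in a few lines of Python. The dollar figures are the ones quoted above; the implied 2023 total is our own back-calculation, assuming the 26 percent rise is year-on-year growth in the same measure, and is not a number taken from the report.

```python
# Figures quoted in the article, in billions of US dollars.
US_2024 = 109.1
CHINA_2024 = 9.3
UK_2024 = 4.5
GLOBAL_2024 = 252.3
GLOBAL_GROWTH = 0.26  # "up 26 percent for the year"


def multiple(a: float, b: float) -> float:
    """How many times larger a is than b."""
    return a / b


us_vs_china = multiple(US_2024, CHINA_2024)
us_vs_uk = multiple(US_2024, UK_2024)

# Assumption: the 26 percent is growth over the prior year's total,
# so the prior-year figure is the 2024 total divided by 1.26.
implied_global_2023 = GLOBAL_2024 / (1 + GLOBAL_GROWTH)

print(f"US vs China: {us_vs_china:.1f}x")
print(f"US vs UK: {us_vs_uk:.1f}x")
print(f"Implied 2023 global total: ${implied_global_2023:.1f}bn")
```

That works out to roughly 11.7 times China's figure and 24.2 times the UK's, in line with the "nearly 12 times" and "24 times" descriptions.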

Despite all this investment, “most companies that report financial impacts from using AI within a business function estimate the benefits as being at low levels,” the report states.

It says that 49 percent of organizations using AI in service operations reported cost savings, followed by supply chain management (43 percent) and software engineering (41 percent), but in most cases, the cost savings are less than 10 percent.

When it comes to revenue, 71 percent of respondents using AI in marketing and sales reported gains, followed by 63 percent in supply chain management and 57 percent in service operations, but the most common level of revenue increase is less than 5 percent.

The report claims, “AI is beginning to deliver financial impact across business functions, but most companies are early in their journeys” – an excuse we’ve been hearing for some time now.

Meanwhile, despite the modest returns, the HAI report warns that the amount of compute used to train top-notch AI models is doubling approximately every five months, the size of datasets required for LLM training is doubling every eight months, and the energy consumed for training is doubling annually.
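Doubling times are easier to compare once converted to a yearly multiple. The sketch below is our own illustration rather than anything from the report: it assumes smooth exponential growth, and the function name `annual_growth_factor` is made up for the example.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Growth multiple over 12 months for a quantity that doubles
    every `doubling_months` months, assuming steady exponential growth."""
    return 2 ** (12 / doubling_months)


# Doubling times as quoted in the article, in months.
doubling_times = {
    "training compute": 5,
    "dataset size": 8,
    "training energy": 12,
}

for name, months in doubling_times.items():
    print(f"{name}: about {annual_growth_factor(months):.1f}x per year")
```

On those assumptions, compute grows by roughly 5.3 times a year and dataset size by about 2.8 times, while energy use doubles.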

This is leading to rapidly increasing greenhouse gas emissions from AI training, the report finds. It says that early AI models such as AlexNet, released over a decade ago, caused only modest CO₂ emissions of 0.01 tons, while GPT-4 (2023) was responsible for emitting 5,184 tons, and Llama 3.1 405B (2024) pumped out 8,930 tons. This compares with about 18 tons of carbon a year the average American
