S&P Global launches groundbreaking AI benchmark for financial industry

S&P Global, a leading provider of financial intelligence, quietly announced on Wednesday the launch of S&P AI Benchmarks by Kensho. This innovative solution aims to set a new standard for evaluating the performance of large language models (LLMs) in complex financial and quantitative applications.

Developed by S&P Global’s AI-focused division, Kensho, the benchmarking tool assesses an LLM’s ability to handle tasks such as reasoning quantitatively, extracting data from financial documents and demonstrating domain-specific knowledge. The results are then displayed on a leaderboard, providing a transparent view of each model’s capabilities.

S&P AI Benchmarks by Kensho ranks the performance of large language models (LLMs) across key financial and quantitative metrics, including domain knowledge, quantity extraction, and program synthesis. (Source: benchmarks.kensho.com)

“S&P AI Benchmarks combines Kensho’s cutting-edge AI research and engineering with S&P Global’s leading financial intelligence capabilities,” said Bhavesh Dayalji, Chief AI Officer for S&P Global and CEO of Kensho, in an interview with VentureBeat. “Our hope is that the solution becomes the industry standard for understanding how LLMs perform on complex financial reasoning and that it encourages broader innovation in the FinAI space.”

The launch of S&P AI Benchmarks comes at a pivotal moment for the financial services industry, as more institutions explore the potential of generative AI and LLMs to streamline operations and gain a competitive edge. However, the lack of standardized benchmarks has made it challenging for organizations to assess the suitability of different models for their specific use cases.

Fueling innovation and informed decision-making

“Benchmark solutions like ours are critical to helping institutions and professionals across our industry determine which LLMs they should be using for their particular use cases,” Dayalji explained. “And we believe that S&P AI Benchmarks is also going to fuel innovation by helping financial professionals identify where each model is performing well and how it can add the most value.”

The S&P AI Benchmarks methodology was developed and validated by a diverse team of experts, including engineers, researchers, academics and financial professionals from across S&P Global’s divisions. The evaluation set consists of 600 questions, designed to rigorously test an LLM’s performance across three key categories: domain knowledge, quantity extraction and program synthesis.
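
For readers curious how a leaderboard of this kind might be computed, the following is a minimal Python sketch, not Kensho's actual scoring code. It assumes each question is graded simply as correct or incorrect and that the overall score is an unweighted mean of the per-category accuracies; neither assumption is confirmed by S&P Global. The category names come from the benchmark's published description, while the function names, model names and toy results are hypothetical.

```python
from collections import defaultdict

# Category names are taken from the article; everything else is hypothetical.
CATEGORIES = ("domain knowledge", "quantity extraction", "program synthesis")


def category_accuracy(results):
    """Return per-category accuracy from (category, is_correct) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, is_correct in results:
        total[category] += 1
        correct[category] += int(is_correct)
    # Skip categories with no graded questions to avoid dividing by zero.
    return {c: correct[c] / total[c] for c in CATEGORIES if total[c]}


def overall_score(per_category):
    """Unweighted mean of category accuracies (an assumed aggregation rule)."""
    return sum(per_category.values()) / len(per_category)


def leaderboard(model_results):
    """Rank models by overall score, highest first."""
    rows = []
    for model, results in model_results.items():
        per_category = category_accuracy(results)
        rows.append((model, overall_score(per_category), per_category))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Toy graded answers for two made-up models (not real benchmark data).
toy_results = {
    "model-a": [("domain knowledge", True), ("domain knowledge", False),
                ("quantity extraction", True), ("program synthesis", True)],
    "model-b": [("domain knowledge", True), ("domain knowledge", True),
                ("quantity extraction", False), ("program synthesis", False)],
}

for model, score, per_category in leaderboard(toy_results):
    print(f"{model}: overall={score:.2f} {per_category}")
```

Averaging by category rather than by question keeps a category with many questions from dominating the ranking; a real benchmark could just as reasonably weight the categories differently.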

A milestone for AI adoption in finance

Industry analysts believe that the launch of S&P AI Benchmarks could mark a significant milestone in the adoption of AI within the financial sector. As more advanced AI permeates the financial industry, having a reliable and transparent benchmarking tool will be essential for firms looking to make informed decisions about which models to deploy. S&P Global’s solution could help accelerate the responsible adoption of LLMs and drive innovation in the FinAI space.

Looking ahead, S&P Global envisions S&P AI Benchmarks playing a crucial role in shaping the future of AI in financial services. “Our vision is to see LLMs become more effective and better adapted to the needs of the industries we operate in across the board, and solutions like ours will help us get there,” Dayalji said. “We encourage all model providers to participate so that we can continue to evolve our framework.”

As the financial industry navigates the rapidly evolving landscape of AI, and of generative AI in particular, tools like S&P AI Benchmarks by Kensho are poised to become essential guides, helping organizations harness the power of these technologies while ensuring accuracy, transparency, and responsible deployment.