The future of financial analysis: How GPT-4 is disrupting the industry, according to new research

Researchers from the University of Chicago have demonstrated that large language models (LLMs) can conduct financial statement analysis with accuracy rivaling and even surpassing that of professional analysts. The findings, published in a working paper titled “Financial Statement Analysis with Large Language Models,” could have major implications for the future of financial analysis and decision-making.

The researchers tested the performance of GPT-4, a state-of-the-art LLM developed by OpenAI, on the task of analyzing corporate financial statements to predict future earnings growth. Remarkably, even when provided only with standardized, anonymized balance sheets and income statements devoid of any textual context, GPT-4 was able to outperform human analysts.

“We find that the prediction accuracy of the LLM is on par with the performance of a narrowly trained state-of-the-art ML model,” the authors write. “LLM prediction does not stem from its training memory. Instead, we find that the LLM generates useful narrative insights about a company’s future performance.”

A study by researchers at the University of Chicago found that OpenAI’s GPT-4 model outperformed human analysts in predicting corporate earnings, achieving an accuracy score of 0.604 and an F1 score of 0.609. The researchers used a novel approach of providing structured financial data and “chain-of-thought” prompts to guide the AI’s reasoning. (Source: University of Chicago)
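For readers unfamiliar with the two metrics, the sketch below shows how accuracy and an F1 score are conventionally computed for a binary "earnings up or down" prediction. It is a minimal illustration, not the study's evaluation code: the function name, the toy labels, and the choice to treat "increase" as the positive class are all assumptions.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels: 1 = earnings increase, 0 = decrease.

    Assumption: F1 is computed on the "increase" class; the paper may use
    a different convention.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted increases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly predicted decreases
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted increase, was decrease
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted decrease, was increase

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, f1


# Toy data, invented purely for illustration.
actual = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
acc, f1 = accuracy_and_f1(actual, predicted)
print(f"accuracy={acc:.3f} f1={f1:.3f}")  # accuracy=0.600 f1=0.667
```

Accuracy counts every correct call equally, while F1 balances precision and recall on one class, which is why the two figures can differ slightly, as they do in the reported 0.604 and 0.609.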

Chain-of-thought prompts emulate human analyst reasoning

A key innovation was the use of “chain-of-thought” prompts that guided GPT-4 to emulate the analytical process of a financial analyst: identifying trends, computing ratios, and synthesizing the information to form a prediction. With these prompts, GPT-4 achieved 60% accuracy in predicting the direction of future earnings, notably higher than the 53% to 57% range of human analyst forecasts.
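To make the idea concrete, here is a rough sketch of how such a prompt might be assembled from anonymized statement data. Everything in it is hypothetical: the step wording paraphrases this article rather than the researchers' actual prompt, the `t` and `t-1` period labels are an assumed way of hiding fiscal years, and the figures are invented. It only builds the prompt text and does not call any model.

```python
from typing import Mapping, Tuple

# Hypothetical step wording that paraphrases the article's description;
# this is not the prompt used in the study.
ANALYST_STEPS = (
    "Identify notable trends in the line items between the two periods.",
    "Compute key financial ratios and state what each one indicates.",
    "Synthesize the trends and ratios into an overall assessment.",
    "Predict whether earnings will increase or decrease next period, "
    "and give the expected magnitude and your confidence.",
)

# Each line item maps to (value in period t, value in period t-1).
Statement = Mapping[str, Tuple[float, float]]


def format_statement(title: str, rows: Statement) -> str:
    """Render one statement with relative period labels instead of dates."""
    lines = [f"{title}:"]
    for item, (current, previous) in rows.items():
        lines.append(f"  {item}: t={current:,.0f}; t-1={previous:,.0f}")
    return "\n".join(lines)


def build_cot_prompt(balance_sheet: Statement, income_statement: Statement) -> str:
    """Combine the anonymized statements with numbered reasoning steps."""
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(ANALYST_STEPS, start=1))
    return "\n\n".join([
        "You are a financial analyst. The company is anonymized.",
        format_statement("Balance sheet", balance_sheet),
        format_statement("Income statement", income_statement),
        "Work through the following steps in order, showing your reasoning:",
        steps,
    ])


# Invented figures, for illustration only.
prompt = build_cot_prompt(
    {"Total assets": (1200.0, 1100.0), "Total liabilities": (700.0, 680.0)},
    {"Revenue": (950.0, 900.0), "Net income": (80.0, 72.0)},
)
print(prompt)
```

Listing the steps explicitly is what distinguishes this from simply asking for a prediction: the model is asked to write out intermediate analysis before committing to an answer.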

“Taken together, our results suggest that LLMs may take a central role in decision-making,” the researchers conclude. They note that the LLM’s advantage likely stems from its vast knowledge base and ability to recognize patterns and business concepts, allowing it to perform intuitive reasoning even with incomplete information.

University of Chicago researchers tested GPT-4’s financial analysis capabilities by providing it with anonymized, standardized financial statements and guiding its reasoning with “chain-of-thought” prompts. The model then predicted the direction, magnitude, and confidence of future earnings changes. (Source: University of Chicago)

LLMs poised to transform financial analysis despite challenges

The findings are all the more remarkable given that numerical analysis has traditionally been a challenge for language models. “One of the most challenging domains for a language model is the numerical domain, where the model needs to carry out computations, perform human-like interpretations, and make complex judgments,” said Alex Kim, one of the study’s co-authors. “While LLMs are effective at textual tasks, their understanding of numbers typically comes from the narrative context and they lack deep numerical reasoning or the flexibility of a human mind.”

Some experts caution that the “ANN” model used as a benchmark in the study may not represent the state-of-the-art in quantitative finance. “That ANN benchmark is nowhere near state of the art,” commented one practitioner on the Hacker News forum. “People didn’t stop working on this in 1989 — they realized they can make lots of money doing it and do it privately.”

Nevertheless, the ability of a general-purpose language model to match the performance of specialized ML models and exceed human experts points to the disruptive potential of LLMs in the financial domain. The authors have also created an interactive web application to showcase GPT-4’s capabilities for curious readers, though they caution that its accuracy should be independently verified.

As AI continues its rapid advance, the role of the financial analyst may be the next to be transformed. While human expertise and judgment are unlikely to be fully replaced anytime soon, powerful tools like GPT-4 could greatly augment and streamline the work of analysts, potentially reshaping the field of financial statement analysis in the years to come.