Apple’s $25-50 million Shutterstock deal highlights fierce competition for AI training data

Apple has entered into a significant agreement with stock photography provider Shutterstock to license millions of images for training its artificial intelligence models. According to a Reuters report, the deal is estimated to be worth between $25 million and $50 million, placing Apple among several tech giants racing to secure vast troves of data to power their AI systems.

Sources familiar with the matter told Reuters that Apple, along with Meta, Google and Amazon, has struck licensing agreements with Shutterstock in recent months to access hundreds of millions of images, videos, and music files from its library. While the exact terms of Apple’s deal remain undisclosed, Shutterstock’s Chief Financial Officer Jarrod Yahes confirmed to Reuters that the initial agreements with these tech firms ranged from $25 million to $50 million each, with most later being expanded.

The growing demand for AI training data has given rise to a bustling market, with companies turning to a variety of sources to acquire content. Reuters spoke to more than 30 people with knowledge of such deals, revealing that prices can vary widely depending on the type of content and the buyer. For instance, Daniela Braga, CEO of an AI data firm, told Reuters that companies generally pay $1-2 per image, $2-4 per short-form video, and $100-300 per hour of longer films, with rates for text at around $0.001 per word.
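To give a sense of scale, the per-unit rates quoted above can be turned into rough volume estimates. The Python sketch below is illustrative only: the function name `units_for_budget` and the rate table are hypothetical, and it assumes an entire budget goes to a single content type at the quoted rates. That assumption does not describe any actual deal, since the Shutterstock agreements reportedly cover images, videos, and music together.

```python
from decimal import Decimal

# Per-unit rate ranges (USD) as quoted in the article: (low, high).
# Decimal avoids floating-point rounding on the tiny per-word rate.
RATES = {
    "image": (Decimal("1"), Decimal("2")),
    "short_video": (Decimal("2"), Decimal("4")),
    "long_film_hour": (Decimal("100"), Decimal("300")),
    "text_word": (Decimal("0.001"), Decimal("0.001")),
}


def units_for_budget(budget_usd: int, kind: str) -> tuple[int, int]:
    """Return (fewest, most) whole units a budget buys at the quoted range.

    Hypothetical helper: assumes the whole budget is spent on one
    content type, which is a simplification for illustration.
    """
    low_rate, high_rate = RATES[kind]
    budget = Decimal(budget_usd)
    # The highest rate buys the fewest units; the lowest rate buys the most.
    return int(budget // high_rate), int(budget // low_rate)


if __name__ == "__main__":
    for budget in (25_000_000, 50_000_000):
        fewest, most = units_for_budget(budget, "image")
        print(f"${budget:,} buys between {fewest:,} and {most:,} images")
```

Under these assumptions, a $25 million budget spent only on images would cover between 12.5 million and 25 million images, and a $50 million budget between 25 million and 50 million.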

Privacy concerns loom large

As AI has rapidly advanced in the last several years, a fierce battle has started brewing over the data that is used to train these systems. Tech giants like OpenAI, Google, Meta and Microsoft have been using vast troves of online data, including copyrighted news articles, books and social media posts, to train their AI models without explicit permission or compensation to content creators.

This has sparked outrage from publishers and creators who argue their intellectual property is being exploited. The New York Times recently sued OpenAI and Microsoft for copyright infringement, alleging that millions of NYT articles were used to train chatbots that now compete with the newspaper. The suit seeks billions in damages and the destruction of AI models containing NYT content.

Amid the legal challenges, some are calling for a licensing system where AI companies would pay content owners for training data access. At a Senate hearing, lawmakers from both parties backed media industry demands for OpenAI and others to license news articles and other data used to train AI. Leaders from the National Association of Broadcasters, News Media Alliance and Condé Nast spoke in favor of mandatory licensing, with some arguing unauthorized data use violates copyright law.

However, OpenAI and some experts contend that licensing all training data is unviable, and that mandatory licensing could concentrate power among big tech firms while burdening AI startups. Debate continues over whether licensing should be legally required or simply an industry norm.

Meanwhile, some companies are striking lucrative data deals. Google reportedly inked a $60 million per year agreement for exclusive access to Reddit data to train its AI systems. The rapidly evolving legal and ethical landscape around AI training data will be crucial to the future development of the industry. Privacy concerns remain at the forefront as the battle over this valuable resource escalates.

The Battle for AI Supremacy

The Shutterstock deal underscores the critical role that data plays in the development of cutting-edge AI systems. As companies like Apple, Google, and Meta compete to build the most advanced AI models, access to vast, diverse datasets has become a key differentiator. By licensing millions of images from Shutterstock, Apple aims to enhance its AI capabilities across a range of applications, from computer vision and image generation to virtual assistants and augmented reality.

Moreover, the willingness of tech giants to pay tens of millions of dollars for AI training data highlights the immense economic potential of the technology. As AI becomes increasingly integrated into products and services across industries, from healthcare and finance to entertainment and education, the market for AI-powered solutions is expected to grow exponentially in the coming years. By investing heavily in AI development now, companies like Apple are positioning themselves to capture a significant share of this burgeoning market.

Apple has declined to comment on the specific details of the Shutterstock deal. In a statement, the company said it is “committed to building AI systems in a thoughtful and ethical manner” and that it “respects the intellectual property rights of others.”

The rapidly evolving AI data market, estimated by Business Research Insights to be worth around $2.5 billion currently and projected to grow to nearly $30 billion within a decade, underscores the high stakes in the battle for AI supremacy among tech giants. As the industry grapples with the implications of this data gold rush, the long-term consequences for user privacy and data rights remain to be seen.
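For readers who want the growth rate implied by that projection, a back-of-the-envelope calculation follows. It is a sketch that treats the figures as exactly $2.5 billion today and $30 billion in exactly 10 years. Both are simplifying assumptions, since the text says "around" and "nearly", and the function name `implied_cagr` is hypothetical.

```python
def implied_cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Assumed inputs: $2.5B now, $30B after 10 years (rounded from the article).
    rate = implied_cagr(2.5e9, 30e9, 10)
    print(f"Implied annual growth: {rate:.1%}")
```

With those inputs, the market would need to grow by roughly 28% per year, compounding, to reach the projected size.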