Reddit is capitalizing on the insatiable demand for data to power AI systems. Bloomberg is reporting that the social platform has inked an agreement allowing an unnamed "large AI company" to train models on Reddit content.
*Update Feb 22: Google has confirmed that it is expanding its partnership with Reddit.
"To enable these and other experiences, Google now has access to Reddit’s Data API, which delivers real-time, structured, unique content from their large and dynamic platform. With the Reddit Data API, Google will now have efficient and structured access to fresher information, as well as enhanced signals that will help us better understand Reddit content and display, train on, and otherwise use it in the most accurate and relevant ways."
The deal, reportedly worth about $60 million per year, signals Reddit's savvy in monetizing its vast trove of crowdsourced data. It also comes at an opportune time, as Reddit is advising banks it could pursue a multi-billion dollar initial public offering in the coming months.
This deal taps into surging investor appetite for opportunities in the AI space. As companies like OpenAI and Anthropic continue to capture headlines, many startups are trying to ride the AI boom in some capacity to inflate their value.
Last year, many platforms including Reddit and X (formerly Twitter) revised their terms of use to prevent AI companies from scraping user data. Reddit also changed its API access policy, and companies seeking to use the API now face a significant charge.
Reddit boasts over 70 million daily active users (as of October) contributing written posts, images, videos and more across over 100,000 active communities. This cornucopia equips AI with invaluable training data to enhance machine learning and better mimic human language.
According to sources, Reddit brought in over $800 million in revenue last year, a 20% jump from $666 million in 2022. While the company is not yet profitable, the top-line growth and promising foray into AI monetization could breed bullishness for Reddit's IPO aspirations.
Foundation AI model providers are paying top dollar for diverse training data. Late last year, OpenAI reportedly agreed to a deal potentially worth eight figures with German publishing firm Axel Springer. These arrangements are not just about accessing vast amounts of data but also about ensuring that the AI models are fed with information that is accurate, relevant, and reflective of current trends.
As AI adoption grows, demand for model-enhancing data will likely persist. With quality information acting as rocket fuel, Reddit seems situated to profit from licensing out its data goldmine.