Despite this, Reddit has blocked Bing from crawling their site for search, favoring another search engine and impacting competition from Bing and Bing-powered engines.
— Jordi Ribas (@JordiRib1) July 29, 2024
In a recent interview with The Verge, Reddit CEO Steve Huffman accused Microsoft, Anthropic, and Perplexity of scraping data from the social media platform without authorization. Huffman expressed frustration with these companies’ actions, saying they have treated internet content as “free for them to use” without proper agreements or compensation.
The controversy comes in the wake of Reddit’s $60 million annual licensing deal with Google, which allows the tech giant to train its AI models on Reddit user posts. Following this agreement, Reddit updated its site to block companies without similar arrangements from crawling its content.
Huffman emphasized the importance of having control over how Reddit’s data is used and displayed. “Without these agreements, we don’t have any say or knowledge of how our data is displayed and what it’s used for,” he explained. This lack of control has led Reddit to take the defensive measure of blocking companies unwilling to negotiate terms for data usage.
The Reddit CEO specifically called out Microsoft for using Reddit’s data to train its AI and summarize content in Bing search results “without telling us.” He also mentioned that Reddit’s data has been sold through the Bing API to other search engines.
The move to block unauthorized crawling means Reddit posts now appear only in Google’s search results, not in Microsoft’s Bing, DuckDuckGo, or other alternative search engines. The decision has raised concerns about its effect on search market competition, with Microsoft’s head of search, Jordi Ribas, claiming that Reddit is “favoring another search engine and impacting competition from Bing and Bing-powered engines.”
Huffman’s stance reflects a growing trend in the tech industry, where companies increasingly seek compensation when their data is used for AI training. As the value of data continues to rise in the age of artificial intelligence, conflicts between content creators and AI companies are likely to escalate. Just last month, we highlighted a controversy surrounding the unfair use of YouTube data for AI training.
The Reddit CEO’s public criticism of these companies highlights the ongoing debate about data ownership, fair use, and compensation in the rapidly evolving landscape of AI and machine learning technologies.