
Cloudflare Introduces One-Click Option to Block AI Data Scrapers

2024-07-04 05:37:14.436000

Perplexity AI, the California-based AI search startup, is facing backlash from publishers including The New York Times, The Guardian, Condé Nast, and Forbes for circumventing their attempts to block its crawlers from accessing and serving up media content [934795e5]. Publishers are frustrated by Perplexity AI's unauthorized access and its disregard for the Robots Exclusion Protocol, which raises concerns about privacy and copyright infringement [934795e5] [0c7df256].

Perplexity AI, valued at $1 billion, plans to serve ads in the second half of this year and strike revenue-sharing deals with publishers [934795e5]. However, publishers are concerned about losing billions of dollars in ad revenue due to Perplexity AI's method of surfacing summaries and video content without proper authorization [934795e5]. The Guardian and Forbes have requested that Perplexity AI secure a commercial license for the use of their intellectual property and remove infringing articles [934795e5].

The unauthorized reproduction of content by Perplexity AI not only raises copyright concerns but also poses a threat to publishers' brands. The Times, Condé Nast, and Vogue have all blocked Perplexity AI's crawler, but the company continues to reproduce their content [934795e5]. Publishers argue that while deals with AI companies may provide short-term financial benefits, they can cause long-term damage to their brands [934795e5].

These controversies over unauthorized access and copyright infringement add to the concerns raised by the Wired investigation into the accuracy and trustworthiness of Perplexity AI's search results [0c7df256]. The investigation found instances where Perplexity AI gave incorrect answers and failed to cite reliable sources, calling into question the company's claims that it prioritizes accurate answers and ranks trustworthy sources [0c7df256].

In a related development, Amazon is reviewing claims that Perplexity AI improperly scraped online content [b77e5d84]. Perplexity AI, which runs on Amazon Web Services (AWS) servers, has been accused of scraping content without approval. Perplexity spokesperson Sara Platnick said that the company is not violating the AWS terms of service. Perplexity CEO Aravind Srinivas defended the company, saying that it does not rip off content and has made changes to highlight sources more prominently [b77e5d84].

Amazon is assessing the information received from WIRED, the news outlet that published the investigation. Perplexity AI has received funding from prominent tech investors, including Jeff Bezos [b77e5d84]. The company is based in San Francisco [b77e5d84].

Reddit is updating its policies to crack down on AI companies that scrape the site for content to train AI models. The company will update its robots.txt file, which implements the Robots Exclusion Protocol, to give high-level instructions about how Reddit can and cannot be crawled by third parties. It will also continue rate-limiting and/or blocking unknown bots and crawlers. Good-faith actors such as researchers and organizations like the Internet Archive will still have access to Reddit content for non-commercial use. Mark Graham, Director of the Wayback Machine at the Internet Archive, praised Reddit's position and expressed gratitude for the collaboration. Reddit emphasized that organizations must abide by its policies and provided a guide for accessing Reddit data. The update is expected to be implemented in the coming weeks [8d9d6ccf].
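To make the robots.txt mechanism concrete, here is a minimal sketch of how a well-behaved crawler might check a robots.txt file before fetching a page, using Python's standard `urllib.robotparser`. The rules, the bot names, and the URL below are hypothetical illustrations and are not Reddit's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is allowed, all others are not.
# These rules are an assumption for illustration, not any real site's policy.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleResearchBot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/r/python"
    print(is_allowed(SAMPLE_ROBOTS_TXT, "ExampleResearchBot", page))
    print(is_allowed(SAMPLE_ROBOTS_TXT, "SomeOtherBot", page))
```

Compliance with robots.txt is voluntary: the file only states the site's wishes, which is why it is typically paired with enforcement such as rate-limiting or blocking.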

Cloudflare, a web infrastructure and security company, has introduced a toggle switch that blocks AI bots from scraping website content. The feature is available to all Cloudflare customers, including those on the free tier, and website owners enable it by toggling the switch in the Cloudflare dashboard. Cloudflare uses machine learning to identify and block AI bots; the most active bots include Bytespider and GPTBot. The company also offers an option for customers to report suspicious bot activity. The move aims to deter AI companies from circumventing rules and accessing content without permission [9d497c1a].
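For contrast with the machine-learning detection described above, the simplest site-side check is a match on the User-Agent header. The sketch below is not Cloudflare's method; the token list and function name are assumptions for illustration, using the two bot names mentioned above.

```python
# Lowercase substrings to look for in the User-Agent header.
# This list is an illustrative assumption, not a complete or official one.
AI_BOT_TOKENS = ("bytespider", "gptbot")


def is_listed_ai_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a listed bot token."""
    ua = user_agent.lower()
    return any(token in ua for token in AI_BOT_TOKENS)


if __name__ == "__main__":
    print(is_listed_ai_bot("Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(is_listed_ai_bot("Mozilla/5.0 (X11; Linux x86_64) Firefox/127.0"))
```

A check like this only catches bots that identify themselves honestly, since a crawler can send any User-Agent string it likes.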

In other news, OptumRx has agreed to pay $20 million to settle allegations that it mailed opioids alongside other medications without resolving red flags [95aaf919]. The settlement follows an investigation into OptumRx's practices, which found that the company failed to adequately monitor and prevent the shipment of opioids to patients who did not need them. The investigation also found that OptumRx did not properly address red flags indicating potential misuse or diversion of opioids. As part of the settlement, OptumRx has agreed to implement enhanced compliance measures and monitoring to prevent future violations [95aaf919].

Disclaimer: The story curated or synthesized by the AI agents may not always be accurate or complete. It is provided for informational purposes only and should not be relied upon as legal, financial, or professional advice. Please use your own discretion.