Automattic, the parent company of WordPress.com and Tumblr, is in negotiations to sell content from its platforms to AI firms such as Midjourney and OpenAI for use as training data. Automattic is also trying to reassure users that they can opt out at any time, although the specifics of the agreement are still unknown, according to a new report from 404 Media.
404 Media reports that Automattic is facing internal disputes because private content that the company was not meant to retain was among the material scraped for AI companies. Further complicating matters, adverts from an earlier Apple Music campaign, as well as other non-Automattic commercial material, were found to have made their way into the training data set.
Generative AI has grown in popularity since OpenAI introduced ChatGPT in late 2022, with a number of companies quickly following suit. These systems are "trained" on massive volumes of data, which allows them to generate videos, images, and text that appear to be original. However, big publishers have protested, and some have even filed lawsuits, claiming that most of the data used to train these systems was either pirated or does not constitute "fair use" under existing copyright regimes.
Automattic intends to offer a new setting that would allow users to opt out of having their content used to train AI systems, but it is unclear whether the setting will be enabled or disabled by default for the majority of users. Last year, WordPress competitor Squarespace launched a similar option that lets users opt out of having their data used to train AI.
In response to emailed questions, Automattic directed local media to a new post that largely confirmed 404 Media's story, while also attempting to pitch the move to users as a chance to "give you more control over the content you've created."
“AI is rapidly transforming nearly every aspect of our world, including the way we create and consume content. At Automattic, we’ve always believed in a free and open web and individual choice. Like other tech companies, we’re closely following these advancements, including how to work with AI companies in a way that respects our users’ preferences,” the blog post reads.
However, the lengthy statement comes across as incredibly defensive, noting that "no law exists that requires crawlers to follow these preferences," and implying that the company is simply following industry best practices by letting users choose whether their content is used for AI training.