According to the lawsuit, OpenAI chose to "pursue profit at the expense of privacy, security, and ethics" by scouring the internet for troves of sensitive personal data, which it fed into its large language models (LLMs) and other deep learning algorithms to create ChatGPT and DALL-E.
While semi-public information such as social network postings was allegedly gathered, more sensitive information such as keystrokes, personally identifiable information (PII), financial data, biometrics, patient records, and browser cookies was also allegedly harvested.
The lawsuit also claims that OpenAI has access to large amounts of medical data from unknowing patients, aided by healthcare practitioners' enthusiasm to integrate an immature chatbot into their practices. When a patient provides information about their medical issues to ChatGPT, that information is fed back into the chatbot's LLM.
According to one of the complainants, she used a tool called Have I Been Trained to determine that private clinical photographs (used to document treatment for a genetic condition) had been extracted from her medical record and added to Common Crawl, a data repository that boasts it "can be accessed and analyzed by anyone." According to the lawsuit, her images were monetized without her knowledge by becoming part of OpenAI's product offerings.
Perhaps most shockingly, the lawsuit alleges that OpenAI took photographs of children from the internet and used them to train DALL-E, its well-known image generator. According to reports, this data has made DALL-E popular for all the wrong reasons.
This case raises serious concerns about the ethics of AI development and the use of personal data to train AI models. It underscores the need for greater transparency and accountability in how corporations developing AI technologies handle personal data.
As artificial intelligence (AI) evolves rapidly, we must have open and honest dialogues about the ethical implications of its development and use. This case serves as a warning that we must protect our personal information and ensure it is not misused.