Social Media Content Fueling AI: How Platforms Are Using Your Data for Training

 

OpenAI has admitted that developing ChatGPT would not have been feasible without using copyrighted content to train its models. It is widely known that artificial intelligence (AI) systems rely heavily on social media content for their development; in turn, AI has become an essential tool for many social media platforms.

For instance, LinkedIn is now using its users’ resumes to fine-tune its AI models, while Snapchat has indicated that if users engage with certain AI features, their content might appear in advertisements. Despite this, many users remain unaware that their social media posts and photos are being used to train AI systems.

Social Media: A Prime Resource for AI Training

AI companies aim to make their models as natural and conversational as possible, with social media serving as an ideal training ground. The content generated by users on these platforms offers an extensive and varied source of human interaction. Social media posts reflect everyday speech and provide up-to-date information on global events, which is vital for producing reliable AI systems.

However, it's important to recognize that AI companies are utilizing user-generated content for free. Your vacation pictures, birthday selfies, and personal posts are being exploited for profit. While users can opt out of certain services, the process varies across platforms, and there is no assurance that your content will be fully protected, as third parties may still have access to it.

How Social Platforms Are Using Your Data

Recently, the United States Federal Trade Commission (FTC) revealed that social media platforms are not effectively regulating how they use user data. Major platforms have been found to use personal data for AI training purposes without proper oversight.

For example, LinkedIn has stated that user content can be used by the platform or its partners, though it aims to redact or remove personal details from AI training data sets. Users can opt out by opening "Settings and Privacy" and going to the "Data Privacy" section. However, opting out won’t affect data already collected.

Similarly, the platform formerly known as Twitter, now X, has been using user posts to train its chatbot, Grok. Elon Musk’s social media company has confirmed that its AI startup, xAI, leverages content from X users and their interactions with Grok to enhance the chatbot’s ability to deliver “accurate, relevant, and engaging” responses. The goal is to give the bot a more human-like sense of humor and wit.

To opt out of this, users need to visit the "Data Sharing and Personalization" tab in the "Privacy and Safety" settings. Under the “Grok” section, they can uncheck the box that permits the platform to use their data for AI purposes.

Regardless of the platform, users need to stay vigilant about how their online content may be repurposed by AI companies for training. Always review your privacy settings to stay informed and protected against unintended data use by AI technologies.

Is Your Android Device Tracking You? Understanding Its Monitoring Methods

 

In discussions about how Android phones collect location and personal data, the focus often falls on third-party apps rather than Google's built-in apps. That awareness has grown as numerous apps have been found gathering significant amounts of information about users, raising concerns, especially when eerily targeted ads start appearing. Many people still worry that apps eavesdrop on private in-person conversations despite the operating system's permission controls, a concern Instagram's head addressed directly in a 2019 CBS News interview.

However, attention to third-party apps tends to overshadow the fact that Android and its integrated apps track users extensively. While much of this tracking aligns with user preferences, it results in a substantial accumulation of sensitive personal data on phones. Even for those trusting Google with their information, understanding the collected data and its usage remains crucial, especially considering the limited options available to opt out of this data collection.

For instance, a lesser-known feature lets Google Assistant infer where you parked your car and send a notification with its location. The feature relies largely on educated guesswork, varies in accuracy, and isn't widely publicized by Google; it illustrates how tech companies can turn personal data into results that feel uncomfortably close to eavesdropping.

The ways Android phones track users were highlighted in an October 2021 Kaspersky blog post referencing a study by researchers from the University of Edinburgh and Trinity College Dublin. The list of installed apps may seem innocuous, but when coupled with other personal data it can reveal intimate details about users, such as their religion or mental health status. Combining app presence with location data exposes highly personal information through AI-based inference.

Another focal point was the extensive collection of unique identifiers by Google and device manufacturers, tying users to specific handsets. While some data collection aids app troubleshooting, these identifiers, including Google Advertising IDs, device serial numbers, and SIM card details, can potentially re-identify users even after they change phone numbers, perform factory resets, or install new ROMs.

The study also emphasized the potential invasiveness of data collection methods, such as Xiaomi uploading app window histories and Huawei's keyboard logging app usage. Details like call durations and keyboard activity could lead to inferences about users' activities and health, reflecting the extensive and often unnoticed data collection practices by smartphones, as highlighted by Trinity College's Prof. Doug Leith.

Google Chrome Launches 'Privacy Sandbox' to Phase Out Tracking Cookies

 

Google has officially begun rolling out Privacy Sandbox in its Chrome web browser to a majority of users, nearly four months after first announcing the plan.

"We believe it is vital to both improve privacy and preserve access to information, whether it's news, a how-to-guide, or a fun video," Anthony Chavez, vice president of Privacy Sandbox initiatives at Google, said.

"Without viable privacy-preserving alternatives to third-party cookies, such as the Privacy Sandbox, we risk reducing access to information for all users, and incentivizing invasive tactics such as fingerprinting."

To facilitate thorough testing, the search giant has chosen to leave approximately three percent of users unaffected by the transition initially. Full availability is anticipated for all users in the upcoming months.

Privacy Sandbox serves as Google's comprehensive approach to a suite of technologies designed to replace third-party tracking cookies with privacy-conscious alternatives. This transition aims to maintain personalized content and advertisements while safeguarding user privacy.

Simultaneously, the company is in the beta testing phase of Privacy Sandbox on Android, extending it to eligible mobile devices running Android 13.

A pivotal component of this effort is the Topics API, which assigns users to a small set of interest topics based on the sites they visit most frequently. Websites can query this API to discern a user's interests and deliver tailored ads without learning the user's identity. In essence, the web browser acts as an intermediary between the user and the website. Users can also customize their experience further, including specifying ad topics of interest, enabling relevance and measurement APIs, or opting out entirely.
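For a concrete sense of what this looks like from a website's side, here is a minimal, illustrative TypeScript sketch that queries Chrome's document.browsingTopics() call. The call itself is real, but the interface shape, error handling, and how the result is used are assumptions and may differ between Chrome versions.

```typescript
// Minimal sketch of a site asking the browser for a visitor's inferred topics.
// Assumes a Chrome build with the Topics API enabled; the exact shape of the
// returned objects is an assumption and may vary by browser version.
interface BrowsingTopic {
  topic: number;           // numeric ID in the Topics taxonomy
  taxonomyVersion: string; // which taxonomy version the ID refers to
  modelVersion: string;    // which classifier version produced the topic
}

async function fetchAdTopics(): Promise<number[]> {
  // Feature-detect: browsers without the API, or users who opted out, won't expose it.
  if (!("browsingTopics" in document)) {
    return [];
  }
  try {
    const topics = (await (document as any).browsingTopics()) as BrowsingTopic[];
    // The browser hands back coarse topic IDs, not the user's identity or history.
    return topics.map((t) => t.topic);
  } catch {
    // Permissions Policy or user settings can block the call entirely.
    return [];
  }
}

// Usage: attach the coarse topic IDs to an ad request instead of a cross-site cookie.
fetchAdTopics().then((ids) => console.log("Topic IDs for ad selection:", ids));
```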

Despite its advancements, Privacy Sandbox has not been without criticism. The Movement For An Open Web recently pointed out that "Google gathers reams of personal data on each and every one of its users, sourced through an opt-in process that it's hard for most web users to avoid."

This development coincides with Google's efforts to add real-time protection against phishing attacks to Safe Browsing, without giving the company knowledge of users' browsing history.

While Google hasn't disclosed specific technical details, it has incorporated Oblivious HTTP relays (OHTTP relays) as part of Privacy Sandbox to enhance anonymity protections and mask IP address information.

"Previously, it worked by checking every site visit against a locally-stored list of known bad sites, which is updated every 30 to 60 minutes," Parisa Tabriz, vice president of Chrome, said.

"But phishing domains have gotten more sophisticated — and today, 60% of them exist for less than 10 minutes, making them difficult to block. By shortening the time between identification and prevention of threats, we expect to see 25% improved protection from malware and phishing threats."