Search This Blog

Powered by Blogger.

Blog Archive

Labels

Showing posts with label AI Models. Show all posts

AI Models at Risk from TPUXtract Exploit

 


A team of researchers has demonstrated that it is possible to steal an artificial intelligence (AI) model without actually gaining access to the device that is running the model. The uniqueness of the technique lies in the fact that it works efficiently even if the thief may not have any prior knowledge as to how the AI works in the first place, or how the computer is structured. 

According to North Carolina State University's Department of Electrical and Computer Engineering, the method is known as TPUXtract, and it is a product of their department. With the help of a team of four scientists, who used high-end equipment and a technique known as "online template-building", they were able to deduce the hyperparameters of a convolutional neural network (CNN) running on Google Edge Tensor Processing Unit (TPU), which is the settings that define its structure and behaviour, with a 99.91% accuracy rate. 

The TPUXtract is an advanced side-channel attack technique devised by researchers at the North Carolina State University, designed to protect servers from attacks. A convolutional neural network (CNN) running on a Google Edge Tensor Processing Unit (TPU) is targeted in the attack, and electromagnetic signals are exploited to extract hyperparameters and configurations of the model without the need for previous knowledge of its architecture and software. 

A significant risk to the security of AI models and the integrity of intellectual property is posed by these types of attacks, which manifest themselves across three distinct phases, each of which is based on advanced methods to compromise the AI models' integrity. Attackers in the Profiling Phase observe and capture side-channel emissions produced by the target TPU as it processes known input data as part of the Profiling Phase. As a result, they have been able to decode unique patterns which correspond to specific operations such as convolutional layers and activation functions by using advanced methods like Differential Power Analysis (DPA) and Cache Timing Analysis. 

The Reconstruction Phase begins with the extraction and analysis of these patterns, and they are meticulously matched to known processing behaviours This enables adversaries to make an inference about the architecture of the AI model, including the layers that have been configured, the connections made, and the parameters that are relevant such as weight and bias. Through a series of repeated simulations and output comparisons, they can refine their understanding of the model in a way that enables precise reconstruction of the original model. 

Finally, the Validation Phase ensures that the replicated model is accurate. During the testing process, it is subject to rigorous testing with fresh inputs to ensure that it performs similarly to that of the original, thus providing reliable proof of its success. The threat that TPUXtract poses to intellectual property (IP) is composed of the fact that it enables attackers to steal and duplicate artificial intelligence models, bypassing the significant resources that are needed to develop them.

The competition could recreate and mimic models such as ChatGPT without having to invest in costly infrastructure or train their employees. In addition to IP theft, TPUXtract exposed cybersecurity risks by revealing an AI model's structure that provided visibility into its development and capabilities. This information could be used to identify vulnerabilities and enable cyberattacks, as well as expose sensitive data from a variety of industries, including healthcare and automotive.

Further, the attack requires specific equipment, such as a Riscure Electromagnetic Probe Station, high-sensitivity probes, and Picoscope oscilloscope, so only well-funded groups, for example, corporate competitors or state-sponsored actors, can execute it. As a result of the technical and financial requirements for the attack, it can only be executed by well-funded groups. With the understanding that any electronic device will emit electromagnetic radiation as a byproduct of its operations, the nature and composition of that radiation will be affected by what the device does. 

To conduct their experiments, the researchers placed an EM probe on top of the TPU after removing any obstructions such as cooling fans and centring it over the part of the chip emitting the strongest electromagnetic signals. The machine then emitted signals as a result of input data, and the signals were recorded. The researchers used the Google Edge TPU for this demonstration because it is a commercially available chip that is widely used to run AI models on edge devices meaning devices utilized by end users in the field, as opposed to AI systems that are used for database applications. During the demonstration, electromagnetic signals were monitored as a part of the technique used to conduct the demonstration.

A TPU chip was placed on top of a probe that was used by researchers to determine the structure and layer details of an AI model by recording changes in the electromagnetic field of the TPU during AI processing. The probe provided real-time data about changes in the electromagnetic field of the TPU during AI processing. To verify the model's electromagnetic signature, the researchers compared it to other signatures made by AI models made on a similar device - in this case, another Google Edge TPU. Using this technique, Kurian says, AI models can be stolen from a variety of different devices, including smartphones, tablets and computers. 

The attacker should be able to use this technique as long as they know the device from which they want to steal, have access to it while it is running an AI model, and have access to another device with similar specifications According to Kurian, the electromagnetic data from the sensor is essentially a ‘signature’ of the way AI processes information. There is a lot of work that goes into pulling off TPUXtract. The process not only requires a great deal of technical expertise, but it also requires a great deal of expensive and niche equipment as well. To scan the chip's surface, NCSU researchers used a Riscure EM probe station equipped with a motorized XYZ table, and a high-sensitivity electromagnetic probe to capture the weak signals emanating from it. 

It is said that the traces were recorded using a Picoscope 6000E oscilloscope, and Riscure's icWaves FPGA device aligned them in real-time, and the icWaves transceiver translated and filtered out the irrelevant signals using bandpass filters and AM/FM demodulation, respectively. While this may seem difficult and costly for a hacker to do on their own, Kurian explains, "It is possible for a rival company to do this within a couple of days, regardless of how difficult and expensive it will be. 

Taking the threat of TPUXtract into account, this model poses a formidable challenge to AI model security, highlighting the importance of proactive measures. As an organization, it is crucial to understand how such attacks work, implement robust defences, and ensure that they can safeguard their intellectual property while maintaining trust in their artificial intelligence systems. The AI and cybersecurity communities must learn continuously and collaborate to stay ahead of the changing threats as they arise.

The Role of Confidential Computing in AI and Web3

 

 
The rise of artificial intelligence (AI) has amplified the demand for privacy-focused computing technologies, ushering in a transformative era for confidential computing. At the forefront of this movement is the integration of these technologies within the AI and Web3 ecosystems, where maintaining privacy while enabling innovation has become a pressing challenge. A major event in this sphere, the DeCC x Shielding Summit in Bangkok, brought together more than 60 experts to discuss the future of confidential computing.

Pioneering Confidential Computing in Web3

Lisa Loud, Executive Director of the Secret Network Foundation, emphasized in her keynote that Secret Network has been pioneering confidential computing in Web3 since its launch in 2020. According to Loud, the focus now is to mainstream this technology alongside blockchain and decentralized AI, addressing concerns with centralized AI systems and ensuring data privacy.

Yannik Schrade, CEO of Arcium, highlighted the growing necessity for decentralized confidential computing, calling it the “missing link” for distributed systems. He stressed that as AI models play an increasingly central role in decision-making, conducting computations in encrypted environments is no longer optional but essential.

Schrade also noted the potential of confidential computing in improving applications like decentralized finance (DeFi) by integrating robust privacy measures while maintaining accessibility for end users. However, achieving a balance between privacy and scalability remains a significant hurdle. Schrade pointed out that privacy safeguards often compromise user experience, which can hinder broader adoption. He emphasized that for confidential computing to succeed, it must be seamlessly integrated so users remain unaware they are engaging with such technologies.

Shahaf Bar-Geffen, CEO of COTI, underscored the role of federated learning in training AI models on decentralized datasets without exposing raw data. This approach is particularly valuable in sensitive sectors like healthcare and finance, where confidentiality and compliance are critical.

Innovations in Privacy and Scalability

Henry de Valence, founder of Penumbra Labs, discussed the importance of aligning cryptographic systems with user expectations. Drawing parallels with secure messaging apps like Signal, he emphasized that cryptography should function invisibly, enabling users to interact with systems without technical expertise. De Valence stressed that privacy-first infrastructure is vital as AI’s capabilities to analyze and exploit data grow more advanced.

Other leaders in the field, such as Martin Leclerc of iEXEC, highlighted the complexity of achieving privacy, usability, and regulatory compliance. Innovative approaches like zero-knowledge proof technology, as demonstrated by Lasha Antadze of Rarimo, offer promising solutions. Antadze explained how this technology enables users to prove eligibility for actions like voting or purchasing age-restricted goods without exposing personal data, making blockchain interactions more accessible.

Dominik Schmidt, co-founder of Polygon Miden, reflected on lessons from legacy systems like Ethereum to address challenges in privacy and scalability. By leveraging zero-knowledge proofs and collaborating with decentralized storage providers, his team aims to enhance both developer and user experiences.

As confidential computing evolves, it is clear that privacy and usability must go hand in hand to address the needs of an increasingly data-driven world. Through innovation and collaboration, these technologies are set to redefine how privacy is maintained in AI and Web3 applications.

Securing Generative AI: Tackling Unique Risks and Challenges

 

Generative AI has introduced a new wave of technological innovation, but it also brings a set of unique challenges and risks. According to Phil Venables, Chief Information Security Officer of Google Cloud, addressing these risks requires expanding traditional cybersecurity measures. Generative AI models are prone to issues such as hallucinations—where the model produces inaccurate or nonsensical content—and the leaking of sensitive information through model outputs. These risks necessitate the development of tailored security strategies to ensure safe and reliable AI use. 

One of the primary concerns with generative AI is data integrity. Models rely heavily on vast datasets for training, and any compromise in this data can lead to significant security vulnerabilities. Venables emphasizes the importance of maintaining the provenance of training data and implementing controls to protect its integrity. Without proper safeguards, models can be manipulated through data poisoning, which can result in the production of biased or harmful outputs. Another significant risk involves prompt manipulation, where adversaries exploit vulnerabilities in the AI model to produce unintended outcomes. 

This can include injecting malicious prompts or using adversarial tactics to bypass the model’s controls. Venables highlights the necessity of robust input filtering mechanisms to prevent such manipulations. Organizations should deploy comprehensive logging and monitoring systems to detect and respond to suspicious activities in real time. In addition to securing inputs, controlling the outputs of AI models is equally critical. Venables recommends the implementation of “circuit breakers”—mechanisms that monitor and regulate model outputs to prevent harmful or unintended actions. This ensures that even if an input is manipulated, the resulting output is still within acceptable parameters. Infrastructure security also plays a vital role in safeguarding generative AI systems. 

Venables advises enterprises to adopt end-to-end security practices that cover the entire lifecycle of AI deployment, from model training to production. This includes sandboxing AI applications, enforcing the least privilege principle, and maintaining strict access controls on models, data, and infrastructure. Ultimately, securing generative AI requires a holistic approach that combines innovative security measures with traditional cybersecurity practices. 

By focusing on data integrity, robust monitoring, and comprehensive infrastructure controls, organizations can mitigate the unique risks posed by generative AI. This proactive approach ensures that AI systems are not only effective but also safe and trustworthy, enabling enterprises to fully leverage the potential of this groundbreaking technology while minimizing associated risks.

Zero-Trust Log Intelligence: Safeguarding Data with Secure Access

 


Over the years, zero trust has become a popular model adopted by organisations due to a growing need to ensure confidential information is kept safe, an aspect that organisations view as paramount in cybersecurity. Zero-trust is a vital security framework that is fundamentally not like the traditional security perimeter-based model. Instead of relying on a robust boundary, zero-trust grants access to its resources after the constant validation of any user and every device they use, regardless of an individual's position within an organisation or the number of years since one first employed with the company. This "never trust, always verify" policy only grants minimum access to someone, even a long-tenured employee, about what is needed to fulfil their tasks. Because information for cybersecurity is often log file data, zero trust principles can provide better safeguarding of this sensitive information.

Log Files: Why They Are Both Precious and Vulnerable

Log files contain information that reflects all the digital interplay happening on the network, hence can indicate any vulnerability on a system for remediation purposes. For example, it's a good source where one will trace how companies' activities go regarding their performance by analysing log files for anything out of place or anomalies in systems' behaviours for speedy intervention for security lapses. At the same time, however, these log files can expose organisations to vulnerabilities when wrong hands gain access because of possible theft of confidential data or the intention of hacking or modification. The log files have to be strictly controlled and limited only for authorization, because the misuse has to be avoided for maintaining the network secure.

Collecting and Storing Log Data Securely

Zero trust can best be implemented only if gathering and storing of log file collection and storage are sound. It ensures that the real-time data is collected in an environment that has a tamper-resistant place that prevents data from unauthorised modification. Of late, there has been OpenTelemetry, which is gaining popularity due to its potential in the multiple data sources and secure integration with many databases, mostly PostgreSQL.

Secure log storage applies blockchain technology. A decentralised, immutable structure like blockchain ensures logs cannot be altered and their records will remain transparent as well as tamper-proof. The reason blockchain technology works through multiple nodes rather than one central point makes it nearly impossible to stage a focused attack on the log data.

Imposing Least Privilege Access Control

Least privilege access would be one of the greatest principles of zero-trust security, which means that end-users would have only access to what is required to achieve their task. However, it can be challenging when balancing this principle with being efficient in log analysis; traditional access control methods-such as data masking or classification-frequently fall short and are not very practical. One promising solution to this problem is homomorphic encryption, which enables analysis of data in its encrypted state. Analysts can evaluate log files without ever directly seeing the unencrypted data, ensuring that security is maintained without impacting workflow.

Homomorphic encryption is beyond the level of the analyst. This means other critical stakeholders, such as administrators, have access to permissions but are not allowed to read actual data. This means logs are going to be secure at internal teams and thus there is a lesser chance of accidental exposure.

In-House AI for Threat Detection

Companies can further secure log data by putting in-house AI models which are run directly within their database and hence minimise external access. For instance, the company can use a private SLM AI that was trained specifically to analyse the logs. This ensures there is safe and accurate threat detection without having to share any logs with third-party services. The other advantage that an AI trained on relevant log data provides is less bias, as all operations depend on only relevant encrypted log data that can give an organisation precise and relevant insights.

Organisations can ensure maximum security while minimising exposure to potential cyber threats by applying a zero-trust approach through strict access controls and keeping data encrypted all through the analysis process.

Zero-Trust for Optimal Log Security

One of the effective log file intelligence approaches appears to be zero trust security-a security approach that uses the technologies of blockchain and homomorphic encryption to ensure the integrity and privacy of information in management. It means one locks up logs, and it is a source for valuable security insights, kept well protected against unauthorised access and modifications.

Even if an organisation does not adopt zero-trust completely for its systems, it should still ensure that the protection of the logs is considered a priority. By taking the essential aspects of zero-trust, such as having minimal permissions and secured storage, it can help organisations decrease their vulnerability to cyber attacks while protecting this critical source of data.




Managing LLM Security Risks in Enterprises: Preventing Insider Threats

 

Large language models (LLMs) are transforming enterprise automation and efficiency but come with significant security risks. These AI models, which lack critical thinking, can be manipulated to disclose sensitive data or even trigger actions within integrated business systems. Jailbreaking LLMs can lead to unauthorized access, phishing, and remote code execution vulnerabilities. Mitigating these risks requires strict security protocols, such as enforcing least privilege, limiting LLM actions, and sanitizing input and output data. LLMs in corporate environments pose threats because they can be tricked into sharing sensitive information or be used to trigger harmful actions within systems. 

Unlike traditional tools, their intelligent, responsive nature can be exploited through jailbreaking—altering the model’s behavior with crafted prompts. For instance, LLMs integrated with a company’s financial system could be compromised, leading to data manipulation, phishing attacks, or broader security vulnerabilities such as remote code execution. The severity of these risks grows when LLMs are deeply integrated into essential business operations, expanding potential attack vectors. In some cases, threats like remote code execution (RCE) can be facilitated by LLMs, allowing hackers to exploit weaknesses in frameworks like LangChain. This not only threatens sensitive data but can also lead to significant business harm, from financial document manipulation to broader lateral movement within a company’s systems.  

Although some content-filtering and guardrails exist, the black-box nature of LLMs makes specific vulnerabilities challenging to detect and fix through traditional patching. Meta’s Llama Guard and other similar tools provide external solutions, but a more comprehensive approach is needed to address the underlying risks posed by LLMs. To mitigate the risks, companies should enforce strict security measures. This includes applying the principle of least privilege—restricting LLM access and functionality to the minimum necessary for specific tasks—and avoiding reliance on LLMs as a security perimeter. 

Organizations should also ensure that input data is sanitized and validate all outputs for potential threats like cross-site scripting (XSS) attacks. Another important measure is limiting the actions that LLMs can perform, preventing them from mimicking end-users or executing actions outside their intended purpose. For cases where LLMs are used to run code, employing a sandbox environment can help isolate the system and protect sensitive data. 

While LLMs bring incredible potential to enterprises, their integration into critical systems must be carefully managed. Organizations need to implement robust security measures, from limiting access privileges to scrutinizing training data and ensuring that sensitive data is protected. This strategic approach will help mitigate the risks associated with LLMs and reduce the chance of exploitation by malicious actors.

Want to Make the Most of ChatGPT? Here Are Some Go-To Tips

 







Within a year and a half, ChatGPT has grown from an AI prototype to a broad productivity assistant, even sporting its text and code editor called Canvas. Soon, OpenAI will add direct web search capability to ChatGPT, putting the platform at the same table as Google's iconic search. With these fast updates, ChatGPT is now sporting quite a few features that may not be noticed at first glance but are deepening the user experience if one knows where to look.

This is the article that will teach you how to tap into ChatGPT, features from customization settings to unique prompting techniques, and not only five must-know tips will be useful in unlocking the full range of abilities of ChatGPT to any kind of task, small or big.


1. Rename Chats for Better Organisation

A new conversation with ChatGPT begins as a new thread, meaning that it will remember all details concerning that specific exchange but "forget" all the previous ones. This way, you can track the activities of current projects or specific topics because you can name your chats. The chat name that it might try to suggest is related to the flow of the conversation, and these are mostly overlooked contexts that users need to recall again. Renaming your conversations is one simple yet powerful means of staying organised if you rely on ChatGPT for various tasks.

To give a name to a conversation, tap the three dots next to the name in the sidebar. You can also archive older chats to remove them from the list without deleting them entirely, so you don't lose access to the conversations that are active.


2. Customise ChatGPT through Custom Instructions

Custom Instructions in ChatGPT is a chance to make your answers more specific to your needs because you will get to share your information and preferences with the AI. This is a two-stage personalization where you are explaining to ChatGPT what you want to know about yourself and, in addition, how you would like it to be returned. For instance, if you ask ChatGPT for coding advice several times a week, you can let the AI know what programming languages you are known in or would like to be instructed in so it can fine-tune the responses better. Or, you should be able to ask for ChatGPT to provide more verbose descriptions or to skip steps in order to make more intuitive knowledge of a topic.

To set up personal preferences, tap the profile icon on the upper right, and then from the menu, "Customise ChatGPT," and then fill out your preferences. Doing this will enable you to get responses tailored to your interests and requirements.


3. Choose the Right Model for Your Use

If you are a subscriber to ChatGPT Plus, you have access to one of several AI models each tailored to different tasks. The default model for most purposes is GPT-4-turbo (GPT-4o), which tends to strike the best balance between speed and functionality and even supports other additional features, including file uploads, web browsing, and dataset analysis.

However, other models are useful when one needs to describe a rather complex project with substantial planning. You may initiate a project using o1-preview that requires deep research and then shift the discussion to GPT-4-turbo to get quick responses. To switch models, you can click on the model dropdown at the top of your screen or type in a forward slash (/) in the chat box to get access to more available options including web browsing and image creation.


4. Look at what the GPT Store has available in the form of Mini-Apps

Custom GPTs, and the GPT Store enable "mini-applications" that are able to extend the functionality of the platform. The Custom GPTs all have some inbuilt prompts and workflows and sometimes even APIs to extend the AI capability of GPT. For instance, with Canva's GPT, you are able to create logos, social media posts, or presentations straight within the ChatGPT portal by linking up the Canva tool. That means you can co-create visual content with ChatGPT without having to leave the portal.

And if there are some prompts you often need to apply, or some dataset you upload most frequently, you can easily create your Custom GPT. This would be really helpful to handle recipes, keeping track of personal projects, create workflow shortcuts and much more. Go to the GPT Store by the "Explore GPTs" button in the sidebar. Your recent and custom GPTs will appear in the top tab, so find them easily and use them as necessary.


5. Manage Conversations with a Fresh Approach

For the best benefit of using ChatGPT, it is key to understand that every new conversation is an independent document with its "memory." It does recall enough from previous conversations, though generally speaking, its answers depend on what is being discussed in the immediate chat. This made chats on unrelated projects or topics best started anew for clarity.

For long-term projects, it might even be logical to go on with a single thread so that all relevant information is kept together. For unrelated topics, it might make more sense to start fresh each time to avoid confusion. Another way in which archiving or deleting conversations you no longer need can help free up your interface and make access to active threads easier is


What Makes AI Unique Compared to Other Software?

AI performs very differently from other software in that it responds dynamically, at times providing responses or "backtalk" and does not simply do what it is told to do. Such a property leads to some trial and error to obtain the desired output. For instance, one might prompt ChatGPT to review its own output as demonstrated by replacing single quote characters by double quote characters to generate more accurate results. This is similar to how a developer optimises an AI model, guiding ChatGPT to "think" through something in several steps.

ChatGPT Canvas and other features like Custom GPTs make the AI behave more like software in the classical sense—although, of course, with personality and learning. If ChatGPT continues to grow in this manner, features such as these may make most use cases easier and more delightful.

Following these five tips should help you make the most of ChatGPT as a productivity tool and keep pace with the latest developments. From renaming chats to playing around with Custom GPTs, all of them add to a richer and more customizable user experience.


Data Poisoning: The Hidden Threat to AI Models



As ongoing developments in the realms of artificial intelligence and machine learning take place at a dynamic rate, yet another new form of attack is emerging, one which can topple all those systems we use today without much ado: data poisoning. This type of attack involves tampering with data used by AI models in training to make them malfunction, often undetectably. The issue came to light when recently, more than 100 malicious models were uncovered on the popular repository for AI, Hugging Face, by a software management company, JFrog. 

What is Data Poisoning?

Data poisoning is an attack method on AI models by corrupting the data used for its training. In other words, the intent is to have the model make inappropriate predictions or choices. Besides, unlike traditional hacking, it doesn't require access to the system; therefore, data poisoning manipulates input data either before the deployment of an AI model or after the deployment of the AI model, and that makes it very difficult to detect.

One attack happens at the training phase when an attacker manages to inject malicious data into any AI model. Yet another attack happens post-deployment when poisoned data is fed to the AI; it yields wrong outputs. Both kinds of attacks remain hardly detectable and cause damage to the AI system in the long run.

According to research by JFrog, investigators found a number of suspicious models uploaded to Hugging Face, a community where users can share AI models. Those contained encoded malicious code, which the researchers believe hackers-those potentially coming from the KREOnet research network in Korea-might have embedded. The most worrying aspect, however, was the fact that these malicious models went undetected by masquerading as benign.

That's a serious threat because many AI systems today use a great amount of data from different sources, including the internet. In cases where attackers manage to change the data used in the training of a model, that could mean anything from misleading results to actual large-scale cyberattacks.

Why It's Hard to Detect

One of the major challenges with data poisoning is that AI models are built by using enormous data sets, which makes it difficult for researchers to always know what has gone into the model. A lack of clarity of this kind in turn creates ways in which attackers can sneak in poisoned data without being caught.

But it gets worse: AI systems that scrape data from the web continuously in order to update themselves could poison their own training data. This sets up the alarming possibility of an AI system's gradual breakdown, or "degenerative model collapse."

The Consequences of Ignoring the Threat

If left unmitigated, data poisoning could further allow attackers to inject stealth backdoors in AI software that enable them to conduct malicious actions or cause any AI system to behave in ways unexpected. Precisely, they can run malicious code, allow phishing, and rig AI predictions for various nefarious uses.

The cybersecurity industry must take this as a serious threat since more dependence occurs on generative AI linked together, alongside LLMs. If one fails to do so, widespread vulnerability across the complete digital ecosystem will result.

How to Defend Against Data Poisoning

The protection of AI models against data poisoning calls for vigilance throughout the process of the AI development cycle. Experts say that this may require oversight by organisations in using only data from sources they can trust for training the AI model. The Open Web Application Security Project, or OWASP, has provided a list of some best ways to avoid data poisoning; a few of these include frequent checks to find biases and abnormalities during the training of data.

Other recommendations come in the form of multiple AI algorithms that verify results against each other to locate inconsistency. If an AI model starts producing strange results, fallback mechanisms should be in place to prevent any harm.

This also encompasses simulated data poisoning attacks run by cybersecurity teams to test their AI systems for robustness. While it is hard to build an AI system that is 100% secure, frequent validation of predictive outputs goes a long way in detecting and preventing poisoning.

Creating a Secure Future for AI

While AI keeps evolving, there is a need to instil trust in such systems. This will only be possible when the entire ecosystem of AI, even the supply chains, forms part of the cybersecurity framework. This would be achievable through monitoring inputs and outputs against unusual or irregular AI systems. Therefore, organisations will build robust, and more trustworthy models of AI.

Ultimately, the future of AI hangs in the balance with our capability to race against emerging threats like data poisoning. In sum, the ability of businesses to proactively take steps toward the security of AI systems today protects them from one of the most serious challenges facing the digital world.

The bottom line is that AI security is not just about algorithms; it's about the integrity for the data powering those algorithms.


 

Irish Data Protection Commission Halts AI Data Practices at X

 

The Irish Data Protection Commission (DPC) recently took a decisive step against the tech giant X, resulting in the immediate suspension of its use of personal data from European Union (EU) and European Economic Area (EEA) users to train its AI model, “Grok.” This marks a significant victory for data privacy, as it is the first time the DPC has taken such substantial action under its powers granted by the Data Protection Act of 2018. 

The DPC initially raised concerns that X’s data practices posed a considerable risk to individuals’ fundamental rights and freedoms. The use of publicly available posts to train the AI model was viewed as an unauthorized collection of sensitive personal data without explicit consent. This intervention highlights the tension between technological innovation and the necessity of safeguarding individual privacy. 

Following the DPC’s intervention, X agreed to cease its current data processing activities and commit to adhering to stricter privacy guidelines. Although the company did not acknowledge any wrongdoing, this outcome sends a strong message to other tech firms about the importance of prioritizing data privacy when developing AI technologies. The immediate halt of Grok AI’s training on data from 60 million European users came in response to mounting regulatory pressure across Europe, with at least nine GDPR complaints filed during its short stint from May 7 to August 1. 

After the suspension, Dr. Des Hogan, Chairperson of the Irish DPC, emphasized that the regulator would continue working with its EU/EEA peers to ensure compliance with GDPR standards, affirming the DPC’s commitment to safeguarding citizens’ rights. The DPC’s decision has broader implications beyond its immediate impact on X. As AI technology rapidly evolves, questions about data ethics and transparency are increasingly urgent. This decision serves as a prompt for a necessary dialogue on the responsible use of personal data in AI development.  

To further address these issues, the DPC has requested an opinion from the European Data Protection Board (EDPB) regarding the legal basis for processing personal data in AI models, the extent of data collection permitted, and the safeguards needed to protect individual rights. This guidance is anticipated to set clearer standards for the responsible use of data in AI technologies. The DPC’s actions represent a significant step in regulating AI development, aiming to ensure that these powerful technologies are deployed ethically and responsibly. By setting a precedent for data privacy in AI, the DPC is helping shape a future where innovation and individual rights coexist harmoniously.

Tech Giants Face Backlash Over AI Privacy Concerns






Microsoft recently faced material backlash over its new AI tool, Recall, leading to a delayed release. Recall, introduced last month as a feature of Microsoft's new AI companion, captures screen images every few seconds to create a searchable library. This includes sensitive information like passwords and private conversations. The tool's release was postponed indefinitely after criticism from data privacy experts, including the UK's Information Commissioner's Office (ICO).

In response, Microsoft announced changes to Recall. Initially planned for a broad release on June 18, 2024, it will first be available to Windows Insider Program users. The company assured that Recall would be turned off by default and emphasised its commitment to privacy and security. Despite these assurances, Microsoft declined to comment on claims that the tool posed a security risk.

Recall was showcased during Microsoft's developer conference, with Yusuf Mehdi, Corporate Vice President, highlighting its ability to access virtually anything on a user's PC. Following its debut, the ICO vowed to investigate privacy concerns. On June 13, Microsoft announced updates to Recall, reinforcing its "commitment to responsible AI" and privacy principles.

Adobe Overhauls Terms of Service 

Adobe faced a wave of criticism after updating its terms of service, which many users interpreted as allowing the company to use their work for AI training without proper consent. Users were required to agree to a clause granting Adobe a broad licence over their content, leading to suspicions that Adobe was using this content to train generative AI models like Firefly.

Adobe officials, including President David Wadhwani and Chief Trust Officer Dana Rao, denied these claims and clarified that the terms were misinterpreted. They reassured users that their content would not be used for AI training without explicit permission, except for submissions to the Adobe Stock marketplace. The company acknowledged the need for clearer communication and has since updated its terms to explicitly state these protections.

The controversy began with Firefly's release in March 2023, when artists noticed AI-generated imagery mimicking their styles. Users like YouTuber Sasha Yanshin cancelled their Adobe subscriptions in protest. Adobe's Chief Product Officer, Scott Belsky, admitted the wording was unclear and emphasised the importance of trust and transparency.

Meta Faces Scrutiny Over AI Training Practices

Meta, the parent company of Facebook and Instagram, has also been criticised for using user data to train its AI tools. Concerns were raised when Martin Keary, Vice President of Product Design at Muse Group, revealed that Meta planned to use public content from social media for AI training.

Meta responded by assuring users that it only used public content and did not access private messages or information from users under 18. An opt-out form was introduced for EU users, but U.S. users have limited options due to the lack of national privacy laws. Meta emphasised that its latest AI model, Llama 2, was not trained on user data, but users remain concerned about their privacy.

Suspicion arose in May 2023, with users questioning Meta's security policy changes. Meta's official statement to European users clarified its practices, but the opt-out form, available under Privacy Policy settings, remains a complex process. The company can only address user requests if they demonstrate that the AI "has knowledge" of them.

The recent actions by Microsoft, Adobe, and Meta highlight the growing tensions between tech giants and their users over data privacy and AI development. As these companies navigate user concerns and regulatory scrutiny, the debate over how AI tools should handle personal data continues to intensify. The tech industry's future will heavily depend on balancing innovation with ethical considerations and user trust.


Slack Faces Backlash Over AI Data Policy: Users Demand Clearer Privacy Practices

 

In February, Slack introduced its AI capabilities, positioning itself as a leader in the integration of artificial intelligence within workplace communication. However, recent developments have sparked significant controversy. Slack's current policy, which collects customer data by default for training AI models, has drawn widespread criticism and calls for greater transparency and clarity. 

The issue gained attention when Gergely Orosz, an engineer and writer, pointed out that Slack's terms of service allow the use of customer data for training AI models, despite reassurances from Slack engineers that this is not the case. Aaron Maurer, a Slack engineer, acknowledged the need for updated policies that explicitly detail how Slack AI interacts with customer data. This discrepancy between policy language and practical application has left many users uneasy. 

Slack's privacy principles state that customer data, including messages and files, may be used to develop AI and machine learning models. In contrast, the Slack AI page asserts that customer data is not used to train Slack AI models. This inconsistency has led users to demand that Slack update its privacy policies to reflect the actual use of data. The controversy intensified as users on platforms like Hacker News and Threads voiced their concerns. Many felt that Slack had not adequately notified users about the default opt-in for data sharing. 

The backlash prompted some users to opt out of data sharing, a process that requires contacting Slack directly with a specific request. Critics argue that this process is cumbersome and lacks transparency. Salesforce, Slack's parent company, has acknowledged the need for policy updates. A Salesforce spokesperson stated that Slack would clarify its policies to ensure users understand that customer data is not used to train generative AI models and that such data never leaves Slack's trust boundary. 

However, these changes have yet to address the broader issue of explicit user consent. Questions about Slack's compliance with the General Data Protection Regulation (GDPR) have also arisen. GDPR requires explicit, informed consent for data collection, which must be obtained through opt-in mechanisms rather than default opt-ins. Despite Slack's commitment to GDPR compliance, the current controversy suggests that its practices may not align fully with these regulations. 

As more users opt out of data sharing and call for alternative chat services, Slack faces mounting pressure to revise its data policies comprehensively. This situation underscores the importance of transparency and user consent in data practices, particularly as AI continues to evolve and integrate into everyday tools. 

The recent backlash against Slack's AI data policy highlights a crucial issue in the digital age: the need for clear, transparent data practices that respect user consent. As Slack works to update its policies, the company must prioritize user trust and regulatory compliance to maintain its position as a trusted communication platform. This episode serves as a reminder for all companies leveraging AI to ensure their data practices are transparent and user-centric.

Teaching AI Sarcasm: The Next Frontier in Human-Machine Communication

In a remarkable breakthrough, a team of university researchers in the Netherlands has developed an artificial intelligence (AI) platform capable of recognizing sarcasm. According to a report from The Guardian, the findings were presented at a meeting of the Acoustical Society of America and the Canadian Acoustical Association in Ottawa, Canada. During the event, Ph.D. student Xiyuan Gao detailed how the research team utilized video clips, text, and audio content from popular American sitcoms such as "Friends" and "The Big Bang Theory" to train a neural network. 

The foundation of this innovative work is a database known as the Multimodal Sarcasm Detection Dataset (MUStARD). This dataset, annotated by a separate research team from the U.S. and Singapore, includes labels indicating the presence of sarcasm in various pieces of content. By leveraging this annotated dataset, the Dutch research team aimed to construct a robust sarcasm detection model. 

After extensive training using the MUStARD dataset, the researchers achieved an impressive accuracy rate. The AI model could detect sarcasm in previously unlabeled exchanges nearly 75% of the time. Further developments in the lab, including the use of synthetic data, have reportedly improved this accuracy even more, although these findings are yet to be published. 

One of the key figures in this project, Matt Coler from the University of Groningen's speech technology lab, expressed excitement about the team's progress. "We are able to recognize sarcasm in a reliable way, and we're eager to grow that," Coler told The Guardian. "We want to see how far we can push it." Shekhar Nayak, another member of the research team, highlighted the practical applications of their findings. 

By detecting sarcasm, AI assistants could better interact with human users, identifying negativity or hostility in speech. This capability could significantly enhance the user experience by allowing AI to respond more appropriately to human emotions and tones. Gao emphasized that integrating visual cues into the AI tool's training data could further enhance its effectiveness. By incorporating facial expressions such as raised eyebrows or smirks, the AI could become even more adept at recognizing sarcasm. 

The scenes from sitcoms used to train the AI model included notable examples, such as a scene from "The Big Bang Theory" where Sheldon observes Leonard's failed attempt to escape a locked room, and a "Friends" scene where Chandler, Joey, Ross, and Rachel unenthusiastically assemble furniture. These diverse scenarios provided a rich source of sarcastic interactions for the AI to learn from. The research team's work builds on similar efforts by other organizations. 

For instance, the U.S. Department of Defense's Defense Advanced Research Projects Agency (DARPA) has also explored AI sarcasm detection. Using DARPA's SocialSim program, researchers from the University of Central Florida developed an AI model that could classify sarcasm in social media posts and text messages. This model achieved near-perfect sarcasm detection on a major Twitter benchmark dataset. DARPA's work underscores the broader significance of accurately detecting sarcasm. 

"Knowing when sarcasm is being used is valuable for teaching models what human communication looks like and subsequently simulating the future course of online content," DARPA noted in a 2021 report. The advancements made by the University of Groningen team mark a significant step forward in AI's ability to understand and interpret human communication. 

As AI continues to evolve, the integration of sarcasm detection could play a crucial role in developing more nuanced and responsive AI systems. This progress not only enhances human-AI interaction but also opens new avenues for AI applications in various fields, from customer service to mental health support.

Microsoft Temporarily Blocks ChatGPT: Addressing Data Concerns

Microsoft recently made headlines by temporarily blocking internal access to ChatGPT, a language model developed by OpenAI, citing data concerns. The move sparked curiosity and raised questions about the security and potential risks associated with this advanced language model.

According to reports, Microsoft took this precautionary step on Thursday, sending ripples through the tech community. The decision came as a response to what Microsoft referred to as data concerns associated with ChatGPT.

While the exact nature of these concerns remains undisclosed, it highlights the growing importance of scrutinizing the security aspects of AI models, especially those that handle sensitive information. With ChatGPT being a widely used language model for various applications, including customer service and content generation, any potential vulnerabilities in its data handling could have significant implications.

As reported by ZDNet, Microsoft still needs to provide detailed information on the duration of the block or the specific data issues that prompted this action. However, the company stated that it is actively working with OpenAI to address these concerns and ensure a secure environment for its users.

This incident brings to light the continuous difficulties and obligations involved in applying cutting-edge AI models to practical situations. It is crucial to guarantee the security and moral application of these models as artificial intelligence gets more and more integrated into different businesses. Businesses must find a balance between protecting sensitive data and utilizing AI's potential.

It's important to note that instances like this add to the continuing discussion about AI ethics and the necessity of open disclosure about possible dangers. The tech titans' dedication to rapidly and ethically addressing issues is demonstrated by their partnership in tackling the data concerns through OpenAI and Microsoft.

Microsoft's recent decision to temporarily restrict internal access to ChatGPT highlights the dynamic nature of AI security and the significance of exercising caution while implementing sophisticated language models. The way the problem develops serves as a reminder that, in order to guarantee the ethical and secure use of AI technology, the tech community needs to continue being proactive in addressing possible data vulnerabilities.





Customized AI Models and Benchmarks: A Path to Ethical Deployment

 

As artificial intelligence (AI) models continue to advance, the need for industry collaboration and tailored testing benchmarks becomes increasingly crucial for organizations in their quest to find the right fit for their specific needs.

Ong Chen Hui, the assistant chief executive of the business and technology group at Infocomm Media Development Authority (IMDA), emphasized the importance of such efforts. As enterprises seek out large language models (LLMs) customized for their verticals and countries aim to align AI models with their unique values, collaboration and benchmarking play key roles.

Ong raised the question of whether relying solely on one large foundation model is the optimal path forward, or if there is a need for more specialized models. She pointed to Bloomberg's initiative to develop BloombergGPT, a generative AI model specifically trained on financial data. Ong stressed that as long as expertise, data, and computing resources remain accessible, the industry can continue to propel developments forward.

Red Hat, a software vendor and a member of Singapore's AI Verify Foundation, is committed to fostering responsible and ethical AI usage. The foundation aims to leverage the open-source community to create test toolkits that guide the ethical deployment of AI. Singapore boasts the highest adoption of open-source technologies in the Asia-Pacific region, with numerous organizations, including port operator PSA Singapore and UOB bank, using Red Hat's solutions to enhance their operations and cloud development.

Transparency is a fundamental aspect of AI ethics, according to Ong. She emphasized the importance of open collaboration in developing test toolkits, citing cybersecurity as a model where open-source development has thrived. Ong highlighted the need for continuous testing and refinement of generative AI models to ensure they align with an organization's ethical guidelines.

However, some concerns have arisen regarding major players like OpenAI withholding technical details about their LLMs. A group of academics from the University of Oxford highlighted issues related to accessibility, replicability, reliability, and trustworthiness (AART) stemming from the lack of information about these models.

Ong suggested that organizations adopting generative AI will fall into two camps: those opting for proprietary large language AI models and those choosing open-source alternatives. She emphasized that businesses focused on transparency can select open-source options.

As generative AI applications become more specialized, customized test benchmarks will become essential. Ong stressed that these benchmarks will be crucial for testing AI applications against an organization's or country's AI principles, ensuring responsible and ethical deployment.

In conclusion, the collaboration, transparency, and benchmarking efforts in the AI industry are essential to cater to specific needs and align AI models with ethical and responsible usage. The development of specialized generative AI models and comprehensive testing benchmarks will be pivotal in achieving these objectives.

Why is Skepticism the Best Protection When Adopting Generative AI?


It has become crucial for companies to implement generative artificial intelligence (AI) while minimizing potential hazards and with a healthy dose of skepticism. 

According to a Gartner report issued on Tuesday, 45% of firms are presently testing generative AI, while 10% have such technologies in use. During a webinar last month to examine the commercial costs and dangers of generative AI, 1,419 executives were polled.

In the recent survey, around 78% said that the advantages of generative AI exceeded its risks, compared to the 68% who felt the same way in the prior survey. 

According to Gartner, 22% of firms are expanding their generative AI investments across at least three different functions, with 45% of businesses doing so overall. Software development saw the biggest investment in or adoption of generative AI, at 21%, followed by marketing and customer service, at 19% and 16%, respectively.

Gartner’s group chief of research and an acclaimed analyst, "Organizations are not just talking about generative AI – they're investing time, money, and resources to move it forward and drive business outcomes."

"Executives are taking a bolder stance on generative AI as they see the profound ways that it can drive innovation, optimization, and disruption[…]Business and IT leaders understand that the 'wait and see' approach is riskier than investing," said Karamouzis.

Why is ‘Having a Doubt’ Necessary 

In order to grow their businesses companies must have a framework in place to ensure that they are adopting generative AI responsibly and ethically.

According to Kathy Baxter, Salesforce.com's principal architect of Responsible AI, skepticism should also be extended to technologies that can tell whether AI has been deployed.

Baxter further added that technology has now become ‘democratized,’ allowing anyone to have access to generative AI without many restrictions. However, despite the fact that many firms are making an attempt to screen out harmful information and are still investing in such initiatives, there is still a lack of knowledge regarding "how big a grain of salt" one should apply to AI-generated content.

Baxter noted that even AI detecting tools can make mistakes occasionally yet may be taken as always accurate in an interview with ZDNET, stressing that users accept all of this stuff as fact even if it is false. When generative AI and the tools that go along with it are employed in some fields, like education, these impressions could be detrimental since students might be falsely accused of employing AI in their work. 

She further raised concerns over such risks, urging individuals and organizations to use generative AI with ‘enough skepticism.’

She further highlighted the need for sufficient restrictions to ensure the safety and accuracy of AI. This will also help in case deployments are rolled out along with mitigation tools, she added. These can involve fault detection and reporting features, and mechanisms to collect and provide human feedback. 

Moreover, she emphasized the significance of the data used to train AI models and added that grounding AI is equally essential. But as she pointed out, not many businesses practice proper data hygiene.  

AI Models Produces Photos of Real People and Copyrighted Images


The infamous image generation models are used in order to produce identifiable photos of actual people. This leads to the privacy infringement of numerous individuals, as per a new research. 

The study demonstrates how these AI systems can be programmed to reproduce precisely copyrighted artwork and medical images. It is a result that might help artists who are suing AI companies for copyright violations.  

Research: Extracting Training Data from Diffusion Models 

Researchers from Google, DeepMind, UC Berkeley, ETH Zürich, and Princeton obtained their findings by repeatedly prompting Google’s Imagen with image captions, like the user’s name. Following this, they analyzed if any of the images they produced matched the original photos stored in the model's database. The team was successful in extracting more than 100 copies of photos from the AI's training set. 

These image-generating AI models are apparently produced over vast data sets, that consist of images with captions that have been taken from the internet. The most recent technology works by taking images in the data sets and altering pixels individually until the original image is nothing more than a jumble of random pixels. The AI model then reverses the procedure to create a new image from the pixelated mess. 

According to Ryan Webster, a Ph.D. student from the University of Caen Normandy, who has studied privacy in other image generation models but is not involved in the research, the study is the first to demonstrate that these AI models remember photos from their training sets. This could also serve as an implication for startups wanting to use AI models in health care since it indicates that these systems risk leaking users’ private and sensitive data. 

Eric Wallace, a Ph.D. scholar who was involved in the study group, raises concerns over the privacy issue and says they hope to raise alarm regarding the potential privacy concerns with these AI models before they are extensively implemented in delicate industries like medicine. 

“A lot of people are tempted to try to apply these types of generative approaches to sensitive data, and our work is definitely a cautionary tale that that’s probably a bad idea unless there’s some kind of extreme safeguards taken to prevent [privacy infringements],” Wallace says. 

Another major conflict between AI businesses and artists is caused by the extent to which these AI models memorize and regurgitate photos from their databases. Two lawsuits have been filed against AI by Getty Images and a group of artists who claim the company illicitly scraped and processed their copyrighted content. 

The researchers' findings will ultimately aid artists to claim that AI companies have violated their copyright. The companies may have to pay artists whose work was used to train Stable Diffusion if they can demonstrate that the model stole their work without their consent. 

According to Sameer Singh, an associate professor of computer science at the University of California, Irvine, these findings hold paramount importance. “It is important for general public awareness and to initiate discussions around the security and privacy of these large models,” he adds.