
Data Poisoning: The Hidden Threat to AI Models



As artificial intelligence and machine learning continue to develop at a rapid pace, a new form of attack is emerging that can quietly undermine the systems we rely on today: data poisoning. This type of attack tampers with the data used to train AI models so that they malfunction, often in ways that are hard to detect. The threat gained attention recently when JFrog, a software supply chain company, uncovered more than 100 malicious models on Hugging Face, the popular AI model repository.

What is Data Poisoning?

Data poisoning is an attack on AI models that corrupts the data used to train them, with the goal of making the model produce wrong predictions or decisions. Unlike traditional hacking, it doesn't require access to the underlying system: the attacker manipulates input data either before an AI model is deployed or after deployment, which makes it very difficult to detect.

One variant happens at the training phase, when an attacker injects malicious data into the model's training set. Another happens post-deployment, when poisoned data is fed to the running AI and produces wrong outputs. Both kinds of attacks are hard to detect and degrade the AI system over time.
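
As a toy illustration of the training-time variant, the sketch below (purely hypothetical, not code from the JFrog findings) silently flips the labels of a small fraction of samples; a model trained on the tainted set absorbs the attacker's bias even though the dataset still looks plausible on casual inspection:

```python
import random

def poison_labels(dataset, fraction=0.05, target_label=1, flipped_label=0, seed=0):
    """Toy training-time poisoning: silently mislabel a small fraction of samples."""
    rng = random.Random(seed)
    poisoned = []
    for features, label in dataset:
        if label == target_label and rng.random() < fraction:
            label = flipped_label   # this sample now teaches the model the wrong thing
        poisoned.append((features, label))
    return poisoned

# A model trained on poison_labels(clean_data) quietly inherits the attacker's bias,
# yet spot-checking individual rows reveals nothing obviously malicious.
```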

In its research, JFrog found a number of suspicious models uploaded to Hugging Face, a community where users share AI models. The models contained encoded malicious code, which the researchers believe may have been embedded by attackers potentially operating from the KREOnet research network in Korea. Most worrying, these malicious models went undetected by masquerading as benign.

This is a serious threat because modern AI systems ingest large amounts of data from many sources, including the open internet. If attackers manage to alter the data a model is trained on, the consequences range from misleading results to large-scale cyberattacks.

Why It's Hard to Detect

One of the major challenges with data poisoning is that AI models are built from enormous datasets, so researchers rarely know exactly what has gone into a model. That lack of visibility gives attackers room to slip poisoned data in without being caught.

It gets worse: AI systems that continuously scrape data from the web to update themselves could end up poisoning their own training data, raising the alarming possibility of a gradual breakdown known as "degenerative model collapse."

The Consequences of Ignoring the Threat

If left unmitigated, data poisoning could allow attackers to plant stealthy backdoors in AI software that let them carry out malicious actions or make an AI system behave in unexpected ways. In practice, that means running malicious code, enabling phishing, and rigging AI predictions for a variety of nefarious purposes.

The cybersecurity industry must treat this as a serious threat as organisations become more dependent on interconnected generative AI and large language models. Failing to do so risks widespread vulnerability across the entire digital ecosystem.

How to Defend Against Data Poisoning

Protecting AI models against data poisoning calls for vigilance throughout the AI development cycle. Experts say organisations should train their models only on data from sources they can trust. The Open Web Application Security Project (OWASP) has published best practices for avoiding data poisoning, including frequent checks for biases and abnormalities in the training data.
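
One such check can be as simple as screening numeric training features for statistical outliers before they reach the model. The sketch below is a minimal illustration, assuming tabular numeric data; the z-score threshold is an arbitrary choice, not an OWASP-mandated value:

```python
import numpy as np

def flag_outliers(features: np.ndarray, z_threshold: float = 4.0) -> np.ndarray:
    """Return a boolean mask of training rows whose features deviate strongly
    from the dataset mean, for manual review before training starts."""
    mean = features.mean(axis=0)
    std = features.std(axis=0) + 1e-9             # avoid division by zero
    z_scores = np.abs((features - mean) / std)    # per-feature z-scores
    return (z_scores > z_threshold).any(axis=1)

# Example: a planted abnormal row is caught before the data reaches training.
data = np.random.default_rng(0).normal(size=(1000, 8))
data[42] += 50                                     # simulate an injected outlier
print(np.where(flag_outliers(data))[0])            # flags the planted row (index 42)
```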

Another recommendation is to run multiple AI models that verify results against each other to surface inconsistencies. If a model starts producing strange results, fallback mechanisms should be in place to prevent harm.
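
A rough sketch of that cross-checking idea, assuming several independently trained classifiers that expose a `predict` method returning a class label (the function name and fallback behaviour are illustrative assumptions, not an OWASP-specified design):

```python
from collections import Counter

def cross_checked_prediction(models, sample, fallback=None):
    """Query several independently trained models and only trust the answer
    when they all agree; otherwise flag the sample and use a safe fallback."""
    votes = [model.predict(sample) for model in models]   # assumes hashable labels
    winner, count = Counter(votes).most_common(1)[0]
    if count == len(votes):
        return winner                                      # unanimous: accept it
    # Disagreement may indicate drift or poisoning: log it for review.
    print(f"models disagree on sample: {votes}")
    return fallback if fallback is not None else winner
```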

Defences also include simulated data poisoning attacks, run by cybersecurity teams to test the robustness of their AI systems. While it is hard to build an AI system that is 100% secure, frequent validation of a model's outputs goes a long way toward detecting and preventing poisoning.

Creating a Secure Future for AI

As AI keeps evolving, trust in these systems has to be built deliberately. That is only possible when the entire AI ecosystem, including its supply chains, is covered by the cybersecurity framework. Monitoring model inputs and outputs for unusual or irregular behaviour is a practical starting point, and it helps organisations build more robust and trustworthy AI models.

Ultimately, the future of AI depends on our ability to keep pace with emerging threats like data poisoning. Businesses that take proactive steps to secure their AI systems today protect themselves from one of the most serious challenges facing the digital world.

The bottom line is that AI security is not just about algorithms; it's about the integrity of the data powering those algorithms.


 

22,000 PyPI Packages Affected by Revival Hijack Supply-Chain Attack

 


Hackers can distribute malicious payloads easily and efficiently through the Python Package Index (PyPI) by abusing a simple but troublesome loophole. JFrog security researchers have uncovered a new supply chain attack technique that targets the PyPI repository.

Hundreds of thousands of software packages, and countless users downstream, could potentially be affected. The technique, known as "Revival Hijack," exploits a policy loophole that lets attackers re-register the names of packages that their original developers have removed from PyPI, effectively hijacking those names.

The technique has already been observed in the wild and is designed to infiltrate downstream organizations: an attacker registers a new project under a name that matches a package previously removed from PyPI and uses it as an attack vector.

A threat actor who manages this can distribute malicious code to developers who periodically pull updates. JFrog, the software supply chain security firm that codenamed the method Revival Hijack, estimates that around 22,000 existing PyPI packages are susceptible to hijacking, which could result in hundreds of thousands of malicious package downloads.

The packages most at risk are those with more than 100,000 downloads or at least six months of activity before removal. Revival Hijack takes advantage of the fact that victims unknowingly update once-safe packages without realising they have been altered or compromised, and that many CI/CD machines are configured to install package updates automatically as soon as they become available.

JFrog discovered a similar attack technique earlier this year, one of several that adversaries have developed in recent years to sneak malware into enterprise environments and steal sensitive data through public code repositories such as PyPI, npm, Maven Central, NuGet, and RubyGems. These attacks have involved cloning and infecting popular repositories, poisoning artifacts, and leveraging leaked secrets such as private keys and database certificates.

According to JFrog researchers Brian Moussalli and Andrey Polkovnichenko, the risk here is much higher than in previous supply chain attacks, which relied primarily on typosquatting and human error to distribute malicious code. When a developer deletes a project from PyPI, they are shown a warning about the potential repercussions, including the Revival Hijack scenario.

The dialogue warns that deleting the project will make its name available to any other PyPI user, and that this user will be able to make new releases under the project name as long as the distribution filenames do not match those of a previously released distribution. Depending on the attacker's motive, the Revival Hijack vector can therefore result in hundreds of thousands of malicious package downloads.

In practice, the technique boils down to exploiting abandoned package names to spread malware, and researchers observed it in action with the hijack of the "pingdomv3" package. JFrog's own replacement packages, discussed further below, were published with the version number 0.0.0.1 to avoid a dependency confusion scenario in which pip upgrade commands run by developers would otherwise pull them in.
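
To see why a 0.0.0.1 placeholder stays out of the upgrade path while an attacker's inflated version does not, consider this small comparison using the `packaging` library that pip itself builds on (the version numbers are made up for the example):

```python
from packaging.version import Version

installed = Version("2.3.1")      # what users already have from the original author
placeholder = Version("0.0.0.1")  # JFrog-style safe placeholder release
hijacked = Version("99.0.0")      # the kind of inflated version an attacker publishes

# `pip install --upgrade` only moves to a strictly newer version, so the
# placeholder is never pulled in, while an inflated release always would be.
print(placeholder > installed)    # False -> placeholder is ignored by upgrades
print(hijacked > installed)       # True  -> hijacked release gets auto-installed
```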

Revival Hijack has already been exploited in the wild: an unknown threat actor going by Jinnis published a benign version of the "pingdomv3" package on March 30, 2024, just two days after the original owner (cheneyyan) removed it from PyPI. The new developer then reportedly released an update containing a Base64-encoded payload that checks for the presence of the "JENKINS_URL" environment variable and, if it exists, executes an unknown next-stage module retrieved from a remote server.

Although JFrog put its own name reservations in place purely as a precaution, those replacement packages have received nearly 200,000 downloads over the last three months, from both manual and automated installs, proving that the Revival Hijack threat is very real, the company said. Analysing this data, JFrog reported that outdated jobs and scripts are still looking for the deleted packages, alongside users who downloaded them manually due to typosquatting.

Depending on the attacker's goal, the adversaries may attach an inflated version number to a hijacked package, causing CI/CD systems to download it automatically in the belief that it is the latest release and ultimately to pull in the attacker's payload, JFrog explained. The company has also recommended that PyPI prohibit the reuse of abandoned package names.

Organizations that use PyPI need to be aware of this attack vector when updating to new versions of a package, JFrog warns. PyPI maintains a non-public blacklist that prevents certain names from being registered for new projects, but most deleted packages never make it onto that list. Because of this, the security firm took indirect measures to mitigate the Revival Hijack threat, re-registering the most popular of the deleted, vulnerable package names under an account called security_holding where they can be monitored.

By setting the version number of these placeholder packages to 0.0.0.1, the researchers ensured that active users would not pick them up when updating. The package names are thus preserved and are no longer available to malicious actors who might want to use them offensively. Three months later, JFrog found that the packages in its holding account had nevertheless been downloaded nearly 200,000 times through automated scripts or user error. Revival Hijack carries considerably more risk than standard typosquatting attacks on PyPI.

That is because, unlike typosquatting, the victim makes no mistake: users pulling updates are requesting exactly the package they originally chose. The best mitigations are to pin packages to a known secure version, verify package integrity, audit package contents, and watch for any change in package ownership or unusual updates.
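
As a minimal sketch of the pinning-and-integrity advice, the snippet below checks a downloaded package archive against a digest recorded alongside the pinned version (the file name and digest are placeholders); in practice, pip's hash-checking mode achieves the same effect with --hash entries in requirements.txt installed under --require-hashes:

```python
import hashlib

# Pin the exact version AND the expected archive digest (placeholder values here).
PINNED_ARCHIVES = {
    "somepackage-2.3.1.tar.gz": "<sha256-hex-recorded-from-a-trusted-build>",
}

def sha256_of(path: str) -> str:
    """Compute the SHA-256 digest of a package archive on disk."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_archive(path: str, filename: str) -> bool:
    """Reject any archive whose digest differs from the pinned value,
    e.g. a re-registered package uploaded under the same name."""
    return PINNED_ARCHIVES.get(filename) == sha256_of(path)
```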