The aforementioned findings were made by researchers from the Department of Energy’s Pacific Northwest National Laboratory based on an abstract simulation of the digital conflict between threat actors and defenders in a network and trained four different DRL neural networks in order to expand rewards based on minimizing compromises and network disruption.
The simulated attackers transitions from the initial access and reconnaissance phase to other attack stages until they arrived at their objective, i.e. the impact and exfiltration phase. Apparently, these strategies were based on the classification of the MITRE ATT&CK architecture.
Samrat Chatterjee, a data scientist who presented the team's work at the annual meeting of the Association for the Advancement of Artificial Intelligence in Washington, DC, on February 14, claims that the successful installation and training of the AI system on the simplified attack surfaces illustrates the defensive responses to cyberattacks that, in current times, could be conducted by an AI model.
"You don't want to move into more complex architectures if you cannot even show the promise of these techniques[…]We wanted to first demonstrate that we can actually train a DRL successfully and show some good testing outcomes before moving forward," says Chatterjee.
AI Emerging as a New Trend in Cybersecurity
Machine learning (ML) and AI tactics have emerged as innovative trends to administer cybersecurity in a variety of fields. This development in cybersecurity has started from the early integration of ML in email security in the early 2010s to utilizing ChatGPT and numerous AI bots that we see today to analyze code or conduct forensic analysis. The majority of security products now incorporate a few features that are powered by machine learning algorithms that have been trained on massive datasets.
Yet, developing an AI system that is capable of proactive protection is still more of an ideal than a realistic approach. The PNNL research suggests that an AI defender could be made possible in the future, despite the many obstacles that still need to be addressed by researchers.
"Evaluating multiple DRL algorithms trained under diverse adversarial settings is an important step toward practical autonomous cyber defense solutions[…] Our experiments suggest that model-free DRL algorithms can be effectively trained under multistage attack profiles with different skill and persistence levels, yielding favorable defense outcomes in contested settings," according to a statement published by the PNNL researchers.
How the System Uses MITRE ATT&CK
The initial objective of the research team was to develop a custom simulation environment based on an open-source toolkit, Open AI Gym. Through this environment, the researchers created attacker entities with a range of skill and persistence levels that could employ a selection of seven tactics and fifteen techniques from the MITRE ATT&CK framework.
The attacker agents' objectives are to go through the seven attack chain steps—from initial access to execution, from persistence to command and control, and from collection to impact—in the order listed.
According to Chatterjee of PNNL, it can be challenging for the attacker to modify their strategies in response to the environment's current state and the defender's existing behavior.
"The adversary has to navigate their way from an initial recon state all the way to some exfiltration or impact state[…] We're not trying to create a kind of model to stop an adversary before they get inside the environment — we assume that the system is already compromised," says Chatterjee.
Not Ready for Prime Time
In the experiments, it was revealed that a particular reinforcement learning technique called a Deep Q Network successfully solved the defensive problem by catching 97% of the intruders in the test data set. Yet the research is just the beginning. Yet, security professionals should not look for an AI assistant to assist them with incident response and forensics anytime soon.
One of the many issues that are required to be resolved is getting RL and deep neural networks to explain the causes that affected their decision, an area of research called explainable reinforcement learning (XRL).
Moreover, the rapid emergence of AI technology and finding the most effective tactics to train the neutral network are both a challenge that needs to be addressed, according to Chatterjee.