Attackers can conceal malicious code by injecting commands into the bytecode that software interpreters hold in memory, an approach that affects interpreters for many programming languages, such as VBScript and Python. A group of Japanese researchers will demonstrate the technique at next week's Black Hat USA conference.
Interpreters convert human-readable source code into bytecode: low-level instructions that the interpreter's virtual machine can execute. The research team managed to insert malicious instructions into the bytecode held in memory before execution. Because most security software does not scan bytecode, their changes went undetected.
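As a rough illustration of that conversion step, the following Python sketch compiles a one-line script and disassembles the result with the standard `compile` function and `dis` module. The script text and variable names are assumptions made for this example and are not taken from the researchers' work. The instruction names printed vary between Python versions.

```python
import dis

# Hypothetical one-line script, used only for illustration.
source = "total = 2 + 3\n"

# compile() turns source text into a code object that holds the bytecode.
code = compile(source, "<example>", "exec")

# The raw bytecode is an ordinary bytes object.
raw = code.co_code
print(type(raw).__name__, len(raw))

# dis decodes those bytes back into named virtual-machine instructions.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)

# The interpreter executes the code object, not the original text.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```

A tool that inspects only the source string never sees the `raw` bytes that the interpreter actually runs.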
This method could enable attackers to hide their malicious activities from most endpoint security software. Researchers from NTT Security Holdings Corp. and the University of Tokyo will showcase this capability using the VBScript interpreter, says Toshinori Usui, a research scientist at NTT Security. The researchers have confirmed that the technique also works for inserting malicious code into the in-memory processes of both the Python and Lua interpreters.
"Malware often hides its behavior by injecting malicious code into benign processes, but existing injection-type attacks have characteristic behaviors ... which are easily detected by security products," Usui says. "The interpreter does not care about overwriting by a remote process, so we can easily replace generated bytecode with our malicious code — it's that feature we exploit."
Bytecode attacks are uncommon but not entirely new. In 2018, researchers from the University of California at Irvine published a paper introducing bytecode attacks and defenses. Last year, the administrators of the Python Package Index (PyPI) removed a malicious package known as fshec2, which escaped initial detection because its malicious code was compiled as bytecode. Python compiles source code into bytecode and stores it in PYC files, which the Python interpreter can execute directly.
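One defensive response to precompiled bytecode is to flag PYC files that ship without matching source. The sketch below is a minimal example under stated assumptions: the function name `find_orphan_pyc` and the file names are hypothetical, and this is not how PyPI or any particular security product works.

```python
import py_compile
import tempfile
from pathlib import Path


def find_orphan_pyc(root: Path) -> list[Path]:
    """Return .pyc files under root that have no matching .py source file."""
    orphans = []
    for pyc in sorted(root.rglob("*.pyc")):
        if pyc.parent.name == "__pycache__":
            # Cached files are named like module.cpython-312.pyc
            module = pyc.name.split(".")[0]
            source = pyc.parent.parent / f"{module}.py"
        else:
            source = pyc.with_suffix(".py")
        if not source.exists():
            orphans.append(pyc)
    return orphans


# Build a throwaway package: one module with source, one bytecode-only module.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp)
    (pkg / "ok.py").write_text("x = 1\n")
    py_compile.compile(str(pkg / "ok.py"), cfile=str(pkg / "ok.pyc"))

    hidden = pkg / "hidden.py"
    hidden.write_text("y = 2\n")
    py_compile.compile(str(hidden), cfile=str(pkg / "hidden.pyc"))
    hidden.unlink()  # ship only the compiled file

    found = [p.name for p in find_orphan_pyc(pkg)]

print(found)  # ['hidden.pyc']
```

A check like this only reports that source is missing. It says nothing about what the bytecode does, so flagged files would still need to be disassembled or decompiled for review.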
"This may be the first supply chain attack to leverage the fact that Python bytecode (PYC) files can be directly executed, and it comes amid a spike in malicious submissions to the Python Package Index," Karlo Zanki, a reverse engineer at ReversingLabs, said in a June 2023 analysis of the incident. "If so, it poses yet another supply chain risk going forward, since this type of attack is likely to be missed by most security tools, which only scan Python source code (PY) files."
Beyond Precompiled Malware
After an initial compromise, attackers have several options to extend their control over a targeted system: They can perform reconnaissance, attempt further system compromise using malware, or use existing tools on the system — a strategy known as "living off the land."
The NTT researchers' bytecode attack technique falls into the last category. Instead of using precompiled bytecode files, their attack, called Bytecode Jiu-Jitsu, injects malicious bytecode into the memory space of a running interpreter. Because most security tools do not inspect bytecode in memory, the malicious commands escape detection.
This approach lets attackers avoid the more conspicuous steps of conventional code injection, such as calling suspicious APIs to create threads, allocating executable memory, and modifying instruction pointers, Usui explains.
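A small Python sketch suggests why no executable memory is needed: to the operating system, bytecode is ordinary data. The example stays entirely inside one process and uses only the standard `marshal` module, so it does not reproduce the researchers' cross-process technique. The variable names are assumptions for illustration.

```python
import marshal

# Compile a trivial, harmless statement to a code object.
code = compile("result = 6 * 7\n", "<example>", "exec")

# Serialize the code object to a plain byte string.
blob = marshal.dumps(code)

# Hold it in an ordinary writable buffer; no executable memory is involved.
buffer = bytearray(blob)

# The interpreter can rebuild and run the code from those data bytes.
restored = marshal.loads(bytes(buffer))
namespace = {}
exec(restored, namespace)
print(namespace["result"])  # 42
```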
"While native code has instructions directly executed by the CPU, bytecode is just data to the CPU and is interpreted and executed by the interpreter," he says. "Therefore, unlike native code, bytecode does not require execution privilege, [and our technique] does not need to prepare a memory region with execution privilege."
Improving Interpreter Defenses
Interpreter developers, security tool developers, and operating system architects can all help mitigate this problem. Bytecode attacks do not exploit vulnerabilities in interpreters but rather the way interpreters execute code. Even so, certain security measures, such as pointer checksums, could reduce the risk, according to the UC Irvine paper.
The NTT Security researchers noted that checksum defenses would likely be ineffective against their techniques and recommend that developers enforce write protections to mitigate the risk. "The ultimate countermeasure is to restrict the memory write to the interpreter," Usui says.
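To make the idea of a checksum defense concrete, here is a minimal in-process sketch in Python. It is an analogy only: it is not the pointer checksum scheme from the UC Irvine paper, the helper name `fingerprint` is hypothetical, and, as the NTT researchers note, a check of this kind would likely not stop their technique.

```python
import hashlib


def fingerprint(func) -> str:
    """Hash a function's bytecode together with its constants and names."""
    code = func.__code__
    material = code.co_code + repr((code.co_consts, code.co_names)).encode()
    return hashlib.sha256(material).hexdigest()


def greet():
    return "hello"


def other():
    return "tampered"


# Record a baseline while the function is known to be good.
baseline = fingerprint(greet)
unchanged = fingerprint(greet) == baseline

# Simulate tampering inside the same process by swapping the code object.
greet.__code__ = other.__code__
detected = fingerprint(greet) != baseline

print(unchanged, detected, greet())  # True True tampered
```

The check detects the swap only because it runs after the change and before the next call. An attacker who can write to the interpreter's memory could also overwrite the stored baseline or act between the check and the call, which is one reason restricting memory writes is the stronger control.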
The purpose of presenting a new attack technique is to show security researchers and defenders what is possible, not to inform attackers' strategies, Usui emphasizes. "Our goal is not to abuse defensive tactics, but to ultimately be an alarm bell for security researchers around the world," he says.