Python backdoor attacks and how to prevent them
Python backdoor attacks are increasingly common. Iran, for example, used the MechaFlounder Python backdoor against Turkey last year. Scripting attacks are nearly as common as malware-based attacks in the United States and, according to the most recent CrowdStrike Global Threat Report, scripting is the most common attack vector in the EMEA region.
Python’s growing popularity among attackers shouldn’t come as a surprise. Python is a simple but powerful programming language. With very little effort, a hacker can write a script of fewer than 100 lines that establishes persistence (so that it restarts itself even if you kill the process), opens a backdoor, obfuscates communications both internally and with external servers, and sets up command and control links. And if an attacker doesn’t want to write the code, that’s no problem either. Python backdoor scripts are easy to find: a simple GitHub search turns up more than 200.
Scripting attacks are favored by cybercriminals and nation states because they are hard for endpoint detection and response (EDR) systems to detect. Python is heavily used by admins, so traffic from a malicious Python script closely resembles the traffic produced by day-to-day network management tools.
It’s also fairly easy to get these malevolent scripts onto targeted networks. Simply publish a copy of a commonly used library with a malicious script included and a name that differs by a single character (a technique known as typosquatting) and, sooner or later, someone will install it by mistake or include it as a dependency of some other library. That’s particularly insidious, given how enormous the list of dependencies can be in many libraries.
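One way to catch look-alike names before they are installed is to compare every requested dependency against the packages the team actually intends to use. The following is a minimal sketch, not a complete defense: the KNOWN_PACKAGES list, the find_lookalikes helper and the one-character threshold are all illustrative assumptions.

```python
# Hypothetical allowlist of packages the organization intends to use.
KNOWN_PACKAGES = {"requests", "numpy", "pandas", "urllib3", "cryptography"}


def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def find_lookalikes(requested, known=KNOWN_PACKAGES, max_distance=1):
    """Return (requested_name, known_name) pairs that are close but not equal."""
    hits = []
    for name in requested:
        normalized = name.lower()
        if normalized in known:
            continue  # exact match with an intended package
        for good in sorted(known):
            if edit_distance(normalized, good) <= max_distance:
                hits.append((name, good))
    return hits


if __name__ == "__main__":
    for bad, good in find_lookalikes(["requests", "requets", "numpi", "flask"]):
        print(f"{bad!r} looks like {good!r}; verify before installing")
```

A check like this could run against a requirements file in code review or CI. It only flags near-misses of names already on the list, so it complements rather than replaces reviewing the full dependency tree.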
By adding a bit of social engineering, attackers can successfully compromise specific targets. If an attacker knows the Stack Overflow usernames of some of the admins at a targeted organization, they can respond to a question with ready-to-copy Python code that looks completely benign. This works because many of us have been “trained” by software companies to copy and paste code to deploy their software. Everyone knows it isn’t safe, but admins are often pressed for time and do it anyway.
Anatomy of a Python backdoor attack
Now, let’s imagine a Python backdoor has established itself on your network. How will the attack play out?
First, it will probably try to establish persistence. There are many ways to do this, but one of the easiest is to add a crontab entry that restarts the script, even if it’s killed. To stop the process permanently, you’ll need to remove both the process and the crontab entry in the right sequence at the right time. Next, the script will connect to an external server to establish command and control, obfuscating communications so they look normal. That is relatively easy to do, since its traffic already resembles that of ordinary day-to-day operations.
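Because cron-based persistence leaves a visible trace, a defender can look for it. The sketch below scans crontab text and flags entries whose command runs a Python interpreter or a .py file. The suspicious_cron_entries helper and its matching pattern are assumptions made for illustration; legitimate admin jobs will match too, so the output is a review list, not a verdict.

```python
import re

# Assumed heuristic: a command that names a Python interpreter or a .py file.
PYTHON_PATTERN = re.compile(r"\bpython[\d.]*\b|\.py\b")


def suspicious_cron_entries(crontab_text):
    """Return (line_number, command) for cron entries that invoke Python."""
    flagged = []
    for line_no, raw in enumerate(crontab_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("@"):
            # Shortcut form such as "@reboot command"
            parts = line.split(None, 1)
            if len(parts) < 2:
                continue
            command = parts[1]
        else:
            # Standard form: five schedule fields followed by the command
            parts = line.split(None, 5)
            if len(parts) < 6:
                continue  # e.g. environment settings such as MAILTO=root
            command = parts[5]
        if PYTHON_PATTERN.search(command):
            flagged.append((line_no, command))
    return flagged


if __name__ == "__main__":
    sample = "\n".join([
        "# nightly backup",
        "0 2 * * * /usr/local/bin/backup.sh",
        "* * * * * /usr/bin/python3 /tmp/.cache/updater.py",
    ])
    for line_no, command in suspicious_cron_entries(sample):
        print(f"line {line_no}: {command}")
```

The input can be the output of crontab -l for each user. Entries that run every minute from temporary or hidden directories, as in the sample, deserve the closest look.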
At this point, the script can do pretty much anything an admin can do. Scripting attacks are often used as the tip of the spear for multi-layered attacks, in which the script downloads malware and installs it throughout the environment.
Fighting back against Python backdoors
Scripting attacks often bypass traditional perimeter and EDR defenses. Firewalls, for example, use approved network addresses to determine whether traffic is “safe,” but they can’t verify exactly what is communicating on either end. As a result, scripts can easily piggyback on approved firewall rules. As for EDR, traffic from malicious scripts is very similar to that produced by common admin tools, so there’s no clear signature for EDR defenses to detect.
The most efficient way to protect against scripting attacks is to adopt an identity-based zero trust approach. In a software identity-based approach, policies are not based on network addresses, but rather on a unique identity for each workload. These identities are based on dozens of immutable properties of the device, software or script, such as a SHA-256 hash of the binary, the UUID of the BIOS or a cryptographic hash of a script.
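To make one of those properties concrete, the sketch below computes the SHA-256 hash of a script with Python's standard hashlib module and checks it against a set of approved hashes. The function names and the idea of a simple in-memory set are assumptions for illustration; a real identity would combine many more properties than this single hash.

```python
import hashlib
from pathlib import Path


def script_fingerprint(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_approved(path, approved_hashes):
    """True only if the script's current contents match an approved hash."""
    return script_fingerprint(path) in approved_hashes
```

Because the hash covers the file's full contents, a script that has been modified, even by one byte, no longer matches its approved fingerprint.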
Any approach that’s based on network addresses cannot adequately protect the environment. Network addresses change frequently, especially in autoscaling environments such as the cloud or containers, and as mentioned earlier, attackers can piggyback on approved policies to move laterally.
With a software and machine identity-based approach, IT can create policies that explicitly state which devices, software and scripts are allowed to communicate with one another. All other traffic is blocked by default. As a result, malicious scripts would be automatically blocked from establishing backdoors, deploying malware or communicating with sensitive assets.
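The default-deny logic can be sketched in a few lines. In this illustration, the Identity and Policy classes are hypothetical, and each identity is reduced to a single fingerprint string as a simplifying assumption. Communication is permitted only when an explicit rule names both ends, and everything else is refused.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identity:
    """A workload identity; here just a label plus one fingerprint string."""
    name: str
    fingerprint: str


@dataclass
class Policy:
    """Default-deny policy keyed on identities rather than network addresses."""
    allowed: set = field(default_factory=set)

    def allow(self, source, destination):
        # Rules are directional: source may initiate traffic to destination.
        self.allowed.add((source.fingerprint, destination.fingerprint))

    def permits(self, source, destination):
        # Anything not explicitly allowed is blocked.
        return (source.fingerprint, destination.fingerprint) in self.allowed


if __name__ == "__main__":
    backup_tool = Identity("backup_tool", "sha256:aaaa")
    database = Identity("database", "sha256:bbbb")
    unknown_script = Identity("unknown_script", "sha256:cccc")

    policy = Policy()
    policy.allow(backup_tool, database)

    print(policy.permits(backup_tool, database))     # approved pair
    print(policy.permits(unknown_script, database))  # no rule, so blocked
```

Note that the unknown script is blocked even if it runs on the same host and address as the backup tool, which is the distinction from an address-based firewall rule.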
Scripts are rapidly becoming the primary vector for bad actors to compromise enterprise networks. By establishing and enforcing zero trust based on identity, enterprises can shut them down before they have a chance to establish themselves in the environment.