Warning: PyPI Feature Executes Code Automatically After Python Package Download


In another finding that could expose developers to increased risk of a supply chain attack, it has emerged that nearly one-third of the packages on PyPI, the Python Package Index, trigger automatic code execution when they are downloaded.

“A worrying feature in pip/PyPI allows code to automatically run when developers are merely downloading a package,” Checkmarx researcher Yehuda Gelb said in a technical report published this week.

“Also, this feature is alarming due to the fact that a great deal of the malicious packages we are finding in the wild use this feature of code execution upon installation to achieve higher infection rates.”

One way to install Python packages is to run the “pip install” command, which in turn invokes a file called “setup.py” that comes bundled with the module.

“setup.py,” as the name implies, is a setup script that’s used to specify metadata associated with the package, including its dependencies.

While threat actors have resorted to incorporating malicious code in the setup.py file so that it runs at install time, Checkmarx found that the same outcome can be triggered when a developer merely runs the “pip download” command.

“pip download does the same resolution and downloading as pip install, but instead of installing the dependencies, it collects the downloaded distributions into the directory provided (defaulting to the current directory),” the documentation reads.

In other words, the command can be used to download a Python package without having to install it on the system. But as it turns out, executing the download command also runs the aforementioned “setup.py” script, resulting in the execution of malicious code contained within it.
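To illustrate why executing a setup script is risky, here is a minimal sketch in Python. It assumes a throwaway, hypothetical setup.py whose only top-level statement writes a marker file, and it runs that script directly with the interpreter to stand in for a build tool; pip itself is not invoked and the names are illustrative only.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical setup script: the top-level statement stands in for
# arbitrary code an attacker might place in setup.py.
SETUP_PY = '''\
from pathlib import Path

# Runs as soon as the file is executed, before any packaging logic.
Path("side_effect.txt").write_text("setup.py was executed")
'''


def demo_setup_side_effect() -> bool:
    """Execute a throwaway setup.py and report whether its side effect occurred."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / "setup.py").write_text(SETUP_PY)
        # Simulates a build tool running the script; pip is not involved here.
        subprocess.run([sys.executable, "setup.py"], cwd=project, check=True)
        return (project / "side_effect.txt").exists()


if __name__ == "__main__":
    print(demo_setup_side_effect())
```

The point of the sketch is that a setup script is ordinary Python: whatever sits at its top level runs with the privileges of the user who triggered the build.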

However, it’s worth noting that the issue occurs only when the package is distributed as a tar.gz source archive rather than as a wheel (.whl) file, which “cuts the ‘setup.py’ execution out of the equation.”
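One way to check which case applies before fetching anything is to look at the release metadata that PyPI serves as JSON. The sketch below assumes the metadata has already been retrieved and parsed into a dictionary with a “urls” list whose entries carry a “packagetype” field; the function name and the sample data are hypothetical.

```python
from typing import Any, Mapping


def has_wheel(release_metadata: Mapping[str, Any]) -> bool:
    """Return True if any file in the release is a wheel (packagetype 'bdist_wheel')."""
    files = release_metadata.get("urls", [])
    return any(f.get("packagetype") == "bdist_wheel" for f in files)


# Hypothetical metadata for two made-up releases.
sdist_only = {
    "urls": [{"filename": "example_pkg-1.0.tar.gz", "packagetype": "sdist"}]
}
with_wheel = {
    "urls": [
        {"filename": "example_pkg-1.0.tar.gz", "packagetype": "sdist"},
        {"filename": "example_pkg-1.0-py3-none-any.whl", "packagetype": "bdist_wheel"},
    ]
}

if __name__ == "__main__":
    print(has_wheel(sdist_only), has_wheel(with_wheel))
```

A release for which this returns False can only be obtained as a source archive, which is the situation described above.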

“Developers opting to download, instead of installing packages, are reasonably expecting that no code will run on the machine upon downloading the files,” Gelb noted, characterizing it as a design issue rather than a bug.

Although pip defaults to using wheels instead of tar.gz files, an attacker could take advantage of this behavior by intentionally publishing Python packages without a .whl file, leading to the execution of the malicious code present in the setup script.

“When a user downloads a python package from PyPi, pip will preferentially use the .whl file, but will fall back to the tar.gz file if the .whl file is lacking,” Gelb said.
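Developers who want to avoid that fallback can ask pip to accept wheels only. The following sketch builds such a command using pip’s “--only-binary :all:” option, with which pip is expected to fail rather than fetch a source archive. The helper name, package name, and destination directory are hypothetical, and the command is only constructed here, not executed.

```python
import sys
from typing import List


def wheel_only_download_cmd(package: str, dest: str = ".") -> List[str]:
    """Build a 'pip download' command that refuses source distributions."""
    return [
        sys.executable, "-m", "pip", "download",
        "--only-binary", ":all:",  # accept wheels only, for the package and its dependencies
        "--dest", dest,            # directory that receives the downloaded files
        package,
    ]


if __name__ == "__main__":
    # Pass the list to subprocess.run(...) to actually perform the download.
    print(" ".join(wheel_only_download_cmd("example-pkg", "downloads")))
```

This is a mitigation sketch, not a complete defense: it only prevents setup.py from running during the download and says nothing about what the wheel’s code does once it is installed and imported.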

The findings come as the U.S. National Security Agency (NSA), along with the Cybersecurity and Infrastructure Security Agency (CISA) and the Office of the Director of National Intelligence (ODNI), released guidance for securing the software supply chain.

“As the cyber threat continues to become more sophisticated, adversaries have begun to attack the software supply chain, rather than rely on publicly known vulnerabilities,” the agencies said. “Until all DevOps are DevSecOps, the software development lifecycle will be at risk.”