MITRE partners with Microsoft to address generative AI security risks
MITRE and Microsoft have added a data-driven generative AI focus to MITRE ATLAS, a community knowledge base that security professionals, AI developers, and AI operators can use as they protect AI-enabled systems.
This framework update and its associated case studies directly address unique vulnerabilities of systems that incorporate generative AI and large language models (LLMs), such as ChatGPT and Bard.
The updates to MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) are intended to realistically describe the growing number and variety of attack pathways in LLM-enabled systems, which consumers and organizations are rapidly adopting. Such characterizations of realistic attack pathways can be used to strengthen defenses against malicious attacks across a variety of consequential applications of AI, including healthcare, finance, and transportation.
“Many are concerned about security of AI-enabled systems beyond cybersecurity alone, including large language models,” said Ozgur Eris, managing director of MITRE’s AI and Autonomy Innovation Center. “Our collaborative efforts with Microsoft and others are critical to advancing ATLAS as a resource for the nation.”
“Microsoft and MITRE worked with the ATLAS community to launch the first version of the ATLAS framework for tabulating attacks on AI systems in 2020, and ever since, it has become the de facto Rosetta Stone for security professionals to make sense of this ever-shifting AI security space,” said Ram Shankar Siva Kumar, Data Cowboy at Microsoft. “Today’s latest ATLAS evolution to include more LLM attacks and case studies underscores the framework’s incredible relevance and utility.”
MITRE ATLAS is a globally accessible, living knowledge base of adversary tactics and techniques based on real-world attack observations and realistic demonstrations from AI red teams and security groups. The ATLAS project involves global collaboration with well over 100 government, academic, and industry organizations.
Under that collaboration umbrella, MITRE and Microsoft have worked together to expand ATLAS and develop tools based on the framework to enable industry, government, and academia as we all work to increase the security of our AI-enabled systems.
These new ATLAS tactics and techniques are grounded in case studies of 2023 incidents discovered by users or security researchers, including:
- ChatGPT plugin privacy leak: Uncovered an indirect prompt injection vulnerability within ChatGPT, where an attacker can feed a malicious website to ChatGPT through a plugin to take control of a chat session and exfiltrate the conversation history.
- PoisonGPT: Demonstrated how to successfully modify a pre-trained LLM to return false facts. As part of this demonstration, the poisoned model was uploaded to the largest publicly-accessible model hub to illustrate the consequences posed to the LLM’s supply chain. As a result, users who downloaded the poisoned model were at risk of receiving and spreading misinformation.
- MathGPT code execution: Exposed a vulnerability within MathGPT—which uses GPT-3 to answer math questions—to prompt injection attacks, allowing an actor to gain access to the host system’s environment variables and the app’s GPT-3 API key. This could enable a malicious actor to charge MathGPT’s GPT-3 account for their own use, causing financial harm, or to mount a denial-of-service attack that could hurt MathGPT’s performance and reputation. The vulnerabilities were mitigated after disclosure.
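The MathGPT case illustrates a general pattern: when an application executes code derived from LLM output, a prompt injection becomes code execution on the host. The sketch below is illustrative only (the function names and payload are hypothetical, not MathGPT's actual code); it contrasts the unsafe pattern with one that restricts evaluation to literal arithmetic.

```python
import ast

# UNSAFE pattern (illustrative): evaluating model output directly.
# A prompt-injected "math answer" can be arbitrary Python, e.g. code
# that reads secrets such as API keys from the environment:
injected = "__import__('os').environ.get('OPENAI_API_KEY', 'leaked')"

def unsafe_answer(model_generated_code: str) -> str:
    # eval() on attacker-influenced text is the root cause in this pattern.
    return str(eval(model_generated_code))

# Safer pattern: parse first, and allow only arithmetic syntax nodes.
def safe_answer(model_generated_code: str) -> str:
    tree = ast.parse(model_generated_code, mode="eval")
    allowed = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
               ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.USub)
    for node in ast.walk(tree):
        if not isinstance(node, allowed):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
    # No builtins are exposed, so even a missed node has little to reach.
    return str(eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}}))

print(safe_answer("2 + 3 * 4"))  # prints 14
try:
    safe_answer(injected)        # function calls/attributes are rejected
except ValueError as e:
    print("blocked:", e)
```

Real mitigations go further (sandboxed interpreters, least-privilege API keys, output filtering), but the core idea is the same: treat model output as untrusted input.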
The broader ATLAS community of industry, government, academia, and other security researchers also provided feedback to shape and inform these new tactics and techniques.
The ATLAS community collaboration will now focus on incident and vulnerability sharing to continue growing the community’s anonymized dataset of real-world attacks and vulnerabilities observed in the wild. This work has also expanded to incorporate incidents in the broader AI assurance space, including AI equitability, interpretability, reliability, robustness, safety, and privacy enhancement.
The ATLAS community is also sharing information on addressing supply chain issues, including AI bill of materials (BOM), model signing, and provenance best practices, through the ATLAS GitHub page and Slack channel, which are open to the public.
The community will use these Slack and GitHub forums to share what is currently working in member organizations, so that AI supply chain risk mitigation practices and techniques can be better aligned.
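Model signing targets exactly the supply-chain tampering the PoisonGPT case study demonstrated: a consumer verifies a downloaded artifact against a publisher-attested digest before loading it. The sketch below shows only the hash-verification half, with illustrative file names and an inline digest; real deployments distribute digests through a signed manifest (for example via a signing service) rather than hardcoding them.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large model weights need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_model(path: Path, expected_digest: str) -> None:
    """Refuse to load weights whose digest differs from the published one."""
    actual = sha256_of(path)
    if actual != expected_digest:
        raise RuntimeError(
            f"model tampered or corrupted: {actual} != {expected_digest}")

# Illustrative usage: the digest would come from a trusted, signed source.
weights = Path("model.bin")
weights.write_bytes(b"pretend these are model weights")
published = sha256_of(weights)
verify_model(weights, published)      # unmodified file passes

weights.write_bytes(b"poisoned weights")
try:
    verify_model(weights, published)  # tampered file is rejected
except RuntimeError as e:
    print("blocked:", e)
```

A digest alone only detects tampering after publication; pairing it with a signature over the manifest ties the artifact to the publisher's identity, which is what the model-signing and provenance discussions in the ATLAS community aim to standardize.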