Red teaming in the AI era
As AI gets baked into enterprise tech stacks, AI applications are becoming prime targets for cyber attacks. In response, many cybersecurity teams are adapting existing cybersecurity practices to mitigate these new threats. One such practice measure is red teaming: the effort to expose weaknesses in a system and develop responses to found threats by playing the role of the enemy.
While this exercise is certainly an essential one, recent reports and anecdotal evidence show us that red teaming isn’t quite as straightforward when it comes to securing AI applications.
To effectively safeguard these new environments, cybersecurity teams need to understand the shifting nuances of red teaming in the context of AI. Understanding what’s changed with AI (and what hasn’t) is an essential starting point to guide red teaming efforts in the years ahead.
Why AI flips the red teaming script
In the pre-AI era, red teaming meant conducting a stealthy procedure of finding vulnerabilities and exploiting them, usually without a warning to the security team and with a specific goal in mind (e.g., accessing a server critical to business operations). But with the advent of AI, the red teaming process shifts. Instead of being a one-time ordeal with a singular goal, the process becomes far more frequent and widespread.
Unlike previous types of software, AI models become more intelligent over time. This constant change means new risks can emerge at any moment, making them incredibly difficult to anticipate. A one-and-done approach to red teaming simply won’t work. Because the abilities of these models increase over time, cyber teams are no longer red teaming a static model.
Another change: when you start working with a third-party LLM, all you can see is the model itself, not the data and code behind it. This is akin to assessing car trouble without being able to look under the hood, and it’s a sharp contrast to what we’re used to with traditional software.
Red teaming AI applications is no longer the simple process of having a checklist of things to look out for and going through it. To identify vulnerabilities, cyber teams need to constantly come up with creative ways to poke holes in models and closely monitor model behavior and output.
On top of that, teams must be painstakingly deliberate when red teaming an LLM with external plugins. The interconnectedness of LLMs requires that you red team the system in its entirety, starting with a very clear objective. For example, let’s say you want to make an LLM disclose sensitive information. Once you succeed in generating that vulnerability, you need to identify not just model weaknesses, but also system-wide safeguards for mitigating the downstream effects of that type of attack.
When dealing with AI, it’s not just about red teaming your models: red teaming interconnected applications matters, too. Only by broadening the lens of your red teaming efforts can you sufficiently identify potential vulnerabilities and proactively build operational protection around them.
Cybersecurity 101 still applies with AI
As with any type of software, red teaming alone is never enough. With LLMs in particular, operational protections are essential to attack prevention. There will be new threats every day, and you need structures and procedures that guard your applications at all times. Bottom line: AI security requires you to pull out all the stops, and all traditional cybersecurity practices have to remain in play.
For instance, you have to sanitize your databases. If you have an LLM that can access an internal database, make sure the data is scrubbed before it enters the model. Additionally, keep your access controls in check. LLMs should only have the least possible privilege to minimize damages in the event of a compromise.
Securing AI models is a completely new challenge that nearly every cybersecurity company will eventually need to tackle. Red teaming is a great place to start, but it also requires that you challenge your understanding of the red teaming process and complement those efforts with tried-and true security strategies. The more cybersecurity pros who master these nuances, the closer we’ll be to delivering on the promise of AI.