Google’s AI Red Team: Advancing cybersecurity on the AI frontier
With the rise of ML, traditional red teams tasked with probing and exposing security vulnerabilities found themselves facing a new set of challenges that required a deep and comprehensive understanding of machine learning. Google’s recent announcement about the formation of a dedicated AI Red Team has raised curiosity and interest within the tech community.
In this Help Net Security interview, Daniel Fabian, Head of Google Red Teams, shares insights into the significance of his team, the challenges they face, and the impact they are making in securing AI-driven technologies.
Recently, Google unveiled the creation of a dedicated AI red team. How does this differ from a traditional red team, and why was it necessary to have a separate AI red team?
The key difference is the expertise on the two teams. Attacking ML systems requires a very deep and detailed understanding of machine learning technology, which was not the focus of the traditional red team. It is a highly specialized skillset, and it would be hard to find people who can both hack into systems and poison models. However, it’s possible to pair people who can do the hacking with people who can do the adversarial ML work to achieve the same result.
As mentioned in the article, the two teams are organizationally closely aligned (Daniel, the author of the article, leads both teams) and often collaborate on exercises. We have found that combining classic security attack vectors with new ML-specific tactics, techniques and procedures (TTPs) is an excellent way to proactively identify and resolve issues, resulting in safer ML deployments. The teams cross-pollinate ideas and skillsets and come up with innovative attack vectors.
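To make the idea of ML-specific TTPs concrete, here is a minimal, illustrative sketch (not Google’s tooling) of one widely published adversarial-ML technique, the fast gradient sign method (FGSM), which perturbs an input within a small budget to push a model toward misclassification. The model and data below are stand-ins for illustration only.

```python
# Illustrative sketch only: the fast gradient sign method (FGSM), one well-known
# adversarial-ML technique of the kind an AI red team might pair with classic
# attack vectors. The model and inputs are toy stand-ins.
import torch
import torch.nn as nn

def fgsm_perturb(model: nn.Module, x: torch.Tensor, label: torch.Tensor,
                 epsilon: float = 0.03) -> torch.Tensor:
    """Return an adversarially perturbed copy of x that nudges the model
    toward misclassifying it, while staying within an epsilon budget."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), label)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to a valid range.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

# Toy usage with a stand-in linear "model" over flattened 8x8 inputs.
model = nn.Linear(64, 10)
x = torch.rand(1, 64)
label = torch.tensor([3])
x_adv = fgsm_perturb(model, x, label)
print((x_adv - x).abs().max())  # perturbation stays within epsilon
```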
Google’s AI red team has key goals to advance its mission. Can you discuss your specific strategies or initiatives to achieve these goals?
The AI red team closely follows both newly published adversarial research and the areas where Google is integrating AI into products. We’re prioritizing AI red team exercises where we simulate an adversary pursuing specific goals. The results of these exercises are documented and shared with relevant stakeholders such as detection and response teams, the teams who own the targeted systems, and Google leadership.
We also take a broad approach to addressing findings from red team exercises, applying lessons not just to the targeted areas, but more broadly to all products that can benefit. Each exercise that is executed gets us closer to achieving our goals.
How does the AI red team specifically target AI deployments? What are some of the unique challenges posed by these AI systems?
At the beginning of an exercise, the AI red team sets up a scenario describing who the simulated attacker is, what their capabilities are, and what goals they would like to achieve. To come up with this scenario, the team relies on threat intelligence. Since threat intelligence is still a bit scarce in this space, the team also augments it with what it believes attackers could target in the future, based on its understanding of the field. From there, the team starts brainstorming how a real adversary might be able to achieve these goals.
As with security red teams, it’s often not possible to achieve a realistic goal in a single step, so the team frequently executes multiple attacks in sequence, each step bringing it closer to the goal.
In traditional exercises, when a potential issue is identified, the team can usually start trying to exploit it right away. For ML exercises, there is often a need for more research into how theoretical attacks can be applied in practice.
Could you discuss collaboration between the red team and AI experts for realistic adversarial simulations?
Some of the tactics, techniques and procedures (TTPs) we use in exercises to target AI deployments, which are mentioned in the report, require specific internal access that an external attacker would not have. In those cases, our AI Red Team collaborates with the security red team to get into that position. We can’t speak to specific exercises, but here is a fictional example: the traditional red team might execute an attack, like the plasma globe featured in the Hacking Google video, compromising an employee and gaining access to internal systems. From that position, the AI Red Team might then target an ML model and put a backdoor into it.
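To illustrate the kind of backdooring described in this fictional example, the sketch below shows, in a deliberately simplified and hypothetical form, how training data might be poisoned with a trigger pattern so a model learns to emit an attacker-chosen label whenever the trigger appears. The function name, data, and parameters are all invented for illustration and do not describe any real exercise.

```python
# Hypothetical illustration only: planting a backdoor by poisoning training data
# with a fixed trigger pattern, so the trained model associates the trigger with
# an attacker-chosen label. All names and data here are made up.
import numpy as np

def poison_with_trigger(images, labels, target_label, poison_fraction=0.05, rng=None):
    """Return a copy of (images, labels) where a small fraction of samples
    carry a fixed pixel-patch trigger and are relabeled to target_label."""
    rng = rng or np.random.default_rng(0)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_fraction)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    # A crude trigger: a small bright patch in the image corner.
    images[idx, -3:, -3:] = 1.0
    labels[idx] = target_label
    return images, labels

# Toy usage on random stand-in "images" (N x 28 x 28, values in [0, 1]).
imgs = np.random.default_rng(1).random((100, 28, 28))
lbls = np.random.default_rng(2).integers(0, 10, size=100)
poisoned_imgs, poisoned_lbls = poison_with_trigger(imgs, lbls, target_label=7)
```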
Addressing red team findings can often be challenging, with some attacks not having simple fixes. How does Google tackle these challenges?
The security and privacy of our users is always our top priority. If we cannot launch a new feature safely, we don’t launch it, regardless of how cool it might be. Where there is no simple solution to an identified issue, the AI Red Team collaborates closely with internal research teams that are researching new approaches to close these gaps. By sharing engaging stories in the form of attack narratives, the AI Red Team helps drive more visibility and investment in ML safety. Often, classic security mitigations, such as restricting access, input/output validation, and many others, can also significantly reduce the risk.
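As one illustration of how such classic mitigations translate to ML deployments, here is a minimal sketch, assuming a hypothetical text-generation endpoint, of input and output validation wrapped around a model call. The model_generate function, the blocked patterns, and the length limit are placeholders, not Google’s actual controls.

```python
# A minimal sketch, not a real implementation: classic input/output validation
# wrapped around a text-generation model call. model_generate and the policy
# values are placeholders.
import re

BLOCKED_INPUT_PATTERNS = [r"ignore (all|previous) instructions"]  # assumed policy
MAX_INPUT_CHARS = 4000

def model_generate(prompt: str) -> str:
    # Placeholder for a real model call.
    return f"echo: {prompt}"

def safe_generate(prompt: str) -> str:
    """Validate the input, call the model, then validate the output."""
    if len(prompt) > MAX_INPUT_CHARS:
        raise ValueError("prompt too long")
    if any(re.search(p, prompt, re.IGNORECASE) for p in BLOCKED_INPUT_PATTERNS):
        raise ValueError("prompt rejected by input policy")
    output = model_generate(prompt)
    # Output validation: strip anything that looks like markup before returning.
    return re.sub(r"<[^>]+>", "", output)

print(safe_generate("Summarize the quarterly report."))
```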