Excessive agency in LLMs: The growing risk of unchecked autonomy

For an AI agent to “think” and act autonomously, it must be granted agency; that is, it must be allowed to integrate with other systems, read and analyze data, and have permissions to execute commands. However, as these systems gain deep access to information systems, a growing concern is mounting about their excessive agency – the security risk of entrusting these tools with so much power, access, and information.

Say that an LLM is granted access to a CRM database that stores sensitive client data (names, contact info, purchase history, etc.). What if it allows users to not only access their own customer record but also permits them to access and delete the entries of other users? Thus, excessive agency refers to scenarios where the LLM executes unauthorized commands, makes unintended information disclosures, and interacts with other systems beyond its defined scope, guidelines, and parameters.

The root causes of excessive agency in LLMs

Excessive agency is a type of vulnerability that occurs when an LLM is granted.

Excessive functionality: An LLM agent has access to functions, APIs, or plugins beyond its original or intended scope. For example, an LLM integrated into a smart home system can not only turn lights on or off but also disable alarm systems, security cameras, and lock or unlock doors.

Excessive permissions: An LLM agent has access to more permissions than necessary. For example, an email assistant that can read, write, or delete emails also has access to instant messages and sensitive files (spreadsheets, company records) on the user’s drive.

Excessive autonomy: An LLM agent that acts unpredictably or beyond its operational and ethical guardrails to achieve its objective. For example, an LLM agent that manages social media misinterprets a user’s question and shares sensitive information or makes an inappropriate response, which leads to data leakage or reputation damage.

Top risks of excessive agency

When LLM agents are granted excessive agency, they can compromise the core principles of security:

Confidentiality: An LLM retrieves confidential information from a database and exposes it to an unauthorized user.

Integrity: An LLM with excessive autonomy or functionality performs an unauthorized action due to ambiguous, manipulated, or adversarial inputs.

Availability: An LLM with excessive agency is compromised or exploited by attackers, disabling networks and overloading servers, causing significant disruption and downtime.

How threat actors exploit and abuse excessive agency in LLMs

Threat actors use different techniques to exploit excessive agency in LLMs:

  • Direct prompt injection: Attackers give malicious instructions and trick LLMs into executing harmful commands or disclosing sensitive data
  • Indirect prompt injection: Attackers embed harmful instructions into an external source like a website or document that the LLM has access to
  • Privilege escalation: An LLM is tricked into granting a higher-level access by the attacker
  • Model manipulation: Attackers poison the LLM model or inject biases or vulnerabilities, triggering malicious behavior
  • Data exfiltration: Attackers craft prompts that manipulate LLMs into exposing sensitive data.

How can organizations mitigate excessive agency?

By implementing the security strategies mentioned below organizations can reduce the likelihood of excessive agency attacks, abuse, or exploitation by threat actors:

  • Incorporate ethical guardrails: Establish an AI code of conduct to ensure AI actions are aligned with organizational policies.
  • Limit LLM agency: Enforce strict boundaries around what LLMs can and cannot do. Any kind of agency must be granted with extreme caution.
  • Validate and sanitize inputs: Scrutinize and sanitize all inputs using filters, block lists, and pre-defined rules.
  • Incorporate human-in-the-loop: Require a human review or approval for high-risk actions.
  • Granular access controls: Restrict the model from interacting with other systems unless it is explicitly authorized.
  • Monitor LLM behavior continuously: Use monitoring tools to track LLM behavior and trigger alerts when suspicious activities or anomalies are detected.
  • Implement mediation: Instead of depending on LLM agents to decide whether an action is permitted or not, implement authorization checks (all requests are validated against security policies) in downstream systems.
  • Apply rate limiting: Apply a cap on the number of actions an LLM can make within a stipulated time frame.
  • Validate LLM security: Run penetration tests and red teaming exercises to identify loopholes and vulnerabilities proactively and to validate the performance of existing LLM safety standards.

Excessive agency in autonomous LLMs poses significant risks for organizations. It is essential that businesses adapt their security approaches to mitigate the many risks presented by these next-gen AI systems.

Don't miss