One in ten GenAI prompts puts sensitive data at risk

Despite their potential, many organizations hesitate to fully adopt GenAI tools due to concerns about sensitive data being inadvertently shared and possibly used to train these systems, according to Harmonic.


Sensitive data exposure in GenAI prompts

A new study, based on tens of thousands of prompts from business users, reveals that nearly one in ten potentially discloses sensitive data.

Harmonic Security analyzed the prompts during Q4 2024, monitoring the use of GenAI tools including Microsoft Copilot, OpenAI ChatGPT, Google Gemini, Anthropic’s Claude, and Perplexity.

In the vast majority of cases, employee behavior when using GenAI tools is straightforward. Users commonly ask to summarize a piece of text, edit a blog, or write documentation for code. However, 8.5% of prompts were a cause for concern, putting sensitive information at risk.

Of this number, 45.8% of prompts potentially disclosed customer data, such as billing information and authentication data. A further 26.8% contained information on employees, including payroll data, PII, and employment records. Some prompts even asked GenAI to conduct employee performance reviews.

Of the remainder, legal and finance data accounted for 14.9%. This included sales pipeline data, investment portfolios, and M&A activity. Security-related information, comprising 6.9% of sensitive prompts, is particularly concerning.

Examples include penetration test results, network configurations, and incident reports. Such data could provide attackers with a blueprint for exploiting vulnerabilities. Finally, sensitive code, such as access keys and proprietary source code, constituted the remaining 5.6% of sensitive prompts potentially disclosed.

Free GenAI services pose security threat

Also of concern is the number of employees using the free tiers of GenAI services that typically don’t have the security features that ship with enterprise versions. Many free-tier tools explicitly state they train on customer data, meaning sensitive information entered could be used to improve models.

Of the GenAI models assessed, 63.8% of ChatGPT users were on the free tier, compared with 58.6% of Gemini users, 75% of Claude users, and 50.5% of Perplexity users.

“Most GenAI use is mundane, but the 8.5% of prompts we analyzed potentially put sensitive personal and company information at risk. In most cases, organizations were able to manage this data leakage by blocking the request or warning the user about what they were about to do. But not all firms have this capability yet. The high number of free subscriptions is also a concern; the saying that ‘if the product is free, then you are the product’ applies here, and despite the best efforts of the companies behind GenAI tools, there is a risk of data disclosure,” said Alastair Paterson, CEO at Harmonic Security.

Organizations must move beyond “block” strategies to manage GenAI risks effectively.
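As context for the block-or-warn capability Paterson describes, the sketch below shows one rough way such a prompt gate could work. The pattern set, pattern names, and the warn/block policy are illustrative assumptions, not Harmonic Security's implementation: outbound prompts are matched against known secret and PII formats before they reach a GenAI tool.

```python
import re

# Hypothetical prompt-gating sketch: scan an outbound GenAI prompt for
# sensitive patterns, then decide to allow, warn, or block.
# Pattern names and the policy below are assumptions for illustration only.
SENSITIVE_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def gate_prompt(prompt: str) -> tuple[str, list[str]]:
    """Return ("allow" | "warn" | "block", list of matched pattern names)."""
    hits = [name for name, rx in SENSITIVE_PATTERNS.items() if rx.search(prompt)]
    if not hits:
        return "allow", hits
    # Block hard secrets outright; warn the user for softer PII matches.
    if "aws_access_key" in hits:
        return "block", hits
    return "warn", hits

if __name__ == "__main__":
    action, hits = gate_prompt("Summarize this config: AKIAIOSFODNN7EXAMPLE")
    print(action, hits)  # -> block ['aws_access_key']
```

A warn action keeps mundane use frictionless while nudging users away from pasting sensitive data, whereas a hard block is reserved for high-confidence matches such as cloud access keys.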
