Q&A: Analyzing the security, compliance and cost benefits of tokenization
Tokenization has been billed as a magic bullet for retail data security, offering strong protection for stored sensitive data and an attractive cost-saving strategy for achieving PCI compliance. The reported benefits are significant enough that other enterprises have begun seriously considering tokenization for their own data security efforts. But does the technology live up to the hype? According to Protegrity’s Chief Technical Officer Ulf Mattsson, tokenization can provide measurable benefits when deployed as part of a risk-based, holistic data security solution, but it is not best suited for every business – in some cases, the expense and time spent retrofitting systems and applications for tokenization may outweigh the benefits.
In this interview, Mattsson examines the positive effects and potential drawbacks of tokenization and outlines the issues that retailers (and enterprises) should consider when weighing whether to deploy tokenization, as well as the system architecture, policies and procedures that should be implemented to get the best out of the technology.
Can you give us a basic definition of tokenization?
Tokenization replaces a sensitive data field with a reference pointer or proxy value – a token – that stands in for the actual data. A token can be thought of as a claim check that an authorized user or system can use to obtain the associated sensitive data, such as a credit card number.
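To make the claim-check analogy concrete, here is a minimal Python sketch of a token vault; the TokenVault class and its method names are hypothetical, not drawn from any particular product. Applications keep only the token, and only the vault can exchange it for the original card number.

```python
import secrets

class TokenVault:
    """Minimal illustration of a token vault: sensitive values go in,
    meaningless proxy values (tokens) come out."""

    def __init__(self):
        self._vault = {}  # token -> original value, kept only inside the secured vault

    def tokenize(self, pan: str) -> str:
        # Generate a random token with the same length as the PAN so existing
        # fields and database schemas do not need to change.
        token = "".join(secrets.choice("0123456789") for _ in pan)
        while token in self._vault:  # avoid the (rare) collision
            token = "".join(secrets.choice("0123456789") for _ in pan)
        self._vault[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        # The "claim check": only callers authorized to reach the vault
        # can exchange a token for the original card number.
        return self._vault[token]

vault = TokenVault()
token = vault.tokenize("4111111111111111")
print(token)                    # what business applications store and pass around
print(vault.detokenize(token))  # retrievable only by authorized systems
```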
In the event of a breach of the database or another business application, could an attacker access the data associated with the tokens?
Only the tokens could be accessed, and they would be of no value to a would-be attacker. With tokenization, the credit card numbers stored in business applications and databases are removed from those systems and placed in a highly secure, centralized token server that can be protected and monitored utilizing robust encryption technology.
What industries would benefit most from tokenization, both from a security and a compliance angle?
I believe that all industries can potentially benefit from this technology. Tokenization is less about innovative technology and more about understanding how to design systems and processes that minimize the risk of retaining data elements with intrinsic (or market) value. By centralizing and tokenizing data, organizations naturally minimize its exposure, and security is immediately strengthened by reducing the number of potential targets for would-be attackers. That said, businesses that collect, process or store payment card data are the most likely to gain measurable benefits from tokenization.
Most of the tokenization packages available today are focused on the point of sale (POS): card data is removed from the process at the earliest point and a token with no value to an attacker is substituted. These approaches are offered by third-party gateway vendors and other service providers, and they can certainly reduce the scope of a PCI review and the risk to card data. An enterprise tokenization strategy also reduces the overall risk that results from many people having access to confidential data, often beyond what business needs can justify. Applied strategically to enterprise applications, tokenization can reduce ongoing confidential data management costs as well as the risk of a security breach, and ease compliance with data protection regulations.
What common misconceptions do enterprises have about tokenization?
There has been some erroneous information published about tokenization – for example, that tokenization is better than encryption because “if you lose access to the encryption keys the data is gone forever.” But this issue exists with both tokenization and encryption, and in both cases it can be managed through proper key management, which would include a secure key recovery process. Both a key server and a token server can crash, and both must have a backup for this reason. The token server also typically uses encryption to protect the data stored there, so the token server can lose a key too. A solid key management solution and process is a critical part of the enterprise’s data protection plan.
I’ve also read articles encouraging businesses not to encrypt data that they plan to tokenize, claiming that encrypted data takes more tokenization space than clear text, and that many forms of sensitive data contain more characters than a 16-digit credit card number, causing storage and manageability problems. But this is untrue if you are using Datatype-Preserving Encryption (DTP), which is not a new format – at Protegrity we have used DTP for more than four years. Telling people not to encrypt because of an issue that is easily addressed denies them a critical layer of security that may be the last, and best, line of defense for sensitive data.
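The storage objection disappears when the ciphertext keeps the same length and character set as the original value. The following is a toy illustration of that idea – a simple Feistel construction over digit strings – and is emphatically not Protegrity’s DTP algorithm nor production-grade format-preserving encryption; it only shows that a 16-digit input can encrypt to a 16-digit output.

```python
import hmac, hashlib

def _prf(key: bytes, rnd: int, half: str, width: int) -> int:
    """Keyed round function: HMAC-SHA256 reduced modulo 10**width."""
    digest = hmac.new(key, f"{rnd}:{half}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (10 ** width)

def dtp_encrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    """Toy datatype-preserving encryption for even-length digit strings:
    the ciphertext has exactly the same length and character set as the input."""
    w = len(digits) // 2
    left, right = digits[:w], digits[w:]
    for rnd in range(rounds):
        left, right = right, str((int(left) + _prf(key, rnd, right, w)) % 10 ** w).zfill(w)
    return left + right

def dtp_decrypt(key: bytes, ciphertext: str, rounds: int = 10) -> str:
    """Inverse of dtp_encrypt: runs the Feistel rounds backwards."""
    w = len(ciphertext) // 2
    left, right = ciphertext[:w], ciphertext[w:]
    for rnd in reversed(range(rounds)):
        left, right = str((int(right) - _prf(key, rnd, left, w)) % 10 ** w).zfill(w), left
    return left + right

key = b"demo key - never hard-code real keys"
ct = dtp_encrypt(key, "4111111111111111")
assert dtp_decrypt(key, ct) == "4111111111111111"
print(ct)  # 16 digits, storable in any field sized for a card number
```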
There are other misconceptions floating around, but these two are the most blatant.
What issues should a business be aware of when considering whether to deploy tokenization?
Tokenization can be very complicated in larger retail environments where card data resides in many places. Card data can be very “dynamic” – used by many different applications and service providers. Applications that need to process the real value of the data would most likely have to be reworked to support tokenization, and other applications may need to be modified. The cost of changing application code – and the risk of touching code that already works – can be hard to justify against the level of risk reduction. The same is true for any organization that uses sensitive data in many different places in its business processes and applications, as making the switch to tokenization will probably require some programming changes.
Retailers should also be aware that card processors may offer end-to-end tokenization efforts as a way to technically “lock in” existing customers and as an attractive way to integrate card data management services with card processing services, drawing new customers by providing a “back end” to go with the “front end” of the POS tokenization offerings.
Merchants who also gather card data via Web commerce, call centers and other channels should ensure that whatever product or service they use can tokenize data across all of those channels. Not all offerings in the market work well or cost-effectively in a multi-channel environment that requires communication with multiple partners, particularly if the token service is outsourced. Merchants need to ensure that their requirements reflect current and near-future channel needs. Another concern is that tokenization is new and relatively unproven, which can pose additional risk compared with mature encryption solutions.
How can a business best mitigate these issues?
A risk management analysis can reveal whether the cost of deploying tokenization in-house is worth the benefits.
One option to consider is outsourcing your credit card information and the tokenizing solution, assuming that outsourcing provides an acceptable level of risk – a larger organization can potentially provide a more secure environment for its sensitive data in-house. An outsourcing arrangement must be carefully reviewed both from a security standpoint and for its ability to provide a reliable service to every globally connected endpoint. Many merchants continue to object to having anyone other than themselves keep their card data. Often, these are leading merchants that have made significant investments in data security, and they simply do not believe that any other company has more motivation (or better technology) than they do to protect their data.
Assuming the company has opted to go ahead, what issues should be considered when planning the deployment of a tokenization solution?
Transparency, availability, performance, scalability and security are common concerns with tokenization, particularly if the service is outsourced. Transparency can be enhanced by selecting a tokenization solution that is well integrated into enterprise systems such as databases. Availability is best addressed by selecting a tokenization solution that runs in-house on a high-availability platform of your choice.
Performance is best addressed by selecting a tokenization solution that runs locally on your high-transaction-volume servers. Scalability is best addressed by selecting a tokenization solution that runs in-house on your high-performance corporate backbone network. Security is best addressed by running the tokenization solution in-house, in a high-security network segment isolated from all other data and applications. Most tokenization requests may not need to access this highly sensitive server if a segmented approach is used. Secure access to the token server must be provided, based on authentication, authorization, an encrypted channel, and monitoring or blocking of suspicious transaction volumes and requests.
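As a sketch of what that last sentence might look like in practice, here is a hypothetical gate a token server could place in front of detokenization requests; the client names, the mtls_verified flag and the dictionary-based vault are illustrative assumptions, not features of any specific product.

```python
AUTHORIZED_DETOKENIZERS = {"settlement-service", "chargeback-service"}  # hypothetical roles

def handle_detokenize(request: dict, vault: dict, audit_log: list) -> str:
    """Checks a token server might apply before returning card data."""
    client = request.get("client_id")
    if not request.get("mtls_verified"):              # request must arrive on an encrypted,
        raise PermissionError("unauthenticated channel")  # mutually authenticated channel
    if client not in AUTHORIZED_DETOKENIZERS:         # least-privilege authorization
        audit_log.append(("denied", client, request["token"]))
        raise PermissionError(f"{client} may not detokenize")
    audit_log.append(("allowed", client, request["token"]))  # every access is logged
    return vault[request["token"]]                    # token -> card number lookup

# Example: the settlement service exchanges a token for the real PAN.
vault = {"7201559340126842": "4111111111111111"}
log: list = []
pan = handle_detokenize(
    {"client_id": "settlement-service", "mtls_verified": True, "token": "7201559340126842"},
    vault, log)
print(pan, log)
```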
Are there any compliance issues that a company needs to be aware of when deploying tokenization?
Some merchants and service providers have refused to consider tokenization because it is mentioned in PCI requirement 3.4 only as “Index tokens and pads,” one of the methods for PAN protection, and is not well defined, unlike dozens of other security technologies. As such, they plan to wait until tokenization is addressed before taking action. Those merchants who are implementing tokenization specifically for PCI often cite PCI DSS 3.1, which says to keep cardholder data storage to a minimum. These merchants argue that tokenization reduces the number of instances of card data through the centralization and elimination of all but one instance of card data. Tokenization does not take the entire POS out of scope: if the POS accepts credit or debit cards, then the POS is in scope. But if properly implemented, tokenization can reduce the PCI scope and make compliance more manageable. Some forms of tokenization can even take entire applications out of scope.
What data security issues should an enterprise keep in mind regarding tokenization?
Tokenization can create a small number of extremely attractive targets for data thieves. Any tokenization server – in-house, but particularly if outsourced – is a single point of failure and an attractive target for talented crackers intent on breaking down its defenses. This treasure trove of data would be equally attractive to privileged insiders.
Also, access to the card data is still available through your POS and other systems, so be careful about how a tokenized system can be attacked. Attackers out there are very determined, and there are fortunes being made from this kind of theft. The sophistication of the botnets and worms constantly attacking the Internet demonstrates that organized crime is hiring very talented people to attack systems. And since stolen data has become a profitable illegal business, we have to assume that criminals aren’t above bribery, extortion and blackmail of highly placed insiders who might have access to the very routines and data you are trying to protect.
How can a company best address these issues?
Along with separation of duties and auditing, a tokenization solution requires a solid encryption and key management system to protect the information in the token server. By combining encryption with tokenization, organizations can gain benefits in terms of security, efficiency and cost savings for some application areas within an enterprise. Companies also need to deploy an in-house foundation of enterprise encryption and key management to lock down data across the enterprise data flow, including transaction logs and the temporary storage used for sensitive data outside the tokenized data flow.
Also, look to holistic solutions that can support end-to-end field encryption, which is an important part of the next generation of protection for the sensitive data flow. In some data flows the best combination is end-to-end field encryption, utilizing format-controlling encryption from the point of acquisition into the central systems, and at that point converting the field to a token for permanent use within the central systems. A mature solution should provide this integration between the encryption and tokenization processes.
Security is best addressed by running the tokenization solution in-house, in a high-security network segment isolated from all other data and applications, if your organization has these resources available. Most tokenization requests may not need access to this highly sensitive server if a segmented and tiered approach is used. For example, such requests may only need access to a less sensitive server that maps a token to a hash value of the credit card number.
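Here is a minimal sketch of that tiered layout, assuming simple dictionary-backed stores for illustration: most callers get only a one-way hash of the card number from the less sensitive tier, and only a tightly controlled path reaches the isolated tier holding the actual PAN. (A real deployment would use a salted or keyed hash, since raw PAN hashes can be brute-forced over the small card-number space.)

```python
import hashlib

# Tier 1 (less sensitive segment): token -> one-way hash of the card number.
# Enough for matching and de-duplication without exposing the PAN.
token_to_hash = {
    "7201559340126842": hashlib.sha256(b"4111111111111111").hexdigest(),
}

# Tier 2 (isolated, high-security segment): token -> actual card number.
token_to_pan = {
    "7201559340126842": "4111111111111111",
}

def lookup(token: str, need_clear_pan: bool = False) -> str:
    """Most callers only need the hash; very few ever touch tier 2."""
    if need_clear_pan:
        return token_to_pan[token]   # requires access to the isolated segment
    return token_to_hash[token]      # served from the less sensitive tier

print(lookup("7201559340126842"))                       # hash, for matching
print(lookup("7201559340126842", need_clear_pan=True))  # PAN, rare and tightly controlled
```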
What policies and processes should a company implement to enhance the security of tokenized data?
A tokenizing routine has to be guarded almost as securely as you would guard the decryption key to account numbers – and an encryption key is even more sensitive. As an example, you may use a hardware security module (HSM) to protect an encryption key, and gaining access to the token server is like gaining access to the HSM, especially if the HSM returns DTP-encrypted data. The difference is that an HSM stores the key and doesn’t store data, while the token server stores data, possibly in clear text.
If an attacker can access the tokenizer from an unknown, unauthenticated or unaudited source, he can use it to perform his own “testing.” Added detection of the frequency and volume of requests to the tokenizer could reveal an abnormal pattern of requests from a certain user or client, but building a secure tokenizing solution that meets both the business requirements and the security requirements can be a complex job.
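A sliding-window request monitor is one simple way to implement that kind of frequency and volume detection. The sketch below, with hypothetical thresholds, flags a client whose request rate exceeds a configured limit so it can be blocked or reported.

```python
import time
from collections import defaultdict, deque

class RequestMonitor:
    """Sliding-window count of tokenizer requests per client; flags clients
    whose request volume looks abnormal."""

    def __init__(self, window_seconds: int = 60, max_requests: int = 100):
        self.window = window_seconds
        self.limit = max_requests
        self.history = defaultdict(deque)  # client_id -> timestamps of recent requests

    def allow(self, client_id: str) -> bool:
        now = time.time()
        q = self.history[client_id]
        while q and now - q[0] > self.window:  # drop requests outside the window
            q.popleft()
        q.append(now)
        return len(q) <= self.limit            # False -> block or raise an alert

monitor = RequestMonitor(window_seconds=60, max_requests=100)
flagged = [i for i in range(150) if not monitor.allow("pos-terminal-17")]
print(f"requests flagged as abnormal: {len(flagged)}")  # 50 of this 150-request burst
```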
Also, a secure token should be randomly assigned or generated, not based on an encryption algorithm, to avoid the need for a distributed key rollover. Some products in the market are breaking this fundamental rule. A distributed key rollover may create issues with referential integrity during the rollover operation, since the token typically does not carry any key indicator. If a solution is using an encryption algorithm, it should be called encryption, not tokenization.
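The referential-integrity point can be illustrated with a short sketch: because a random token has no cryptographic relationship to the card number, rotating the key that protects the vault’s contents is an operation entirely internal to the vault. This example assumes the third-party cryptography package for the Fernet cipher; the token format and vault layout are hypothetical.

```python
import secrets
from cryptography.fernet import Fernet  # pip install cryptography

vault = {}  # token -> encrypted PAN, held only inside the token server
old_key, new_key = Fernet.generate_key(), Fernet.generate_key()

# Issue a random token; it carries no key material and no key indicator.
token = str(secrets.randbelow(10 ** 16)).zfill(16)
vault[token] = Fernet(old_key).encrypt(b"4111111111111111")

# Key rollover is entirely internal to the vault: re-encrypt the stored values...
vault = {t: Fernet(new_key).encrypt(Fernet(old_key).decrypt(ct)) for t, ct in vault.items()}

# ...while every token already issued to applications remains valid and unchanged.
print(Fernet(new_key).decrypt(vault[token]).decode())
```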
How can tokenizing fit into the data lifecycle of an enterprise?
The combined approaches discussed above can be used to protect the whole data lifecycle in an enterprise and also provide high quality production level data in test environments, virtualized servers and outsourced environments.
In many cases there is a need during the development lifecycle to perform high-quality test scenarios on production-quality test data by reversing the data-hiding process. Key data fields that can be used to identify an individual or corporation need to be cleansed to depersonalize the information. Cleansed data needs to be easily restorable (for downstream systems and feeding systems), at least in the early stages of implementation, which requires two-way processing. The restoration process should be limited to situations for which there is no alternative to using production data (interface testing with a third party, or firefighting situations, for example).
Authorization to use this process must be limited and controlled. In some situations, business rules must be maintained during any cleansing operation (addresses for processing, dates of birth for age processing, names for gender distinction). There should also be the ability to set parameters, or to select or identify fields to be scrambled, based on a combination of business rules.
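As a minimal sketch of this kind of reversible cleansing – with an illustrative record layout, hypothetical field names, and only two business rules (preserve the birth year for age processing; restrict restoration to authorized use) – the following shows the basic two-way mechanics:

```python
import secrets

RESTORE_MAP = {}  # proxy value -> original; held only by the controlled restore process

def cleanse_record(record: dict) -> dict:
    """Depersonalize identifying fields while keeping the data usable for testing."""
    cleansed = dict(record)

    # Replace the card number with a random proxy of the same length.
    proxy = "".join(secrets.choice("0123456789") for _ in record["pan"])
    RESTORE_MAP[proxy] = record["pan"]
    cleansed["pan"] = proxy

    # Business rule: keep the birth year so age-driven logic still behaves correctly.
    cleansed["date_of_birth"] = record["date_of_birth"][:4] + "-01-01"

    # Placeholder name; a real solution would substitute a same-gender name.
    cleansed["name"] = "TEST-" + secrets.token_hex(4)
    return cleansed

def restore_pan(proxy: str, authorized: bool) -> str:
    """Restoration is limited to cases with no alternative, e.g. third-party interface tests."""
    if not authorized:
        raise PermissionError("restore requires explicit authorization")
    return RESTORE_MAP[proxy]

production = {"pan": "4111111111111111", "date_of_birth": "1980-06-15", "name": "Jane Smith"}
test_copy = cleanse_record(production)
print(test_copy)
print(restore_pan(test_copy["pan"], authorized=True))
```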
Should a company build their own tokenizing solution?
Developing all the capabilities we’ve discussed in this interview can present significant challenges if a security team seeks to build a solution in-house. To be implemented effectively, all applications that currently house payment data must be integrated with the centralized tokenization server. Developing these interfaces requires a great deal of expertise to ensure performance and availability. Writing an application that is capable of issuing and managing tokens in heterogeneous environments, and that can support multiple field-length requirements, is complex and challenging. Furthermore, ongoing support of this application can be time consuming and difficult. Allocating dedicated resources to such a large undertaking, and covering the responsibilities this staff would otherwise be fulfilling, could present logistical, tactical and budgetary challenges.
For many organizations, locating the in-house expertise to develop such complex capabilities as key management, token management, policy controls, and heterogeneous application integration can be very difficult. Writing code that interfaces with multiple applications, while minimizing the performance impact on those applications, presents an array of challenges. The overhead of maintaining and enhancing a security product of this complexity can ultimately represent a huge resource investment and a distraction from an organization’s core focus and expertise. Security administrators looking to gain the benefits of centralization and tokenization, without having to develop and support their own tokenization server, should look at vendors that offer off-the-shelf solutions.