Encryption we can trust: Are we there yet?
Encryption is arguably the most important single security tool that we have, but it still has some serious growing up to do. The current debate about the pros and cons of ubiquitous encryption, and the FBI's request that Apple unlock iPhones, reinforces the public notion that encryption is unbreakable, even by a nation state, unless artificially weakened by backdoors.
Everyone in the industry knows this isn't true – there is a difference between strong and weak encryption. Perhaps surprisingly, those differences have almost nothing to do with encryption itself, or at least with the math behind it. Encryption relies on secrets: digital keys that lock and unlock the data. Whether those secrets can be guessed or stolen is what makes all the difference.
The good news is that organizations are getting better at keeping keys secret – it took a while, but we're getting there. The bad news is that guessing keys is getting easier for attackers: as computers get faster, so does guessing.
The challenge is to make only keys that are truly random, perfectly unpredictable – otherwise we make the attacker's job easier, perhaps catastrophically so. Once a key is known, all protection is gone. This challenge is compounded by the fact that billions of keys are made every day. Almost every web connection, every email, every credit card transaction relies on them. It's an issue of scale, not just quality.
So who owns the problem of generating keys? Unfortunately, in most cases, the answer is no-one.
Almost all keys are generated by the operating system, by a piece of software. The trouble is that software only does what it's programmed to do. It doesn't do random things, and when it does, we tend to call it a bug! To produce behavior that is actually random, the operating system scavenges randomness (more properly called entropy) from wherever it can, ideally by sampling some aspect of its physical environment. Entropy can come from many sources, some better (more random) than others. Everything from mouse clicks to radio noise to timing jitter in the hardware can yield entropy.
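To make that concrete, here is a toy sketch of the kind of scavenging an operating system does: sampling timing jitter and folding it into a pool with a hash. It is purely illustrative – real kernels use carefully analysed noise sources and mixing functions, not a loop like this.

```python
# Toy illustration of entropy scavenging: sample timing jitter and fold it
# into a pool with a hash. NOT how a real kernel mixes entropy.
import hashlib
import time

def sample_jitter(rounds=1024):
    """Collect the low-order byte of timing jitter around short busy loops."""
    samples = bytearray()
    for _ in range(rounds):
        t0 = time.perf_counter_ns()
        for _ in range(100):          # tiny busy loop; its duration jitters slightly
            pass
        delta = time.perf_counter_ns() - t0
        samples.append(delta & 0xFF)  # keep only the noisiest low byte
    return bytes(samples)

def mix_into_pool(pool, fresh):
    """Fold freshly gathered samples into the existing pool."""
    return hashlib.sha256(pool + fresh).digest()

pool = b"\x00" * 32
pool = mix_into_pool(pool, sample_jitter())
print(pool.hex())
```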
Operating systems and applications in different situations have access to different hardware capabilities, and exist in different physical environments. Compare the entropy available to an app running on your smart phone to one running in a lights-out datacenter in the cloud. Your phone potentially has a lot of available entropy sources but actually needs relatively few random numbers.
A high-speed web app running across hundreds of VMs in the cloud, by contrast, needs lots of random numbers for thousands of SSL/TLS connections a second, but has access to very little entropy. One of the downsides of virtualization is that it acts as a firewall for entropy: by abstracting the application from the physical world, it cuts off the main supply. Unfortunately, there's not much entropy in the virtual world.
When you stand back, it's hard not to conclude that the state of entropy generation, and therefore key generation, is in most cases today little more than a 'best-effort' activity. Linux illustrates this perfectly. When applications request random numbers from Linux, they typically use one of two sources: /dev/random and /dev/urandom. The difference between the two comes down to assurance, both in terms of quality and availability.
There's a tradeoff to make. Users of /dev/random can be reasonably confident that they will be given good random numbers, but if Linux doesn't believe it has adequate entropy to ensure randomness, /dev/random will block until sufficient entropy has been gathered. /dev/urandom, on the other hand, will always provide random numbers irrespective of how much entropy is available in the system – it never blocks.
The situation is analogous to having two faucets in your kitchen – one that delivers drinking water but may run dry, and one that always delivers water, but water that might not be drinkable. The second faucet might be fine for taking a bath but not for quenching your thirst. Unfortunately, application developers can't afford to be so fussy. Time is money and applications need to keep running. They can't afford to hang around waiting for entropy to be gathered – they do the best they can, but in the end they have to get on with their job. Developers nearly always use /dev/urandom – and hope for the best.
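Here are the two faucets side by side in a minimal Python sketch for a Linux system. Opening /dev/random in non-blocking mode lets us observe "not enough entropy" instead of hanging; note that on recent kernels (5.6 and later) the two devices behave almost identically once the pool is initialised, but the contrast below is the tradeoff developers have historically faced.

```python
# Minimal Linux-only sketch of the two "faucets": /dev/urandom never blocks,
# while /dev/random (on older kernels) may refuse when its entropy estimate is low.
import os

def read_urandom(n=16):
    with open("/dev/urandom", "rb") as f:
        return f.read(n)              # always returns immediately

def read_random_nonblocking(n=16):
    # Open non-blocking so a low entropy estimate shows up as an error, not a hang.
    fd = os.open("/dev/random", os.O_RDONLY | os.O_NONBLOCK)
    try:
        return os.read(fd, n)
    except BlockingIOError:
        return None                   # kernel judged its entropy estimate too low
    finally:
        os.close(fd)

print("urandom:", read_urandom().hex())
r = read_random_nonblocking()
print("random :", r.hex() if r else "would block - entropy pool not ready")
```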
This approach might be OK if we could somehow retrospectively inspect the keys we make and throw away the weak ones, a kind of quality control for keys. It's a nice idea, but proving that a number is truly random just by testing it is impossible. The only way to be sure a number is truly random is to control the whole process by which it was generated in the first place. That requires a system-wide perspective which goes right back to the original sources of entropy.
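A quick illustration of why testing alone can't certify randomness: even a basic statistical check – a simplified monobit (frequency) test, loosely in the spirit of standard test suites but not a real implementation of them – is happily passed by a completely predictable byte counter.

```python
# Why testing can't prove randomness: a predictable sequence passes a
# simple frequency check just as easily as genuinely random bytes do.
import os

def monobit_balanced(data, tolerance=0.01):
    """Pass if the proportion of 1 bits is close to 50% - necessary for
    randomness, but nowhere near sufficient."""
    ones = sum(bin(b).count("1") for b in data)
    total = len(data) * 8
    return abs(ones / total - 0.5) < tolerance

truly_random = os.urandom(4096)
predictable  = bytes(i & 0xFF for i in range(4096))   # a plain counter

print(monobit_balanced(truly_random))   # almost certainly True
print(monobit_balanced(predictable))    # also True - yet trivially guessable
```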
Somehow we need to bring quality and consistency to the process. Ensuring that any one device has good access to entropy is important, but the bigger challenge is to ensure that all devices have equally good access – every VM, every mobile app, every IoT device. Security is about consistency – instances of weakness become points of attack.
There’s a good argument that entropy should no longer be a localized best-effort. It should be a networked utility. Think about time. We used to configure computers, mobile phones, even VCRs to have the right time, just like our alarm clocks. Time was a localized, best-effort setting. In today’s digital world that no longer makes sense. Today, time is a networked service. The systems we rely on today magically know the right time, all the time, all on their own. Entropy and random number generation could well be heading the same way. All systems would be able to generate equally good random numbers because they would all have an equally good supply of entropy – delivered over the network as a background service.
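What might a client of such an entropy utility look like? The sketch below is entirely hypothetical – the service URL is invented for illustration – but it shows the design point that matters: remote entropy should be mixed with whatever local entropy exists, never used on its own, so a poor or compromised feed can't leave you worse off than the local source alone.

```python
# Hypothetical "entropy as a utility" client. The endpoint is made up;
# the point is that remote entropy is mixed with local entropy, never
# trusted by itself.
import hashlib
import os
import urllib.request

ENTROPY_SERVICE_URL = "https://entropy.example.com/seed"  # hypothetical service

def fetch_remote_entropy(n=32):
    with urllib.request.urlopen(ENTROPY_SERVICE_URL, timeout=5) as resp:
        return resp.read(n)

def seed_material():
    local = os.urandom(32)            # whatever the local platform can provide
    try:
        remote = fetch_remote_entropy()
    except OSError:
        remote = b""                  # degrade gracefully to local-only
    return hashlib.sha256(local + remote).digest()

print(seed_material().hex())
```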
It will soon be hard to find an application that doesn’t need random numbers and most will need crypto-strength, true random numbers. Entropy sourcing and random number generation shouldn’t be left to individual devices and VMs to do the best they can. The quality of random number generation should be independent of the local platform and environment.
Datacenters are utilitarian places, where utilities such as power, air conditioning, and even physical access controls are provided as standard. The race is now on to provide other security services as standardized utilities, and entropy and random number generation should be part of that mix. Poor random number generation is a basic security hygiene issue, and it should be addressed through a network-based utility service, as a standard of due care.