Why cryptography is much harder than software engineers think
The recent ROCA vulnerability (CVE-2017-15361) raises some important issues about the design of secure cryptographic software. The vulnerability is not an obvious coding error such as a buffer overflow or the use of a poor-quality random number generator.
In this case, it arose from what probably seemed like a reasonable software engineering decision. Understanding it in detail requires some pretty complex mathematics; for that, I refer you to the paper that details the flaw along with the exploit, which you can download here.
In summary, the researchers studied the statistical properties of a large sample of public keys. These are not normally easy to obtain, but the Estonian government had set up a public directory associated with its national ID card. Since, by definition, these are public keys, that’s a perfectly reasonable thing to do. Recall that there is a corresponding private key, which is of course not disclosed. In theory, it’s almost impossible to derive the private key from the public key unless enormous amounts of computer time are expended.
Their analysis showed that the keys were not truly random, as they should be. This meant that it was possible to derive the private key from the public key in days, rather than the expected thousands of years.
Prime suspects
How did these weak keys come to be generated? The issue lies in the RSA algorithm, which sits at the heart of public-key cryptography. Recall that the public and private keys are generated from very large prime numbers. Five is a prime number (it can be divided only by itself and 1). Six is not (it can be divided by 1, 2, 3 and 6).
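To make the terminology concrete, here is a minimal sketch of textbook RSA key generation (my own illustration, not any vendor’s code). The primes used are toy values; real keys are built from randomly chosen primes hundreds of digits long.

```python
# Minimal sketch of textbook RSA key generation, with toy parameters.
# Real implementations use cryptographically secure random primes of
# 1024 bits or more, plus padding and many additional checks.
from math import gcd

def make_keypair(p, q, e=65537):
    """Build a textbook RSA key pair from two primes p and q."""
    n = p * q                    # the public modulus
    phi = (p - 1) * (q - 1)
    assert gcd(e, phi) == 1      # e must be invertible modulo phi
    d = pow(e, -1, phi)          # the private exponent (Python 3.8+)
    return (n, e), (n, d)        # (public key, private key)

# Toy example: both 10007 and 10009 are prime.
public, private = make_keypair(10007, 10009)

# Sanity check: encrypting then decrypting recovers the message.
(n, e), (_, d) = public, private
assert pow(pow(42, e, n), d, n) == 42
```

The security of the scheme rests entirely on the difficulty of recovering p and q from the published modulus n, which is why the choice of primes matters so much.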
Best-practice implementations of RSA require that these primes (which are hundreds of digits long) have certain additional properties, which means that not just ANY large prime number will do.
The tests required to establish the suitability of the primes can be computationally expensive. Consequently, shortcuts are sometimes taken to speed the process up. In the case of the Infineon library associated with the vulnerability, these ‘optimized’ tests favored certain prime numbers whose digits contained patterns that the researchers could pick up through their statistical analysis.
As a result, the primes used to create the public/private key pairs were drawn from a much smaller set: those for which the efficient test could quickly confirm suitability. They were still huge primes, of course, but not truly random ones.
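To see why this matters, here is a toy illustration (emphatically not Infineon’s actual algorithm, with made-up parameters): restricting prime candidates to a fixed arithmetic structure leaves far fewer primes to choose from, even though each one is still a perfectly good prime.

```python
# Toy illustration only, NOT the Infineon generator. It counts how many
# primes survive when candidates are restricted to a single residue
# class modulo a small primorial-style modulus M.

def is_prime(n):
    """Simple trial-division primality test (fine at this toy scale)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True

LIMIT = 200_000
M = 2310        # 2*3*5*7*11, an illustrative modulus
RESIDUE = 1     # an illustrative 'efficient to test' residue class

all_primes = [p for p in range(2, LIMIT) if is_prime(p)]
structured = [p for p in all_primes if p % M == RESIDUE]

print(f"primes below {LIMIT}: {len(all_primes)}")
print(f"of those, primes congruent to {RESIDUE} mod {M}: {len(structured)}")
```

At the scale of real 1024-bit primes the effect is the same in kind: the key pairs are assembled from a drastically reduced candidate set, which is exactly the sort of bias a large-scale statistical analysis can reveal.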
The major key
The researchers could exploit this lack of true randomness. Because they had inferred the patterns within the primes, they could use this knowledge, along with an attack known as ‘Coppersmith’s algorithm’, to efficiently derive the private key from the public key.
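For reference, the general shape of the Coppersmith-style result such attacks build on (the paper applies a variant tailored to the structured primes) is that a good enough approximation of one prime factor lets you recover the whole factorization efficiently:

```latex
% If an approximation \tilde{p} of the secret prime p is accurate to within
% N^{1/4}, the factorization of N = pq can be recovered in time polynomial
% in \log N (Coppersmith's method, via lattice reduction).
\[
  N = p\,q, \qquad p = \tilde{p} + x_0, \qquad |x_0| \le N^{1/4}
  \;\Longrightarrow\; p \text{ (and hence } q\text{) recoverable in time polynomial in } \log N.
\]
```

The structure in the Infineon primes supplies exactly this kind of partial knowledge, reducing the remaining work to a feasible search.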
Infineon is highly unlikely to be alone in taking this kind of shortcut for efficiency’s sake. The researchers examined some other publicly available databases of public keys. Here is what they said about keys generated by Trusted Platform Modules (TPMs), hardware devices used to generate and manage keys securely:
We analyzed a sample of 41 different laptop models equipped with TPM chips…. All chips [from]… 2013 or later were vulnerable, including both TPM 1.2 and TPM 2.0.
TPM devices have very limited computational resources. Consequently, the code they use to generate and test random primes must be heavily optimized. This matters because the keys generated by TPMs are often used to support full disk encryption systems such as Microsoft’s BitLocker. If the keys are weak, an attacker can potentially recover the encrypted data.
The researchers point out that obtaining large-scale databases of public keys is not easy. Consequently, there could be many weak keys out there that have not yet been identified. Recall that Infineon used a particular ‘shortcut’ to optimize the tests for candidate primes. Other vendors may have used different shortcuts that expose their generated keys to a similar weakness.
Without a database of public keys all generated by the same algorithm, it’s hard to tell whether a specific key is weak. The researchers have a test for Infineon-generated keys, but for other vendors, presumably with different algorithmic shortcuts, the test would differ. This may therefore be a more widespread problem.
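For a flavor of what such a test can look like, here is a simplified Python sketch of a fingerprint-style detector in the spirit of the one the researchers released for Infineon keys. The list of small primes and the exact check are illustrative assumptions on my part; the real test and its parameters are in the paper and the accompanying tooling.

```python
# Simplified, illustrative fingerprint check. The idea: structured primes
# leave a detectable signature in the residues of the public modulus
# modulo a set of small primes.

GENERATOR = 65537
SMALL_PRIMES = [11, 13, 17, 19, 37, 53, 61, 71, 73, 79, 97, 103]  # illustrative

def subgroup(g, r):
    """All powers of g modulo the small prime r."""
    members, x = set(), 1
    while True:
        x = (x * g) % r
        if x in members:
            return members
        members.add(x)

def looks_structured(n):
    """True if every residue of n falls inside the subgroup generated by
    65537; unlikely for a modulus built from genuinely random primes."""
    return all(n % r in subgroup(GENERATOR, r) for r in SMALL_PRIMES)
```

A modulus built from genuinely random primes passes all of these subgroup checks only with very small probability, so a key that does pass them all is a strong candidate for having been generated from structured primes.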
Most software engineers are not well-versed in cryptographic design, which requires strength in both mathematics and statistics. In writing the code that introduced the flaw, the engineers probably took an existing implementation and adapted it, based on reasonable considerations of efficiency: they were constrained for resources and made what seemed like a sound decision. Unfortunately, in doing so they introduced a subtle but critical flaw.
The challenge of implementation
It would have been possible to detect this through automated tests against a large set of generated keys. There are established algorithms for testing how truly random a set of supposedly random numbers actually is. Keys produced by this flawed algorithm should have failed such tests.
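As a concrete example of the kind of automated check that is cheap to run, here is a minimal sketch of one standard statistical test, the ‘monobit’ frequency test from NIST SP 800-22, applied to the bits of a generated modulus. Whether this particular test would have flagged the Infineon keys depends on how the bias manifests; the point is that batteries of such tests exist and are easy to automate.

```python
# Minimal sketch: the NIST SP 800-22 monobit (frequency) test applied to
# the bit string of an RSA modulus. A real harness would run many such
# tests over a large corpus of generated keys.
import math

def monobit_p_value(bits: str) -> float:
    """p-value for the hypothesis that a '0'/'1' string came from a
    uniform random source; tiny values indicate non-randomness."""
    n = len(bits)
    s = sum(1 if b == '1' else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

def modulus_bits(n: int) -> str:
    """The raw bit string of an integer modulus."""
    return bin(n)[2:]

# Usage (hypothetical): keys whose p-values are consistently extreme
# suggest the generator is not sampling primes uniformly.
# p_value = monobit_p_value(modulus_bits(some_modulus))
```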
But this vulnerability highlights how hard cryptography is to implement securely. There is no simple test that certifies an implementation as intrinsically secure; instead, security has to be built in by design. Open-source implementations such as OpenSSL have the huge advantage that the code base can be publicly audited by security professionals. Indeed, as far as we know, keys generated by OpenSSL are cryptographically secure. However, closed-platform implementations such as the Infineon TPM code are not publicly auditable.
Note that the researchers did NOT reverse-engineer the Infineon devices. They merely examined the generated public keys for statistical anomalies and from these were able to infer the cryptographic weakness. Had the code been in the public domain, this weakness might well have been discovered through code auditing. So if your security depends on vendor-supplied ‘black boxes’, be very careful. As this incident shows, security through obscurity is no security at all.