The art and science of password hashing
The recent Flipboard breach shines a spotlight again on password security and the need for organizations to be more vigilant. Password storage is a critical area where companies must take steps to ensure they don’t leave themselves and their customer data vulnerable.
Storing passwords in plaintext is recognized as a major cybersecurity blunder. Despite this, many companies, including Facebook and Google, have committed this faux pas. When hackers gain access to a plaintext password database, they have access to every user account in that system. Because people often reuse passwords, the damage can also spread to other organizations in a breach domino effect.
Why password hashing is essential
Password hashing, where companies encode passwords using a mathematical algorithm, has long been touted as the answer to this problem. Hashing is a one-way cryptographic transformation on a password, turning it into another string, called the hashed password.
When a user chooses a new password, the password is passed through a chosen hash algorithm that performs a mathematical transformation on it, creating a hash value. This hash value is typically represented in hexadecimal format.
This hash is the only thing that is stored for the user’s password. Since the hash algorithm only works in one direction, it’s infeasible to back out the original password using just the hash value (there are other ways to deduce the original password from the hash, but more on that in a minute).
The general idea is that storing hashes rather than plaintext passwords significantly reduces the possibility that a hacker could retrieve all of the passwords in the system, even if they gain access to the database.
Later, when the user logs in and we must verify that they entered the correct password, the same process is performed again: the entered password is hashed using the same algorithm and the result is compared to the stored hash. If they match, the user is allowed access.
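The store-then-compare flow described above can be sketched in a few lines of Python. This is a minimal illustration only: the in-memory dictionary and the function names are hypothetical, and plain unsalted SHA-256 is used just to show the mechanics, not as a recommended way to store passwords.

```python
import hashlib
import hmac

# Hypothetical in-memory "database": username -> stored hex digest.
stored_hashes = {}

def hash_password(password: str) -> str:
    """Return the SHA-256 digest of the password as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    # Only the hash is stored; the plaintext password is never kept.
    stored_hashes[username] = hash_password(password)

def verify(username: str, password: str) -> bool:
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # Hash the login attempt the same way and compare in constant time.
    return hmac.compare_digest(stored, hash_password(password))

register("alice", "correct horse battery staple")
print(verify("alice", "correct horse battery staple"))  # True
print(verify("alice", "wrong guess"))                   # False
```

Note that the server never needs the original password again after registration; it only ever compares hashes.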
It’s critical to understand the different approaches to password hashing, as not all hashing algorithms are created equal.
Hashing 101
Hashing algorithms take an input of any length and return an output of fixed length. This output will look nothing like the original password. While it may seem like the algorithm is pumping out a random number, it is actually a deterministic process: a fixed mathematical procedure computes the output from the input, so the same input always produces the same output. Hackers cannot directly turn a hashed value back into the password, but they can work out what the password is by generating hashes from candidate passwords until they find one that matches. This is referred to as a brute-force attack.
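A short Python sketch can make these properties concrete. It assumes SHA-256 as the hash and uses a tiny hypothetical wordlist; a real attacker would try a far larger set of candidates.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Fixed-length output: 64 hex characters (256 bits) whatever the input length.
print(len(sha256_hex("a")), len(sha256_hex("a" * 10_000)))  # 64 64

# Deterministic: the same input always gives the same hash.
print(sha256_hex("qwerty") == sha256_hex("qwerty"))  # True

def guess_password(target_hash, candidates):
    """Hash each candidate until one matches the target hash."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None

# Pretend this hash came from a breached database.
leaked_hash = sha256_hex("qwerty")
wordlist = ["123456", "Password1", "qwerty", "letmein"]  # hypothetical list
print(guess_password(leaked_hash, wordlist))  # qwerty
```

The attacker never reverses the hash; they simply repeat the forward computation until the outputs line up.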
With enough time and computing power, a hacker could recover a password this way. Much of the work can also be done in advance: attackers use precomputed tables, such as rainbow tables, that map common passwords such as “Password1” or “qwerty” to their hashes. This works because, without any additional protection, the same password produces the same hash every time. This is where salting comes in.
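To protect passwords further, a string of random characters, called a salt, is added to each user’s password before it is hashed (for example, appended to the end), so the same password produces a completely different hash. The salt is stored alongside the hash so that the same computation can be repeated when the user logs in.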
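As a rough sketch of salting, the Python below appends a random 16-byte salt to the password before hashing. The function names and the salt length are assumptions made for illustration, and SHA-256 is used only to keep the example short.

```python
import hashlib
import hmac
import secrets

def hash_with_salt(password: str, salt: bytes) -> str:
    # The salt is appended to the password bytes before hashing.
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()

def new_record(password: str):
    """Create a fresh random salt and return (salt, hash) for storage."""
    salt = secrets.token_bytes(16)
    return salt, hash_with_salt(password, salt)

def check(password: str, salt: bytes, stored_hash: str) -> bool:
    # Recompute with the stored salt and compare in constant time.
    return hmac.compare_digest(stored_hash, hash_with_salt(password, salt))

# Two users pick the same password but get different salts.
salt_a, hash_a = new_record("Password1")
salt_b, hash_b = new_record("Password1")
print(hash_a == hash_b)                     # False, because the salts differ
print(check("Password1", salt_a, hash_a))   # True
```

Because every record has its own salt, a table of precomputed hashes for common passwords no longer matches anything in the database.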
Hashing algorithms
SHA-256 – With cryptographic hashing algorithms, similar inputs produce vastly different outputs. SHA-256 produces an entirely different hashed output even if only one character of the input is changed. This makes it much more difficult for hackers to reverse engineer the input values from the output values. SHA-256 is also the hashing algorithm used by the Bitcoin cryptocurrency. Like MD5 and SHA-1 below, however, it is designed to be fast, so on its own it is not well suited to storing passwords.
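The effect of a one-character change can be checked directly. The sketch below, using two arbitrary example inputs, prints both SHA-256 digests and counts how many of the 256 output bits differ between them.

```python
import hashlib

def sha256_int(text: str) -> int:
    """Return the SHA-256 digest of the text as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")

def differing_bits(a: str, b: str) -> int:
    # XOR the two digests and count the bits that are set.
    return bin(sha256_int(a) ^ sha256_int(b)).count("1")

print(hashlib.sha256(b"password1").hexdigest())
print(hashlib.sha256(b"password2").hexdigest())
print(differing_bits("password1", "password2"), "of 256 bits differ")
```

For a well-behaved hash, roughly half of the output bits are expected to flip for any change in the input, however small.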
MD5 (Message Digest Algorithm) – MD5 is a cryptographic algorithm that always produces an output of 128 bits (typically expressed as a 32-digit hexadecimal number) no matter the length of the input. It was one of the most widely used hashing algorithms but is no longer recommended. MD5 is not collision resistant, meaning it’s feasible to find different inputs that produce the same hash, which makes it a poor cryptographic hashing function.
MD5’s downfall when it comes to passwords was that it was too fast and too popular. Because it is so fast, an attacker can test a very large number of candidate passwords per second, so brute force attacks are more likely to succeed, and the popularity of the function makes it an attractive target. Today you can often find the input to an MD5 hash of a common password in seconds by Googling it. Since many businesses already use MD5, they have taken to adding a salt to it, creating a salted MD5 output.
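To get a feel for how fast MD5 is, the following sketch times a batch of guesses on a single CPU core in plain Python. The function name and batch size are arbitrary choices, and the figure it prints depends entirely on the machine; dedicated cracking hardware is far faster than this.

```python
import hashlib
import time

def md5_guesses_per_second(n_guesses: int = 200_000) -> float:
    """Hash n_guesses candidate strings with MD5 and return the rate."""
    start = time.perf_counter()
    for i in range(n_guesses):
        hashlib.md5(f"guess{i}".encode("utf-8")).hexdigest()
    elapsed = time.perf_counter() - start
    return n_guesses / elapsed

rate = md5_guesses_per_second()
print(f"about {rate:,.0f} MD5 guesses per second on this machine")
```

Even this unoptimized loop gets through its whole batch very quickly, which is exactly the property that makes a fast hash a poor choice for passwords.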
MD5Crypt – MD5Crypt added extra functionality to MD5 to make it more resistant to brute force attacks. However, in 2012, the author of MD5Crypt, Poul-Henning Kamp, declared it insecure due to the speed of modern hardware.
SHA-1 – SHA-1 suffers from many of the same problems as MD5: it is very fast, it has been broken by collision attacks, and it is now considered unsafe. Faster computations result in faster brute force attacks, making SHA-1 a poor choice for storing passwords.
Bcrypt – Unlike SHA-1 and MD5, bcrypt is intentionally slow, which is a good thing when it comes to password security because it limits the attacker’s ability to perform successful brute force attacks. A key aspect of hashing is that it should be a one-way function: it should be easy to go from the input to the output, but infeasible to find the input from the output. A deliberately slow hashing function makes the hashes much harder to crack because every guess is time-consuming and uses a lot of computing power.
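The idea of a deliberately slow, tunable hash can be sketched in Python. Bcrypt itself is not part of Python’s standard library, so this example substitutes PBKDF2-HMAC-SHA256 from hashlib as a stand-in that also has an adjustable cost; it is not bcrypt, and the iteration count shown is a hypothetical setting rather than a recommendation.

```python
import hashlib
import hmac
import secrets
import time

ITERATIONS = 200_000  # hypothetical cost setting; higher means slower guesses

def slow_hash(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    # PBKDF2 repeats the underlying hash many times to make each guess costly.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def verify(password: str, salt: bytes, expected: bytes,
           iterations: int = ITERATIONS) -> bool:
    return hmac.compare_digest(expected, slow_hash(password, salt, iterations))

salt = secrets.token_bytes(16)
start = time.perf_counter()
stored = slow_hash("Password1", salt)
elapsed = time.perf_counter() - start

print(f"one hash took {elapsed * 1000:.1f} ms at {ITERATIONS} iterations")
print(verify("Password1", salt, stored))  # True
print(verify("password1", salt, stored))  # False
```

A legitimate login pays this cost once, while an attacker pays it for every single guess, which is the trade-off a slow hash is designed to exploit.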
Companies must remain vigilant so that their customer data is not vulnerable. With hashing, there are many different options available; however, it’s vital to recognize that not all hashing algorithms are equal. Some can be cracked with very little time and effort, while others require far more time and computing power to crack.
Hashing is a critical component of password security, but it requires a nuanced approach to protect customer data. Organizations must ensure that their password hashing strategy uses robust, modern algorithms that make it almost impossible for hackers to recover the plaintext passwords from the hashes. By taking a proactive approach, companies can reduce the risk of breaches and of hackers gaining access to valuable customer data.