Both encryption and hashing are fundamental building blocks of cryptosystems. When it comes to storing credentials in your application, however, best practice is largely driven by what you’re trying to do. There are a lot of well-meaning security professionals who adopt an extremely dogmatic stance: “Encrypting passwords is bad! You must hash them.” This is usually true, except when it’s not. We do not need a generation of security practitioners inflexibly making demands from the ivory tower. We need professionals who understand these concepts and are able to make prudent recommendations.
- What is Encryption?
- What is Hashing?
- What about client-side hashing?
- Salting and Peppering
- Other Scenarios
What is Encryption?
Encryption is the act of transforming “plaintext” (your original message) into “ciphertext” (a transformation of your original message which can only be decoded by the intended recipient). There are two primary kinds of encryption: symmetric and asymmetric. Symmetric encryption is wicked fast, but utilizes a shared “key” which must be exchanged beforehand by the parties involved. Additionally, symmetric encryption can’t uniquely distinguish between the parties conversing, since they all share the same key! Asymmetric encryption, on the other hand, relies upon a fundamental relationship between two keys per party: a “private key” which is known only to the person who generated it, and a “public key” which is intrinsically and mathematically linked to that particular private key. What is encrypted with the public key can be decrypted only by the private key, and vice versa! (It is this inversion of asymmetric cryptography which is used together with hashing in order to create digital signatures.)
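To make the two flavors concrete, here is a toy sketch in Python. Everything in it is an assumption chosen for readability rather than safety: the symmetric half is a one-time-pad XOR, and the asymmetric half is textbook RSA with tiny primes (p = 61, q = 53, e = 17) and no padding. It only illustrates the key relationships described above; a real application should reach for a vetted cryptography library instead.

```python
import secrets

# --- Symmetric: one shared key both encrypts and decrypts (toy one-time pad) ---
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (key must match data length)."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
shared_key = secrets.token_bytes(len(message))  # must be exchanged beforehand
ciphertext = xor_bytes(message, shared_key)
recovered = xor_bytes(ciphertext, shared_key)   # the very same key reverses it

# --- Asymmetric: textbook RSA with tiny primes (illustration only) ---
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent: (n, e) is the public key
d = pow(e, -1, phi)        # private exponent: (n, d) is the private key (Python 3.8+)

m = 65                     # a message already encoded as a number smaller than n
c = pow(m, e, n)           # anyone can encrypt with the public key
assert pow(c, d, n) == m   # only the private key holder can decrypt

# The inversion: transform with the private key, check with the public key.
signature = pow(m, d, n)
assert pow(signature, e, n) == m
```

Note how the symmetric half has a single secret that every party must hold, while the asymmetric half never requires the private exponent to leave its owner.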
What is Hashing?
Hashing is the act of running a message through a “one-way function”. A one-way function is any process whose output is easy to compute, but for which it is computationally difficult to recover the original input from the output alone. We bend this fundamental property to our will in order to build complex cryptosystems. Note that one-way functions do not necessarily produce a unique output. (Two inputs with the same output are referred to as a collision, and the ability to reliably generate them makes a particular hashing algorithm unsuitable for cryptographic purposes. This is the fundamental reason why SHA-1 and MD5 are considered “broken” today.)
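A short sketch with Python’s standard `hashlib` shows these properties. SHA-256 is used here purely as an example algorithm, and the helper names (`sha256_hex`, `find_truncated_collision`) are made up for this illustration. Because collisions in full SHA-256 are not practical to find, the second half deliberately truncates the digest to one byte so that a collision is guaranteed to turn up quickly.

```python
import hashlib

def sha256_hex(message: bytes) -> str:
    """Easy direction: computing the digest of any input is cheap."""
    return hashlib.sha256(message).hexdigest()

digest_a = sha256_hex(b"correct horse battery staple")
digest_b = sha256_hex(b"correct horse battery stapl3")  # one character changed

def find_truncated_collision(n_bytes: int = 1):
    """Find two different inputs whose digests share the first n_bytes.

    With n_bytes=1 there are only 256 possible outputs, so among 257
    distinct inputs at least two must collide.
    """
    seen = {}
    counter = 0
    while True:
        candidate = str(counter).encode()
        tag = hashlib.sha256(candidate).digest()[:n_bytes]
        if tag in seen:
            return seen[tag], candidate
        seen[tag] = candidate
        counter += 1

first, second = find_truncated_collision()
```

The digest is always the same length no matter how long the input is, which is exactly why outputs cannot all be unique. What matters is that nobody can produce collisions on demand for the full-length hash.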
When authenticating users to your application, the old wisdom is undeniably correct. Storing users’ passwords in plaintext in your database is a recipe for disaster. Should your database be compromised, a user’s password could be used to authenticate as that user on any service where the password is re-used. While at first blush it may seem that the answer is to encrypt the user’s password and then decrypt it prior to comparing it against what the user sends, this too is fraught with danger. Your web application has access to some decryption key, so while attacks based on, say, SQL injection would be thwarted by this attempt, an attacker who has exploited a buffer overflow and can execute arbitrary code in the context of your application could simply exfiltrate both the database records from your database server and the decryption key from your server’s file system! With hashing, the application instead hashes whatever the user submits at login and compares the result against the stored hash. This allows your user’s plaintext password to exist only in main memory (RAM) and never make its way to persistent storage.
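The flow described above might look something like the following sketch. The function names, the in-memory `user_table`, and the iteration count are all assumptions for illustration, and PBKDF2 from the standard library is just one reasonable choice of slow password hash. The per-user salt it uses is explained under Salting and Peppering.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumed work factor; tune it for your own hardware

# Hypothetical stand-in for a users table: username -> (salt, password_hash)
user_table = {}

def _derive(password: str, salt: bytes) -> bytes:
    """Slow, salted, one-way transformation of the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def register(username: str, password: str) -> None:
    salt = secrets.token_bytes(16)
    # Only the salt and the derived hash are persisted, never the password.
    user_table[username] = (salt, _derive(password, salt))

def verify(username: str, password: str) -> bool:
    record = user_table.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    # Re-hash the submitted password and compare in constant time.
    return hmac.compare_digest(stored_hash, _derive(password, salt))

register("alice", "hunter2")
```

Nothing in `user_table` can be decrypted, because there is no key: the only way to check a guess is to run it through the same one-way function.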
What about client-side hashing?
“Wait a second”, astute readers will ask themselves, likely scratching their heads. “Why send the password at all? Couldn’t the client be responsible for hashing the password instead? Then it wouldn’t even need to live in main memory on the server!” At this point though, the hashed value of the plaintext password becomes the plaintext password from the view of the server! For the server to store that value unaltered is no better than storing plaintext passwords. For the server to hash it again simply recreates the original scheme, with the client-side step adding nothing.
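The problem is easy to see in a small simulation. In this sketch, `client_hash`, `stored`, and `server_login` are hypothetical names, and the server is assumed to keep the client-supplied hash exactly as received.

```python
import hashlib
import hmac

def client_hash(password: str) -> bytes:
    """What a hypothetical client-side script would send instead of the password."""
    return hashlib.sha256(password.encode("utf-8")).digest()

# The server stores whatever the client sent, unaltered.
stored = {"alice": client_hash("hunter2")}

def server_login(username: str, submitted: bytes) -> bool:
    expected = stored.get(username)
    return expected is not None and hmac.compare_digest(expected, submitted)

# A legitimate login: the client hashes, the server compares.
legitimate = server_login("alice", client_hash("hunter2"))

# An attacker who dumped `stored` never learns "hunter2", and does not need to:
# replaying the stolen value is enough to authenticate.
stolen = stored["alice"]
attacker_logged_in = server_login("alice", stolen)
```

From the server’s point of view, the stolen value is the credential, which is exactly the situation hashing was supposed to prevent.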
Salting and Peppering
If web applications only had a single user, the conversation would be over. However, that is not the world in which we live. On a multi-user system which hashes passwords, it turns out that even password hashes can provide value to an attacker who has exfiltrated a copy of the database table containing those hashes. If two hash values match, it is almost a certitude (except in the case of a collision!) that the users have the same password. This would make that hash a high-value target for comparing against “rainbow tables”. “Rainbow tables” are an example of the classical “time-space” tradeoff playing out in the real world. Rainbow tables consist of “pre-computation” that makes reversing a computed hash a much easier affair, but only in the cases where the rainbow table considered the original input! A password salt works by concatenating a random string (usually stored along with the password hash) to the end of the user-supplied password before hashing. This ensures that two users with the same password still end up with different hashes, and that a table precomputed without the salt is useless! “Peppering” your hash works much the same way, except the value is treated as a separate secret, and ideally stored in hardware such as an HSM.
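The effect of a salt and a pepper can be sketched in a few lines. This is an illustration under stated assumptions: plain SHA-256 is used only to keep the example short (a real password store would use a deliberately slow function), applying the pepper as an HMAC key is one common construction rather than the only one, and the `PEPPER` generated here stands in for a secret that would really be loaded from an HSM or similar.

```python
import hashlib
import hmac
import secrets

def unsalted(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def salted(password: str, salt: bytes) -> str:
    # The salt is appended to the end of the password before hashing.
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()

def peppered(password: str, salt: bytes, pepper: bytes) -> str:
    # Assumed construction: the pepper keys an HMAC over the salted hash.
    inner = hashlib.sha256(password.encode("utf-8") + salt).digest()
    return hmac.new(pepper, inner, hashlib.sha256).hexdigest()

# Each user gets a random salt, stored next to their hash.
alice_salt = secrets.token_bytes(16)
bob_salt = secrets.token_bytes(16)

# Placeholder: in practice this secret lives outside the database.
PEPPER = secrets.token_bytes(32)

# Alice and Bob both chose the same password.
same_without_salt = unsalted("hunter2") == unsalted("hunter2")
same_with_salt = salted("hunter2", alice_salt) == salted("hunter2", bob_salt)
```

Without a salt the two rows are identical and a single table lookup cracks both. With a salt they differ, and with a pepper the database dump alone is not even enough to start guessing.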
Other Scenarios
What happens, though, if you’re NOT authenticating a user against your service? Many applications must handle application secrets in some manner. Some of these may be user-generated, some of them may be used for the application itself to authenticate to other resources, etc. Take the case of a system acting as a password manager. It would not be suitable to hash passwords, because hashing is not a reversible operation! The system could not accomplish its sole purpose. In other cases, a system may need to authenticate to a third party in the context of a user. While an argument could be made that ideally all systems would participate in federated authentication through a secure protocol like SAML, WS-Federation or OAuth, which are made to accommodate this use case, this is not always feasible. A popular example of a service which requires a user’s password to another service is Mint.com, a popular personal finance management tool. Mint does take advantage of federated access with financial institutions which support it, but for those that don’t, there is no other option to provide the service than storing users’ passwords in a reversibly encrypted manner. A particularly dogmatic practitioner might insist that this is unacceptable, and that the best approach would be for Mint.com to not exist as a service at all, or to only participate with banking institutions supporting this kind of federated access. They would be missing the forest for the trees and likely not succeed in meeting the needs of a business.
Both hashing and encryption are incredibly useful tools. Both have their place when manipulating user passwords. These concepts are so relevant and fundamental that they should be well understood by everyone in your technical organization, from the CIO to the project manager. Being able to openly discuss and critically examine why your application is doing what it’s doing is one of the most difficult yet most valuable competencies a technical person can possess.