Of Signups and Logins
Ever wondered what happens when you create an account on a website?
Remember the first day you went to facebook.com? You might have seen something like this:
For the purpose of this article, let’s simplify things a little and focus on the email and password fields. It’s reasonable to assume that when you sign up, those two fields get saved somewhere in a database. The next time you visit facebook.com, the site asks for your email and password, matches them against the ones in the database, verifies your identity and leads you to your profile. This works and pretty much serves the purpose, but it’s a horribly insecure implementation. To illustrate, let’s take a look at a model of the database where emails and passwords get saved.
The problem here is that passwords are saved in plaintext, so anyone who has access to the database has the login details of everyone on the site.
Let’s trust the developers of the site for the moment and assume they’re 100% honest and will never misuse the data. What would happen if an attacker gains access to the database?
He instantly gets the login details of every registered user, and with them everything those users have ever stored on the site.
So what can we do to improve this? The answer is something called hash algorithms.
Hash algorithms are a type of one-way function. They turn any amount of data into a fixed-length “fingerprint” that cannot feasibly be reversed. For example, let’s feed some data to SHA256, a standard and widely used hashing algorithm.
|Entire text of War and Peace||ac44f7eb6f2a0199f2109ec441f34a706a300fb3f528e36b538bd60ce7d94cbe|
Important things to note here:
- a) Regardless of the length of the input data, the hash is always a constant-length hexadecimal value (64 hexadecimal characters for SHA256).
- b) You cannot recreate the original text from its hash.
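To see point a) for yourself, here is a minimal Python sketch using the standard hashlib module. The helper name sha256_hex and the sample inputs are only illustrative assumptions, not something from the site being described.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Illustrative inputs: one byte versus one million bytes.
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Both digests have the same length even though the inputs differ hugely in size.
print(len(short_digest), short_digest)
print(len(long_digest), long_digest)
```

Hashing the same input again always gives the same digest, which is exactly what lets us compare hashes instead of passwords later on.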
Now that we know a little bit about hash functions let’s see how we can apply this to improve security.
When a user enters a password at signup, we generate its hash and save that instead of the password (we DO NOT save the password itself). When he tries to log in later, we generate the hash of the entered password and match it against the hash already in the database. If they match, we let him into his account. Since we no longer save the plaintext password, even if an attacker gets access to the database he cannot directly use that data to log in to other users’ accounts.
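The flow just described can be sketched in a few lines of Python. This is only a rough illustration: the users dictionary is a hypothetical stand-in for the database, and the function names are made up for this example.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for the database: email -> password hash.
users = {}


def hash_password(password: str) -> str:
    """Hash the password with SHA-256 (no salt yet, as in this step of the article)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def sign_up(email: str, password: str) -> None:
    # Only the hash is stored; the plaintext password is never saved.
    users[email] = hash_password(password)


def log_in(email: str, password: str) -> bool:
    stored_hash = users.get(email)
    if stored_hash is None:
        return False
    # compare_digest compares the two hex strings in constant time.
    return hmac.compare_digest(stored_hash, hash_password(password))


sign_up("alice@example.com", "password123")
print(log_in("alice@example.com", "password123"))
print(log_in("alice@example.com", "wrong-guess"))
```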
Now our database would look like this:
So everything’s fixed and we are done now right? Nope. Not so fast.
Even with the added security of hash algorithms, our system is far from accepted security standards. It is still prone to multiple types of attacks. To keep the article short, let’s look at one widely used attack: Rainbow Tables.
Imagine our site has millions of registered users (as Facebook does). Chances are that many of them use the same password.
You’ll be shocked by the sheer number of people who have used “password123” as their password 😀
Note that people who have the same password will also have the same hash. So if an attacker somehow recovers the password from the hash, he gains access to all the accounts that share that password. Now the question is: how can he possibly recover the password from a hash?
This is where Rainbow Tables come in. They are huge tables of pre-computed hashes for commonly used passwords. (Using one is kind of similar to looking something up in the Rainbow Pages.)
With Rainbow Tables in hand, the attacker just has to look up the hash in the table, which gives him the plaintext password. Sure, this won’t always work. But unless the user has chosen a randomly generated password (e.g. EhwvZ&Edh4i^), there’s a high chance his password is in one of the Rainbow Tables and thus exposed to the attacker.
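Here is a simplified Python sketch of that lookup. Real Rainbow Tables use a more compact structure than a plain dictionary, so treat this as an illustration of the idea only. The password list, the email addresses and the leaked data are all made up for the example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Pre-computed once by the attacker: hash -> plaintext for commonly used passwords.
common_passwords = ["password123", "123456", "qwerty", "letmein"]
lookup_table = {sha256_hex(p): p for p in common_passwords}

# Hypothetical leaked database of unsalted hashes: email -> hash.
leaked = {
    "alice@example.com": sha256_hex("password123"),
    "bob@example.com": sha256_hex("EhwvZ&Edh4i^"),
    "carol@example.com": sha256_hex("password123"),
}

# Recovering a password is a single dictionary lookup per hash.
recovered = {email: lookup_table.get(h) for email, h in leaked.items()}

for email, password in recovered.items():
    print(email, "->", password if password is not None else "not in table")
```

Notice that alice and carol share a hash, so one lookup exposes both accounts, while bob’s random password is simply not in the table.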
Now that we know hash algorithms alone are not sufficient, the next question is: what can we do to improve this?
The answer is salting.
Rainbow Tables work because each password is hashed in the same way. If two users have the same password, they’ll have identical hashes. We can prevent this by randomizing the hash, so that even when the passwords are the same, the hashes are not.
The way to achieve this is to add a randomly generated string of text, called a salt, to the password before hashing it when the user signs up.
For example, if our password is “UCSC”, our salted hashes would look like the following:
|hash(“UCSC” + “jbfsdoidfkj”)||02a7b7c5e002be8637763e4d9340f010ccebe5d737e23444e0677eb8c67e5807|
|hash(“UCSC” + “souowhej”)||10c24179e40263f786b20bf9cb41dbb051eef2a18d386cbdd8b10d485d69d21d|
|hash(“UCSC” + “dfhiuhwef”)||b75874deff10d18f471f470c673e6aaa6bae99af4423cf20942863f1f69927ba|
We save the salts in our database along with the hashes. Now when a user tries to log in, we add the salt (from our database) to his password and run it through our hashing algorithm. If this hash matches the one in the database, we let the user proceed to his account.
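Put together, a salted signup and login might look roughly like the Python sketch below. It follows the hash(password + salt) scheme from the example above. The salted_users dictionary is a hypothetical stand-in for the database, the function names are invented for illustration, and the salt here comes from Python’s secrets module, which is my assumption rather than anything the scheme requires.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory stand-in for the database: email -> (salt, salted hash).
salted_users = {}


def hash_with_salt(password: str, salt: str) -> str:
    """Compute SHA-256 over password + salt, mirroring hash("UCSC" + salt)."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def sign_up_salted(email: str, password: str) -> None:
    salt = secrets.token_hex(16)  # a fresh random salt for every user
    salted_users[email] = (salt, hash_with_salt(password, salt))


def log_in_salted(email: str, password: str) -> bool:
    record = salted_users.get(email)
    if record is None:
        return False
    salt, stored_hash = record
    # Re-hash the entered password with the stored salt and compare.
    return hmac.compare_digest(stored_hash, hash_with_salt(password, salt))


# Two users pick the same password but end up with different hashes.
sign_up_salted("alice@example.com", "UCSC")
sign_up_salted("bob@example.com", "UCSC")
print(salted_users["alice@example.com"][1])
print(salted_users["bob@example.com"][1])
print(log_in_salted("alice@example.com", "UCSC"))
```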
Now our database will look like the following:
The most important thing to remember when generating salts is that they should be unique and random. There’s absolutely no point in using the same salt for every password, because once the attacker identifies the salt he can pre-compute a Rainbow Table with it, and our hashes become vulnerable again.
A common not-so-secure-but-good-enough way to generate salts is to use a UUID (Universally Unique IDentifier). If you are using a Linux operating system, enter “uuidgen” in the terminal to get a UUID. You’ll notice that each time you run “uuidgen” it returns a different string of text.
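If you’d rather generate the salt in code, the Python sketch below shows two options. The first uses uuid.uuid4(), a rough counterpart to running uuidgen. The second uses the secrets module, which is designed for security-sensitive random values. Both function names are made up for this example.

```python
import secrets
import uuid


def salt_from_uuid() -> str:
    """Random (version 4) UUID as 32 hexadecimal characters."""
    return uuid.uuid4().hex


def salt_from_secrets(num_bytes: int = 16) -> str:
    """Salt from the operating system's secure random source, hex-encoded."""
    return secrets.token_hex(num_bytes)


# Each call returns a different value, just like repeated runs of uuidgen.
print(salt_from_uuid())
print(salt_from_uuid())
print(salt_from_secrets())
```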
And that is basically what happens behind the signup/login page of almost any secure website.