The Complete Beginner's Guide to Cryptography (with pictures)
As a developer, you will probably agree that nothing in this world (especially in tech) is 100% secure. Whether you are building the next fancy piece of software or the next big tech giant, bugs and security issues will be part of your journey.
Some problems you will not be able to solve outright; you can only make an attacker's job harder to protect your brand and your users.
One such problem is protecting a password. You can't make it impossible to steal; you can only make it unreadable, at least for humans. But if you make it unreadable, how do systems and servers verify whether the password is right or wrong?
This is where cryptography comes in. It is not just another concept; I would rather call it an art. The word cryptography literally means the art of secret writing. So today, let's see how your password is actually secured in a database and how servers verify it.
First, we will go through a little bit of what cryptography is, then we will dive deeper into how it works and where and when it is used.
Purpose of Cryptography:
So if you're new to cryptography, let's talk about its purpose. The foundation of cryptography is simple: we typically have files or data, and we want to conceal that information by converting it with some type of key so that the data is secret, or at least more secure.
We can protect data as a whole, protect individual files, or even lock down workstations and servers by encrypting their drives. So what does cryptography bring to the table? First of all, it brings what we refer to as non-repudiation.
This means that whoever sent a message (we can sign and encrypt an email, for example) or saved a document can't later deny having sent or altered it.
"It wasn't me, I swear." No, it was you, because it was signed with your key. It also brings integrity to the table. What we mean by this is that we can tell whether the data is exactly what the source sent us.
It's the trustworthiness of the data: we can detect improper handling or unauthorized changes. It also brings authentication to the table. Sometimes we visit websites where it's important to prove that we are who we say we are, and we obviously want our credit card information checked the same way, right?
So it's definitely important to us within our industry. And, of course, confidentiality is another major thing cryptography can do for us.
Confidentiality means that we make certain resources available only to authorized users. Now, as far as how cryptography works on a very elementary level, the process goes like this.
We have plaintext, data in a form that we can read. Maybe it's a Word document, maybe it's an email. It gets encrypted using an algorithm like DES, AES, or even RSA.
After the encryption has been applied, we refer to the whole document as ciphertext. It's completely unreadable. The file is then transmitted, and on the other end it just goes through the opposite process, right?
The receiver uses the key it knows to decrypt the ciphertext, which gives us back the plaintext that the original sender wanted us to see. See? I told you this would be easy. Next, let's talk about the types of cryptography.
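To make the plaintext, ciphertext, plaintext round trip concrete, here is a minimal sketch in Python. It uses a repeating-key XOR, which is only a toy stand-in for a real algorithm like AES and is not secure. The key, the message, and the function name `xor_with_key` are made up for illustration.

```python
# Toy illustration only: a repeating-key XOR is NOT secure.
# It stands in for a real algorithm such as AES to show the flow
# plaintext -> ciphertext -> plaintext.
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


key = b"my-secret-key"        # hypothetical key known to both ends
plaintext = b"Meet me at noon"

ciphertext = xor_with_key(plaintext, key)   # sender encrypts
recovered = xor_with_key(ciphertext, key)   # receiver decrypts with the same key

print(ciphertext.hex())   # unreadable bytes, shown as hex
print(recovered)          # the original message again
```

Running the same function twice with the same key undoes the encryption, which is exactly the "opposite process" on the receiving end.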
Types of Cryptography:
Let's talk about the types of cryptography. There's not a whole lot of talking to do because, guess what, there are only two. There's what they refer to as symmetric, as well as asymmetric.
And let me tell you the difference between the two. There are huge differences, and each has its pros and cons.
When it comes to symmetric, one of its advantages is that it's very easy to use because both parties use the exact same key to encrypt and decrypt. You can see the flaw there, right?
It's also extremely fast because we're using the same key over and over. It's very easy to implement because, hey, I can come up with my secret key and distribute it to you. I can distribute it to my friends. I can distribute it to my family or even anyone I like.
But do you see the dilemma we have here? Once I start doing that, if I push the key out to 30 people, how secure is my key? Not very secure, is it? The key isn't transmitted with the data. That's one of the reasons why it's really, really fast.
All you need is a copy of the key on your system. Now here are some of the downsides. If the key isn't transmitted with the data, you've got to distribute it yourself to every party that you want to have access to the resource.
Because we're distributing these keys manually, managing them is extremely rough, and because changing a key is so painful, the keys rarely get rotated.
As a matter of fact, if I change my key, I have to update everybody. So I have to watch out for dictionary attacks because these keys have a tendency to be static. And because I may be distributing my key to several people, it's really, really hard to prove authenticity because anybody who holds the key can make a change to the resource or the data.
"But what's a better solution?" Well, the better solution is actually asymmetric.
Asymmetric is much better because, for one, it uses what they refer to as a two-key infrastructure. It's also known as public-key cryptography.
We actually have two keys. One's public, one's private. The private key you don't share with anybody. You keep it. It's typically a part of your system or stored specifically for your account.
The public key, we can give it out all we want. Put it on Facebook if you want, tweet it out if you'd like, because the public key is used for encrypting the data, but the private key is the only key that can decrypt it.
So guess what? Yeah, it's a lot more convenient because I can issue public keys like crazy. I don't care if people have them. It's also what makes what they refer to as digital signatures possible.
Another advantage is its better security. If nobody has a copy of my private key, then they're not getting a hold of the data. Now it does have some cons. It is more intensive as far as CPU utilization goes because the math behind its encryption and decryption is heavier. But for a short message, as far as slower is concerned, we're talking about half an eyeblink versus a full eyeblink.
It also has huge issues if the private key is exposed because it's the master key, right? In fact, if the private key gets exposed, we need to generate a new key pair. And so we're going to need some type of management solution in place so we can revoke keys if we need to.
Asymmetric is also susceptible to man-in-the-middle attacks, as well as brute force attacks. And, unfortunately, if the private key gets lost, you can't decrypt, and the data is effectively gone.
We can surely talk more about it some other day; for now, this is enough to know what symmetric and asymmetric encryption are.
Let's talk about ciphers. No, not sippers, not chippers, not kippers. It's pronounced "sai·fuh".
When it comes to ciphers, what we're talking about is actually an algorithm, or, if you prefer, a defined series of steps that have to be performed for encryption and decryption.
Once a message or data has been encrypted using a cipher, it is considered unreadable unless, of course, whoever's receiving the data or message knows the secret key that's required to decrypt it.
Now, believe it or not, ciphers are used everywhere, not just in our emails and our data itself. We also use them in other communication technologies like cell phones. This is what makes our cell phone calls more secure.
Now I know what you're thinking. "Is there only one cipher?" No, there are actually several different types of ciphers. We can categorize them into two groups, either classical or modern.
I know, it sounds like we're going to talk about art, but we're not. Let's first take a look at classical. When it comes to classical ciphers, the most common kind is the substitution cipher.
This is basically where each letter of the plaintext is replaced with a letter of ciphertext. Examples of classical ciphers are the Caesar cipher, ROT13, the affine cipher, the simple substitution cipher, etc.
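The substitution idea is easy to try out. Below is a small sketch of a Caesar cipher in Python, plus ROT13 through the standard `codecs` module. The function name `caesar` and the sample text are just for illustration, and classical ciphers like these are trivially breakable, so treat this as a demo only.

```python
import codecs
import string


def caesar(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, leaving other characters alone."""
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    k = shift % 26
    table = str.maketrans(lower + upper,
                          lower[k:] + lower[:k] + upper[k:] + upper[:k])
    return text.translate(table)


secret = caesar("HELLO", 3)    # each letter moves 3 places forward
print(secret)                  # KHOOR
print(caesar(secret, -3))      # HELLO

# ROT13 is a Caesar cipher with a shift of 13; applying it twice undoes it.
print(codecs.encode("Hello", "rot13"))   # Uryyb
```

Here the "key" is just the shift value, so there are only 25 useful keys to try, which is why nobody protects real data this way.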
The second group is modern ciphers. They are more complex and much more difficult to crack. They are here to provide confidentiality, as well as the integrity and authenticity of the data and its sender. Modern ciphers include symmetric ciphers, asymmetric ciphers, block ciphers, stream ciphers, etc.
I guess we can talk more about ciphers another day; for the basics, this is enough.
Let's take a look at some other algorithms. The most common ones that we will see are going to be:
- Triple DES or 3DES
Let's take a look at DES. Now DES is actually short for Data Encryption Standard. This bad boy was created back in the 1970s by IBM. It utilizes a 64-bit block. On top of that, it only uses a 56-bit key for encryption and decryption, which is not that great.
In fact, by 1999 a DES key had been publicly brute-forced in under a day. You've got to think about, first of all, were you even born, and what computing power did we have back then? Not very much.
DES provides about 72 quadrillion (2 to the power of 56) possible encryption keys. And we were still able to crack it.
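Where does "72 quadrillion" come from? It's just the number of possible 56-bit keys. The quick calculation below shows it, along with a rough brute-force estimate. The attacker speed of one billion keys per second is purely an assumption for illustration, not a measured figure.

```python
DES_KEY_BITS = 56
keyspace = 2 ** DES_KEY_BITS
print(f"{keyspace:,}")        # 72,057,594,037,927,936 possible keys

# Hypothetical attacker speed, chosen only for illustration.
keys_per_second = 10 ** 9
seconds = keyspace / keys_per_second
years = seconds / 86_400 / 365
print(f"{years:.1f} years to try every key at that rate")
```

On average an attacker finds the key after trying half the key space, and every extra machine or faster chip cuts the time further, which is why a 56-bit key stopped being enough.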
To cover this vulnerability, which could obviously be brute-forced far too easily, we came up with triple DES, or as some people call it, 3DES. And if you can't figure it out, the reason it's called triple DES is that it's DES times three.
What it does is run the DES algorithm three times with three keys. Triple DES uses what they refer to as a key bundle, which consists of keys one, two, and three, or K1, K2, and K3.
Each key is a 56-bit DES key, and the keys are used like this. First, DES encrypts using K1. Then DES decrypts with K2. Then DES encrypts again with K3. As far as how these keys are chosen, you have a couple of different options.
The first option is that all three keys are completely different, or independent. The second option is that K1 and K3 are identical, since they're both used for encryption, while K2 is different.
And, actually, you know what, there's a third option, which is where all three keys are the same, but that's not very secure. The most secure is the first option, where all three keys are independent.
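The encrypt-decrypt-encrypt order can be sketched in a few lines. Python's standard library has no DES, so the block below uses a deliberately trivial placeholder (`toy_encrypt` and `toy_decrypt`, which just add and subtract the key) in place of real DES. Only the structure of the key bundle is meant to be accurate, and all names are hypothetical.

```python
MASK = (1 << 64) - 1   # pretend blocks are 64 bits wide, like DES


def toy_encrypt(block: int, key: int) -> int:
    """Placeholder for single DES encryption (NOT real DES)."""
    return (block + key) & MASK


def toy_decrypt(block: int, key: int) -> int:
    """Inverse of toy_encrypt."""
    return (block - key) & MASK


def ede_encrypt(block: int, k1: int, k2: int, k3: int) -> int:
    """Triple-DES ordering: encrypt with K1, decrypt with K2, encrypt with K3."""
    return toy_encrypt(toy_decrypt(toy_encrypt(block, k1), k2), k3)


def ede_decrypt(block: int, k1: int, k2: int, k3: int) -> int:
    """Undo the steps in reverse: decrypt K3, encrypt K2, decrypt K1."""
    return toy_decrypt(toy_encrypt(toy_decrypt(block, k3), k2), k1)


block = 0x0123456789ABCDEF
k1, k2, k3 = 111, 222, 333          # option 1: three independent keys

ciphertext = ede_encrypt(block, k1, k2, k3)
print(ede_decrypt(ciphertext, k1, k2, k3) == block)    # True

# Option 3: all keys equal. The first two steps cancel out,
# so the whole thing collapses to a single encryption.
print(ede_encrypt(block, k1, k1, k1) == toy_encrypt(block, k1))   # True
```

The last line shows why the all-keys-equal option is no stronger than plain DES: the middle decryption simply undoes the first encryption.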
Next up on our list is AES. AES, which stands for Advanced Encryption Standard, is actually kind of a grownup compared to DES and triple DES.
It uses a 128-bit block size, and its keys are 128, 192, or 256 bits in size. I bet you can't guess what they call each one of those, right? AES-128, AES-192, and AES-256.
Now AES is a symmetric-key algorithm that was selected through a public competition run by the National Institute of Standards and Technology (NIST). As far as government use goes, the US government uses it to encrypt sensitive unclassified data, and it is approved for classified data as well.
We then have what they refer to as RC4, which is short for Rivest Cipher 4. Some people read the "R" as "Ron".
Ron is the first name of the gentleman who created it, Ron Rivest, but the name is actually Rivest Cipher 4. Now, this particular cipher is a symmetric-key stream cipher with a variable key size. And let's be honest, it's not that great because, guess what, we used to use it in WEP, yeah, WEP, the wireless security protocol that's been hacked very, very easily.
The reason RC4 was open to different types of attacks is that it had what they refer to as biased output. One example: if the third byte of the original state is 0 and the second byte is not equal to 2, then the second output byte is always 0.
Now that may seem complicated, but trust me, once we see patterns like this, we're able to go through, reverse engineer the cipher, and crack it. And that's what happened to WEP. It wasn't until we started using WPA2, which uses AES, that things got a little more secure in our wireless environment.
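RC4 is small enough to write out in full, which makes it a handy way to see what a stream cipher does: it turns a key into a stream of pseudo-random bytes and XORs them with the data. The sketch below is a plain textbook-style implementation for learning purposes only; the function name `rc4` is my own, and RC4 should not be used to protect real data.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (educational only, RC4 is broken)."""
    # Key scheduling: shuffle the 256-byte state using the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Keystream generation: one pseudo-random byte per data byte,
    # XORed with the data.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())            # bbf316e8d940af0ad3
print(rc4(b"Key", ciphertext))     # b'Plaintext'
```

Because encryption is just an XOR with the keystream, calling the same function again with the same key decrypts the data.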
Blowfish & Twofish:
Here are two other algorithms that we should be familiar with, the first being what they refer to as Blowfish. The Blowfish algorithm was developed back in 1993, and I know you're thinking, "That's a long time ago." But it's actually quite a strong symmetric block cipher that is still in use today. It uses the same key to encrypt and decrypt.
It utilizes a 64-bit block, and its key is variable, anywhere from 32 to 448 bits. This particular cipher was designed to replace DES and triple DES. Now, a lot of people simply prefer AES, and there's nothing wrong with that, but you need to know that one of the advantages of Blowfish is its speed in software. Keep in mind that it works on a 64-bit block, whereas AES uses a 128-bit block, and the smaller block makes Blowfish less suitable for encrypting very large amounts of data.
Now we also have something called Twofish. Twofish was created a few years after Blowfish. It was introduced back in 1998, and it's relatively close to the Blowfish cipher.
That's why we have a similar name here, but it encrypts data in 128-bit blocks just like AES. In fact, Twofish is very similar to AES because it supports 128-, 192-, and 256-bit keys and, being symmetric, uses a single key. Another reason why both of these ciphers are extremely popular is that they haven't been patented.
They're open. In fact, Twofish was one of the five finalists in the competition to replace DES for the US government.
Standards and Protocols:
Let's now talk about standards and protocols. One of the most common standards that we see out there is something referred to as DSA. DSA is short for the Digital Signature Algorithm. That makes sense, right? It's actually a Federal Information Processing Standard for creating digital signatures.
The way this signature standard works is that it creates a 320-bit digital signature, with key sizes starting at 512 bits, and it relies on our private and public key process. Again, the private key is used by the person signing the document or the information, and the public key is then used to verify that the signature really is from the source that it says it's from.
Another standard that's out there is referred to as RSA. RSA stands for the initials of its creators. Hey, you know what, the R is actually the same Ron Rivest from RC4, along with two other gentlemen, Shamir and Adleman. It also uses a public-key encryption system, and it uses two large prime numbers as its basis.
Now we see RSA as a standard in a lot of our operating systems today. Microsoft, Apple, Novell (I think they still have an operating system), Sun Microsystems, and it is also used in networking cards, smartcards, even what we refer to as hardware-secured phones. Now, this may make your brain hurt a little bit (that's why I have included some videos for you to watch), but let me give you a brief synopsis of how this actually works.
First of all, two large prime numbers are taken, and we're going to call them a and b. Their product is c, so c is equal to a times b, and c is referred to as the modulus.
RSA then chooses another number, e, so that it is less than c and relatively prime to (a-1) times (b-1). In other words, e and (a-1) times (b-1) have no common factors except, obviously, 1.
I know, it keeps getting better, because then RSA chooses another number, f, so that e times f, minus 1, is divisible by (a-1) times (b-1). Now some of you algebra folks are following right along.
But wait, there's more. The values of e and f are referred to as the public and private exponents. The public key is the pair c and e, and the private key is the pair c and f.
The reason this is so secure is that it's considered extremely difficult to obtain the private key, c and f, from the public key, c and e. Now, obviously, if someone were able to factor c into a and b, that person could work out the private key, but when the two primes are large and chosen at random, the odds of anybody doing that are extremely small.
So, again, a lot more secure for us. Now that's just a highlight of RSA and how it operates. If you want more detail, I'd recommend jumping online and going deeper into RSA.
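If the algebra is hard to follow, here it is with tiny numbers, using the same letters a, b, c, e, and f. This is a toy sketch: the primes 61 and 53 and the message 65 are picked purely for illustration, real RSA uses primes hundreds of digits long plus padding, and `pow(e, -1, phi)` needs Python 3.8 or newer.

```python
from math import gcd

a, b = 61, 53              # two (tiny) prime numbers
c = a * b                  # the modulus: 3233
phi = (a - 1) * (b - 1)    # (a-1) times (b-1): 3120

e = 17                     # public exponent, less than c
assert gcd(e, phi) == 1    # e shares no factor with (a-1)(b-1) except 1

f = pow(e, -1, phi)        # private exponent: 2753
assert (e * f - 1) % phi == 0   # e*f - 1 is divisible by (a-1)(b-1)

message = 65
ciphertext = pow(message, e, c)     # encrypt with the public key (c, e)
recovered = pow(ciphertext, f, c)   # decrypt with the private key (c, f)

print(ciphertext)   # 2790
print(recovered)    # 65
```

Anyone who can factor 3233 back into 61 and 53 can recompute f, which is trivial here and is exactly what becomes infeasible when a and b are huge.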
And then we also have what they refer to as the Diffie-Hellman protocol. It's a cryptographic protocol that provides a method of securely exchanging cryptographic keys over a public channel. It was one of the first public-key protocols ever created.
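Here is a rough sketch of the Diffie-Hellman idea with toy numbers. The names Alice and Bob, the prime 23, and the base 5 are illustrative assumptions; real deployments use very large primes or elliptic curves, and they authenticate the exchange as well.

```python
p, g = 23, 5          # public values: a small prime modulus and a base

alice_secret = 4      # private, never sent
bob_secret = 3        # private, never sent

# Each side sends only its public value over the open channel.
alice_public = pow(g, alice_secret, p)   # 4
bob_public = pow(g, bob_secret, p)       # 10

# Each side combines the other's public value with its own secret.
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)

print(alice_shared, bob_shared)   # 18 18
```

Both sides end up with the same number without ever sending it, and that shared number can then be turned into a symmetric key.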
There is another one called ECC, which stands for Elliptic Curve Cryptography, but it is too complicated to cover here. If you want to know more about it, here is an awesome article by Cloudflare.
Let's talk about hashes. When it comes to hashes, what we're trying to do is come up with a way to verify that a particular document hasn't changed. We do that by running the document through a hash function, which gives us a kind of digital fingerprint of it.
That fingerprint is represented by a hexadecimal code. For a document, an executable, or an email, if the hash we calculate is equal to the published number, then we know that it's legit and nobody has changed it.
But if somehow the document or the executable has been changed, the hash value won't match anymore.
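You can watch this happen with Python's built-in `hashlib` module. The snippet below is a minimal sketch that uses SHA-256 as the hash function (my choice for the example) and two made-up documents that differ by a single character.

```python
import hashlib

document = b"Pay Alice 10 dollars"
tampered = b"Pay Alice 90 dollars"   # one character changed

original_hash = hashlib.sha256(document).hexdigest()
tampered_hash = hashlib.sha256(tampered).hexdigest()

print(original_hash)   # 64 hexadecimal characters
print(tampered_hash)   # a completely different 64 characters

# Verification: recompute the hash and compare it with the published one.
print(hashlib.sha256(document).hexdigest() == original_hash)   # True
print(tampered_hash == original_hash)                          # False
```

The same input always produces the same hash, while even a tiny change produces a totally different one, which is what makes the comparison useful.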
There are different kinds of hashes, like:
- Keccak and SHA-3
Unfortunately, we are not going to talk about them here; we will discuss them in depth another day.
Hashes designed for Messages:
So now that we've got our hands on hashes, there are some constructions that are designed, in particular, for messages.
We first start off with what they refer to as HMAC, which is short for Hash-based Message Authentication Code. This particular algorithm is built around an embedded hash function like SHA-1 or MD5.
Its strength depends on the embedded hash function, the key size, and the size of the hash output. HMAC operates in two stages, and for those stages two keys are derived from the secret key.
There's one inner key and one outer key. The first pass of the algorithm produces an internal hash derived from the message and the inner key. The second pass produces the final HMAC code, derived from the inner hash result and the outer key.
Now HMAC itself doesn't encrypt the message. Instead, the message, whether it's encrypted or not, is sent alongside the HMAC hash. This way, both parties can use the secret key to verify that the message is coming from who it says it's coming from, that is, that it's authentic.
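The two passes described above can be written out by hand and checked against Python's built-in `hmac` module. This sketch uses SHA-256 as the embedded hash rather than SHA-1 or MD5 (my choice for the example), and the key, the message, and the function name `manual_hmac_sha256` are made up for illustration. In real code you would simply call `hmac.new`.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes data in 64-byte blocks


def manual_hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC written out by hand to show the inner and outer passes."""
    if len(key) > BLOCK_SIZE:            # overly long keys are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")  # then padded to the block size

    inner_key = bytes(k ^ 0x36 for k in key)   # key XOR inner pad
    outer_key = bytes(k ^ 0x5C for k in key)   # key XOR outer pad

    inner_hash = hashlib.sha256(inner_key + message).digest()   # first pass
    return hashlib.sha256(outer_key + inner_hash).digest()      # second pass


key = b"shared-secret"               # hypothetical key known to both parties
message = b"transfer 100 to bob"

tag = manual_hmac_sha256(key, message)
library_tag = hmac.new(key, message, hashlib.sha256).digest()

# The receiver recomputes the tag and compares it in constant time.
print(hmac.compare_digest(tag, library_tag))   # True
```

The sender transmits the message together with the tag; a receiver holding the same key recomputes the tag and rejects the message if the two don't match.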
CHAP, which is short for Challenge-Handshake Authentication Protocol, is an authentication mechanism that we use over PPP, and it works through a three-way handshake.
Now one of the cool things about CHAP is that it provides a way to protect yourself against what we refer to as replay attacks. The downside is that CHAP uses that shared key thing, where both the client and the server have to know the plaintext of the secret.
Now the funny thing here is that Microsoft came out with its own version of CHAP, called MS-CHAP, that doesn't require either the sender or the receiver to know the plaintext and doesn't transmit it, but it's also been hacked.
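The CHAP handshake can be sketched in a few lines of Python. This follows the basic idea of standard CHAP, where the response is an MD5 hash over the identifier, the shared secret, and the challenge. It is a simplified illustration rather than a PPP implementation, and the function name `chap_response` and the secret are made up.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Hash the identifier, the shared secret, and the challenge together."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


secret = b"shared-secret"   # hypothetical secret known to client and server

# Step 1: the server sends a fresh random challenge.
identifier = 1
challenge = os.urandom(16)

# Step 2: the client answers with a hash; the secret itself is never sent.
response = chap_response(identifier, secret, challenge)

# Step 3: the server computes the same hash and compares.
expected = chap_response(identifier, secret, challenge)
ok = hmac.compare_digest(response, expected)
print(ok)   # True

# Replaying the old response against a new challenge fails.
new_challenge = os.urandom(16)
replay_ok = hmac.compare_digest(response, chap_response(2, secret, new_challenge))
print(replay_ok)   # False
```

Because every challenge is random, a response captured off the wire is useless the next time around, and that is where the replay protection comes from.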
Next, we have EAP, which is short for Extensible Authentication Protocol. This particular protocol was originally designed for point-to-point communications. It is used as an alternative to CHAP, as well as PAP.
I didn't put PAP up here because it's so outdated. But EAP itself is more secure and supports different authentication mechanisms, such as one-time passwords, standard passwords, and smart tokens. So, for example, a security card, a digital certificate, or even public-key encryption could be used as well.
So in this article, we got an introduction to cryptography, right? We talked about the purpose of cryptography and the different types of cryptography that are out there. We also talked about ciphers (not sippers, not siffers), which are, again, just the different algorithms that cryptography uses. Some of them are old school. Some are more up to date and give you better security.
We then talked about DES and AES. With DES, we also covered triple DES and the issues those two have. AES, of course, is one of the more secure standards out there. We then went through some of the other algorithms available to us: RC4, as well as Blowfish and Twofish.
Then we got into some of the standards and protocols, things like the digital signature standards DSA and RSA, and the Diffie-Hellman protocol. And, of course, we got into hashes and what they do for us. So that's it for today. I hope we will meet again in another article. Till then, have a nice day...