Basic concepts to understand encryption

Mateo Mojica
6 min read · Mar 8, 2024
Photo by kili wei on Unsplash

Encoding, hashing, and encryption are used every day in communications and application security. In this article, we are going to learn the basic concepts behind these techniques and how they are implemented to power modern life.

Encoding is the process of transforming data in a predictable and reversible way. It is used to escape special characters in text or to replace characters that are not permitted in a URL (such as spaces) with a representation that can be sent safely in a request. Encoding does not encrypt the data and does not make the transmitted information secure. It is only a change of format required by a technical limitation, and anyone can reverse it because the decoding algorithm is public and needs no secret information. Two of the most common encoding methods are URL encoding and Base64.
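As a small illustration of the two encodings mentioned above, here is a sketch using only Python's standard library. The sample text is made up for the example.

```python
import base64
from urllib.parse import quote, unquote

text = "hello world"

# URL encoding: characters that are not allowed become %XX escapes.
url_encoded = quote(text, safe="")
print(url_encoded)   # hello%20world

# Base64: arbitrary bytes are represented with a small ASCII alphabet.
b64_encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
print(b64_encoded)   # aGVsbG8gd29ybGQ=

# Both transformations can be reversed by anyone, no key is involved.
assert unquote(url_encoded) == text
assert base64.b64decode(b64_encoded).decode("utf-8") == text
```

Notice that decoding needs nothing but the encoded string, which is exactly why encoding offers no secrecy.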

Hashing, on the other hand, is a process where the data is digested into a short, fixed-length string (the hash, or digest) that acts as a fingerprint of that data, and the data cannot be retrieved from the hash. Hashes are convenient because the same data always produces the same hash, and the size of the data is not an issue because the length of the hash is determined by the algorithm used to create it. Two different inputs can in theory produce the same hash (a collision), but with a good algorithm finding one is impractical. This makes hashing ideal for checking the integrity of data, in other words, for making sure the data has not been changed. A widespread use is the checksum published for downloadable files: by comparing it with the hash of your download, you can confirm that your copy is the same as the original and nothing got corrupted. Another very common application is storing passwords in a database: only the hash is stored, so someone who gets access to the database cannot read the users’ passwords directly. For passwords, deliberately slow algorithms such as bcrypt, scrypt, or Argon2 are preferred over fast general-purpose hashes. The best-known hashing algorithms are the SHA family (SHA-1, SHA-256, SHA-512) and MD5, although MD5 and SHA-1 are no longer considered safe against collisions and should be avoided where security matters.
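The checksum idea can be sketched with Python's hashlib module. The helper name sha256_hex and the sample byte strings are assumptions made for this illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

original = b"contents of the downloaded file"
tampered = b"contents of the downloaded file!"

# The same input always gives the same hash.
assert sha256_hex(original) == sha256_hex(original)

# A one-character change gives a completely different hash.
assert sha256_hex(original) != sha256_hex(tampered)

# The digest length depends on the algorithm, not on the input size.
assert len(sha256_hex(b"")) == len(sha256_hex(original * 1000)) == 64
```

To check a download, you would hash the file you received and compare the result with the checksum published by the author.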

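Salts are often used together with hashes. Salting means adding a random, non-secret string, unique to each password and stored next to its hash, to the beginning or the end of the data before hashing it. Because of the salt, identical passwords produce different hashes, so an attacker cannot use a precomputed table of known hashes (a rainbow table attack) when a breach happens.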
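Here is a minimal sketch of salted password storage. Instead of concatenating the salt by hand and hashing once, it uses PBKDF2 from the standard library, which mixes the salt in and repeats the hash many times to slow attackers down. The function names and the iteration count are assumptions for the example, not a recommendation.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed value for this sketch; tune it for real systems

def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest). A fresh random salt is created when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

# Two users with the same password end up with different stored hashes.
salt_a, hash_a = hash_password("correct horse")
salt_b, hash_b = hash_password("correct horse")
assert hash_a != hash_b

assert verify_password("correct horse", salt_a, hash_a)
assert not verify_password("wrong guess", salt_a, hash_a)
```

The database would store both the salt and the digest for each user, since the salt is needed again at login time.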

Photo by saeed karimi on Unsplash

Now let’s talk about how your data is transmitted securely over the internet. For this, a process called encryption is used. In this process, the data is transformed using one or more keys that are secret or restricted, and the original data can only be recovered (the decryption process) with the right key. Without the keys the data can’t be recovered, which makes them an integral part of the process: if they are exposed, the encryption becomes useless. There are two ways of encrypting data depending on how the keys are used, and we are going to take a look at both.

The first method is symmetric encryption. This method uses a single key, usually called the secret key or shared key, for both the encryption and decryption of data, meaning the key has to be shared among all the interested parties so they can talk to each other. It works efficiently with large amounts of data, and for a given algorithm a longer key makes the encryption harder to break. The main problem is that symmetric encryption on its own offers no good way of distributing the key securely, which makes it hard to deploy at scale. Some algorithms used for this kind of encryption are AES (with 128-, 192-, or 256-bit keys), DES, and Blowfish, although DES is now considered obsolete because of its short key.
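To make the "one key for both directions" idea concrete, here is a toy sketch. It is not AES or any of the algorithms named above: it XORs the data with a keystream derived from SHA-256, only so the example runs with the standard library. Real applications should use AES through a vetted cryptography library. All function names here are made up for the illustration.

```python
import hashlib
import os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key and nonce (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)  # fresh per message so the keystream is never reused
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt(key: bytes, message: bytes) -> bytes:
    nonce, ciphertext = message[:16], message[16:]
    stream = keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

shared_key = os.urandom(32)  # both parties must already hold this key
message = encrypt(shared_key, b"meet me at noon")

# The same key that encrypted the data recovers it.
assert decrypt(shared_key, message) == b"meet me at noon"
```

The sketch also shows the distribution problem: nothing in it explains how shared_key reaches the other party safely.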

To implement encryption at scale, the key distribution problem of symmetric encryption had to be solved, and asymmetric encryption was created for this. In this type of encryption there is a pair of keys, one public and one private. The public key is distributed among the users of the system, and the private key is kept by the owner in a safe place. A trust relationship is established when two parties exchange public keys: server1 has the public key of server2 and vice versa, so you need a public key for each server you want to communicate with. Data encrypted with a public key can only be decrypted with the matching private key, which is why the public keys have to be exchanged for the communication to work. Let’s visualize it with a quick example: if server1 wants to send a message to server2, it encrypts the data with server2’s public key so that only server2 can recover it, and when server2 wants to respond, it encrypts the data with server1’s public key.
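The server1 and server2 exchange can be sketched with textbook RSA and tiny primes. This is only an illustration of the public/private relationship: the primes, the exponent, and the function names are assumptions for the example, the messages are plain integers, and real systems use keys of 2048 bits or more together with padding. It needs Python 3.8 or later for the modular inverse.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Build a toy RSA key pair from two small primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)      # private exponent: inverse of e modulo phi
    return (n, e), (n, d)    # (public key, private key)

def encrypt(public_key, m: int) -> int:
    n, e = public_key
    return pow(m, e, n)

def decrypt(private_key, c: int) -> int:
    n, d = private_key
    return pow(c, d, n)

# Each server keeps its private key and hands out its public key.
server1_public, server1_private = make_keypair(89, 97)
server2_public, server2_private = make_keypair(61, 53)

# server1 -> server2: encrypt with server2's PUBLIC key.
request = encrypt(server2_public, 65)
assert decrypt(server2_private, request) == 65

# server2 -> server1: encrypt with server1's PUBLIC key.
reply = encrypt(server1_public, 42)
assert decrypt(server1_private, reply) == 42
```

Only the holder of the matching private key can undo each encryption, so the public keys can travel over an open channel.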

Public keys are distributed to users through certificates issued by a certificate authority. This authority verifies the information provided by the owner and issues a certificate that contains the information identifying the owner of the key, an expiration date (often about a year after issuance), and the public key itself. The authority signs the certificate, so anyone who trusts the authority can check that it is genuine.

You can also use the key pair to sign data. In this flow the signature is created with the private key and can be verified by anyone who has the public key. Later on, we are going to talk about digital signatures and how they work.

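As always, there is a catch with this kind of encryption, and it is a big one: the data being encrypted can’t be larger than the key (with RSA, for example, the message has to be smaller than the key’s modulus), and the operations are far slower than symmetric ones. This makes asymmetric encryption impractical for large amounts of data.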
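The size limit is easy to see with a toy RSA key. The numbers below come from the small primes 61 and 53 and are assumptions chosen for the illustration, not a realistic key.

```python
# Toy RSA key built from the primes 61 and 53 (illustration only).
n, e, d = 3233, 17, 2753

def roundtrip(m: int) -> int:
    """Encrypt with the public exponent, then decrypt with the private one."""
    return pow(pow(m, e, n), d, n)

# A message smaller than the modulus survives the round trip.
assert roundtrip(65) == 65

# A message larger than the modulus comes back reduced modulo n: data is lost.
too_big = 5000
assert roundtrip(too_big) == too_big % n
assert roundtrip(too_big) != too_big
```

With a realistic 2048-bit key the limit is a couple of hundred bytes per operation, which is enough for a key but not for a file.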

Photo by Immo Wegmann on Unsplash

To solve these issues, a combined approach to transmitting encrypted data called hybrid encryption was created. The gist is that it uses the best of both methods: symmetric encryption, which has no data size restriction, encrypts the data with a master key (often called a session key), and asymmetric encryption shares that key between users. The sender encrypts the master key, which is far smaller than the data being sent, with the receiver’s public key and sends it over, and the receiver decrypts it with their private key and stores it. This is the method used in most communications today.
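The flow just described can be sketched end to end with toy building blocks. Both parts are stand-ins chosen so the example runs with the standard library alone: textbook RSA over two publicly known Mersenne primes (completely insecure) plays the asymmetric part, and a SHA-256 keystream XOR plays the symmetric part in place of AES. Every name below is an assumption for the illustration.

```python
import hashlib
import os

# Receiver's toy RSA key pair. These primes are public knowledge: never reuse.
P, Q, E = 2**521 - 1, 2**607 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # receiver's private exponent

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256 based keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def hybrid_encrypt(data: bytes):
    session_key = os.urandom(32)                                  # fresh key per message
    wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)   # asymmetric step
    return wrapped_key, xor_stream(session_key, data)             # symmetric step

def hybrid_decrypt(wrapped_key: int, ciphertext: bytes) -> bytes:
    session_key = pow(wrapped_key, D, N).to_bytes(32, "big")      # unwrap with private key
    return xor_stream(session_key, ciphertext)

payload = os.urandom(10_000)   # far larger than the RSA modulus could hold
wrapped, encrypted = hybrid_encrypt(payload)
assert hybrid_decrypt(wrapped, encrypted) == payload
```

Only the 32-byte session key goes through the slow, size-limited asymmetric step, while the payload of any size goes through the fast symmetric one.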

Now let’s talk about something we touched on previously: digital signatures. These signatures are used to ensure that the data is the same as the original and that it really comes from the sender, a kind of checksum that only the owner of the private key can produce. It works by creating a hash of the data, encrypting that hash with the sender’s private key, and attaching the result to the data. The receiver decrypts the data if it was encrypted and computes its hash, then decrypts the signature with the sender’s public key and compares the two hashes. If they are not the same, the data has been tampered with or was not signed by the expected sender. Digital signatures used to be reserved for sensitive information, but they are now used more and more in other kinds of applications.
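The hash-then-sign steps can be sketched with the same kind of toy RSA key. The key below is built from two publicly known Mersenne primes and is insecure by design, and real signature schemes also add padding (for example RSA-PSS) that this sketch leaves out. The function names are assumptions for the illustration.

```python
import hashlib

# Sender's toy RSA key pair (publicly known primes: illustration only).
P, Q, E = 2**521 - 1, 2**607 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # sender's private exponent

def digest_as_int(data: bytes) -> int:
    """Hash the data with SHA-256 and read the digest as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def sign(data: bytes) -> int:
    """Sender: transform the hash with the PRIVATE key."""
    return pow(digest_as_int(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Receiver: recover the hash with the PUBLIC key and compare."""
    return pow(signature, E, N) == digest_as_int(data)

document = b"transfer 10 coins to Alice"
signature = sign(document)

assert verify(document, signature)                            # untouched data passes
assert not verify(b"transfer 1000 coins to Alice", signature)  # tampering is detected
```

Anyone holding the public values N and E can run verify, but only the holder of D can produce a signature that passes.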

Thank you for reading this article and I hope you leave with a little more knowledge on this topic and that it sparked the desire to learn more about it. If you liked it give it a clap and follow me for more content about different aspects of development.
