What is hashing? (With examples)

Hashing is the practice of transforming a given key or string of characters into another value for the purpose of security. Although the terms “hashing” and “encryption” are sometimes used interchangeably, they are not the same: hashing is a one-way transformation, and hashed values are very difficult to decode. Encryption always provides a decryption key, while hashed information cannot be easily decoded and is intended to be used as a method to validate the integrity of an object or piece of data.

What is hashing?

Hashing is the practice of transforming a given key or string of characters into another value for the purpose of security. Unlike standard encryption, hashing is a one-way process, and hashed values are very difficult to decode.

Hashing algorithms and security explained. | Video: Computerphile

What is hashing used for?

Hashing is mainly used for security purposes, specifically in cybersecurity. A hashed value has many uses, but it is primarily intended to encode a plain text value so that the enclosed information cannot be exposed. Because the hashing process is non-reversible, or at least extremely difficult to reverse, it is often used as a cryptographic technique.

Some of the most common applications of hashing in cybersecurity are:

Message integrity
File integrity
Password validation
Blockchain and transaction validation

Each of these use cases relies on the core function of hashing: to prevent undetected interference or tampering with information or a file.

What is hashing in data structures?

Hashing in data structures refers to the use of a hash function to map a key to a given index, which represents the location where the key’s value is stored. Indexes and values are stored in a hash table (or hash map) data structure, which is similar in format to an array. In a hash table, each key maps to a specific index, and the table is organized so that key-value pairs can be retrieved quickly.
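As a rough illustration of mapping a key to an index, here is a minimal Python sketch. The function name bucket_index, the multiplier 31 and the table size of 8 are assumptions chosen for the example, not part of any particular library, and collisions are ignored at this stage.

```python
def bucket_index(key: str, table_size: int) -> int:
    """Map a string key to a slot index in the range [0, table_size)."""
    h = 0
    for ch in key:
        # Simple polynomial rolling hash, reduced to the table size at each step.
        h = (h * 31 + ord(ch)) % table_size
    return h


# A hash table backed by a plain list: the index says where the value lives.
table = [None] * 8
index = bucket_index("alice", len(table))
table[index] = ("alice", "alice@example.com")

# Looking the key up again recomputes the same index.
print(table[bucket_index("alice", len(table))])
```

Because the same key always produces the same index, a lookup goes straight to the right slot instead of scanning the whole table.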

What is a Hash Collision?

A hash collision occurs when two different keys generate the same index. Collisions are unavoidable when there are more keys to hash than there are slots available in the hash table, and they can also occur before the table is full. To resolve hash collisions, methods known as collision resolutions are used, with the most common methods being open addressing (closed hashing) and separate chaining (open hashing).

In open addressing, all keys and values are stored directly in the same hash table, so there remains an equal number of key and value slots and no overlap occurs. To achieve this, linear probing, quadratic probing or double hashing is used. With linear and quadratic probing, slots in a hash table are “probed,” or looked through, until an empty slot is found to store the colliding key-value pair. With double hashing, two hash functions are applied, where the second function determines the offset by which the colliding key is shifted until an empty slot is found.
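Open addressing with linear probing can be sketched roughly as follows. The class name LinearProbingTable, its fixed capacity and its simple hash function are assumptions made for illustration; deletion and resizing are deliberately left out.

```python
class LinearProbingTable:
    """Toy open-addressing hash table that resolves collisions by linear probing."""

    def __init__(self, capacity: int = 8):
        self._slots = [None] * capacity

    def _index(self, key: str) -> int:
        h = 0
        for ch in key:
            h = (h * 31 + ord(ch)) % len(self._slots)
        return h

    def put(self, key: str, value) -> int:
        """Store the pair and return the slot index that was used."""
        start = self._index(key)
        for step in range(len(self._slots)):
            i = (start + step) % len(self._slots)  # try the next slot, wrapping around
            slot = self._slots[i]
            if slot is None or slot[0] == key:
                self._slots[i] = (key, value)
                return i
        raise RuntimeError("hash table is full")

    def get(self, key: str):
        start = self._index(key)
        for step in range(len(self._slots)):
            i = (start + step) % len(self._slots)
            slot = self._slots[i]
            if slot is None:
                break  # an empty slot means the key was never stored
            if slot[0] == key:
                return slot[1]
        raise KeyError(key)


table = LinearProbingTable()
print(table.put("a", 1))  # "a" hashes to index 1 with this toy function
print(table.put("i", 2))  # "i" also hashes to index 1, so it moves to the next free slot
```

Quadratic probing and double hashing follow the same pattern and differ only in how the next slot to try is computed.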

In separate chaining, each slot in a hash table acts as a linked list, or chain. This way, one slot and index can hold multiple key-value pairs if a collision occurs. However, each index has its own separate linked list, which means more storage is required for this method.
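A minimal sketch of separate chaining might look like the following. The class name ChainedTable is hypothetical, and a Python list stands in for the linked list in each slot to keep the example short.

```python
class ChainedTable:
    """Toy hash table that resolves collisions by separate chaining."""

    def __init__(self, capacity: int = 8):
        # Every slot holds its own chain (a Python list used in place of a linked list).
        self._buckets = [[] for _ in range(capacity)]

    def _index(self, key: str) -> int:
        h = 0
        for ch in key:
            h = (h * 31 + ord(ch)) % len(self._buckets)
        return h

    def put(self, key: str, value) -> None:
        bucket = self._buckets[self._index(key)]
        for n, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[n] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))  # colliding keys share the same chain

    def get(self, key: str):
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = ChainedTable()
table.put("a", 1)
table.put("i", 2)  # lands in the same slot as "a" with this toy hash function
print(table.get("a"), table.get("i"))
```

Unlike open addressing, a collision never forces a key into a different slot; the chain in that slot simply grows.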

Hashing and Message Integrity

The integrity of an email relies on a one-way hash function, typically applied by the sender as part of a digital signature. Digital signatures provide message integrity via a public/private key pair and the use of a hashing algorithm.

To digitally sign an email, the message is run through a one-way hash function, and the resulting hash value is signed with the sender’s private key. Upon receipt, the signature is checked using the sender’s public key to recover the original hash value, and the same hashing algorithm is applied to the received message. The two hash values are then compared. A match confirms that the message has not been tampered with, while a mismatch indicates that the recipient can no longer trust the integrity of the message.
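The hash-then-sign flow can be sketched with a deliberately tiny, insecure RSA key pair. The constants N, E and D below are textbook toy values assumed purely for illustration, and the helper names are hypothetical; real systems rely on a vetted cryptographic library, much larger keys and a proper padding scheme.

```python
import hashlib

# Toy RSA parameters (assumed textbook values, far too small for real use).
N = 3233  # modulus, 61 * 53
E = 17    # public exponent
D = 2753  # private exponent


def digest_as_int(message: bytes) -> int:
    """Hash the message with SHA-256 and reduce it to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: hash the message, then sign the hash with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: recover the hash with the public key and compare it to a fresh hash."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"Meet at noon"
signature = sign(message)
print(verify(message, signature))            # the untouched message verifies
print(verify(message, (signature + 1) % N))  # an altered signature does not
```

Because the digest is reduced to a very small number here, unrelated messages could occasionally share a signature; that weakness comes from the toy key size, not from the technique itself.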

Hashing and File Integrity

Hashing works in a similar way for file integrity. Technology vendors that offer publicly available downloads often provide what are referred to as checksums. A checksum confirms that a file or program has not been modified during transmission, usually a download from a server to your local client.

Checksums are commonly used in the IT field when professionals download operating system images or software to be installed on one or more systems. To confirm that they have downloaded an unaltered version of the file, the individual will compare the checksum of the downloaded version with the checksum listed on the vendor’s website. If the two values match, the file is trusted. If they do not match, the file may not be safe and should not be used.
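The checksum comparison described above can be sketched in Python with the standard hashlib module. SHA-256 is assumed here as the vendor's algorithm, and the function names and file path are placeholders for the example.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 checksum of a file as a hexadecimal string."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so that large images do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare the local file's checksum with the one listed by the vendor."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

A typical use would be to call matches_published_checksum("downloaded.iso", value_from_vendor_site) and refuse to install the file if it returns False.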

As with digital signatures, a checksum is the output of a hashing algorithm’s application to a piece of data, in this case, a file or program. Checksums are common in the technology industry to verify files, but are also how security vendors track the reputation of files. As such, the checksums, or hash values, of malicious files are stored in security databases, creating a library of known bad files. Once a piece of malware is tagged in a reputation database and that information is shared between vendors in the industry, it is more difficult for the malicious file to successfully download or run on a protected system.

Hashing and Password Validation

Contrary to what many people may believe, when you enter your password to sign in to a device or account, the system does not directly confirm your password. Instead, it hashes what you entered and then compares it to the stored hash value that the system or back-end database has.

Historically, and unfortunately in some cases today, passwords were stored in plain text. This meant that the system or back-end server of the website you logged into stored the plain text value of your password in a file or database. As computers became common household items and the rise of the Internet led to more online activity, security researchers quickly realized that plain text passwords would not suffice when it came to information privacy and protection.

Today, most systems store hashed values of your password within their databases so that when you authenticate, the system has a way to validate your identity against a hashed version of your password.

For added security, some systems (for example, Linux-based) add a salt, which is a random string (often described as 32 characters long, although the length varies by system), to the password before hashing it. This step prevents two identical hashes from occurring as a result of two people having the same password, such as “Pa$$word123.” By adding a unique salt to each, it becomes extremely unlikely that the two hash values will be the same. Salting passwords also makes them much more difficult to crack, which is valuable in the event of a data breach.
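Salted password hashing can be sketched as follows. Rather than literally appending the salt to the password, this example uses PBKDF2 from Python's standard library, which mixes the salt in for you; the 16-byte salt, the iteration count and the function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password: str, salt=None):
    """Return (salt, digest) for a password, generating a fresh salt if none is given."""
    if salt is None:
        salt = os.urandom(16)  # unique random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Hash the attempt with the stored salt and compare it to the stored hash."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


# Two users with the same password end up with different stored hashes.
salt_a, hash_a = hash_password("Pa$$word123")
salt_b, hash_b = hash_password("Pa$$word123")
print(hash_a != hash_b)
print(verify_password("Pa$$word123", salt_a, hash_a))
```

The system stores only the salt and the digest, never the password itself, and repeats the same calculation at each login.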

A diagram illustrating a hashed password — Image: Screenshot

Hashing and Blockchain

Blockchain is a modern technology that enables efficient and immutable transactions. It now has many uses, including cryptocurrency, NFT marketplaces, international payments, and more. Blockchains work in a peer-to-peer fashion where the transactions are recorded and shared across all computers in the blockchain network. But how exactly can transactions be made immutable? By cryptographic hashing, of course.

Hashing within a blockchain works the same way as for the other use cases discussed above: A hash function is applied to a block of data to provide a hash value. The difference within a blockchain is that blockchains use nonces, which are random or semi-random numbers, and each transaction requires an additional block of data to be hashed. A nonce is a number that is used only once and serves to prevent replay attacks within a blockchain. Replay attacks occur when an attacker intercepts communications taking place over a network and then retransmits those communications from their own system. As you can guess, this can significantly affect the security of a blockchain, so using nonces helps prevent such attacks from succeeding.

As mentioned, each transaction results in a new block of data that needs to be hashed. Hash functions come into play in various ways in the continuous loop that is the blockchain.

First, each block includes the value of the hashed header from the previous block. Before adding the new transaction, the header of the previous block is validated using that hash value. Like message and file integrity, the blockchain uses hash values to perform similar validation to ensure that previous blocks of data have not been tampered with.

Once validated, the new data block is added, along with a nonce, and the hash algorithm is applied to generate a new hash value. This process creates a repeated cycle of hashing that is used to protect the integrity of the transactions.
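The cycle of hashing described in this section can be sketched as a simple chain of blocks. This is a simplified model rather than how any specific blockchain is implemented: the whole block is hashed instead of only a header, the nonce is just a random number, and the field and function names are assumptions made for the example.

```python
import hashlib
import json
import secrets


def block_hash(block: dict) -> str:
    """Hash a block's contents (serialized deterministically) with SHA-256."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def add_block(chain: list, data: str) -> dict:
    """Append a block that records the hash of the previous block and a nonce."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {"prev_hash": prev_hash, "data": data, "nonce": secrets.randbits(32)}
    chain.append(block)
    return block


def chain_is_valid(chain: list) -> bool:
    """Check that every block still points at the true hash of its predecessor."""
    for prev, current in zip(chain, chain[1:]):
        if current["prev_hash"] != block_hash(prev):
            return False
    return True


chain = []
add_block(chain, "Alice pays Bob 5")
add_block(chain, "Bob pays Carol 2")
add_block(chain, "Carol pays Dave 1")
print(chain_is_valid(chain))            # untouched chain

chain[0]["data"] = "Alice pays Bob 500"  # tamper with an early block
print(chain_is_valid(chain))            # the next block's stored hash no longer matches
```

Changing any earlier block changes its hash, which breaks the link stored in the block that follows it.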

A diagram showing hashing in cryptography — Image: Shutterstock

Hashing Origin

The idea of hashing was introduced in the early 1950s by an IBM researcher, Hans Peter Luhn. Although Luhn did not invent today’s algorithms, his work eventually led to the first forms of hashing. His colleagues presented him with a challenge: They needed to efficiently search a list of chemical compounds stored in a coded format. Luhn knew there had to be a way to improve information retrieval for cases like this, and so the process of indexing was born.

Over the next 30 years, scientists built on his invention of indexing to develop a way to codify plain text, known as hashing. Hashing requires two components: a plain text value and a hashing algorithm. Applying the algorithm against the plain text value results in a hashed output.
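The two components mentioned above, a plain text value and a hashing algorithm, can be seen in a few lines of Python. SHA-256 is picked here only as a familiar example of a modern algorithm, and the helper name sha256_hex is hypothetical.

```python
import hashlib


def sha256_hex(plain_text: str) -> str:
    """Apply the SHA-256 algorithm to a plain text value and return the hex output."""
    return hashlib.sha256(plain_text.encode("utf-8")).hexdigest()


first = sha256_hex("hello world")
second = sha256_hex("hello world")
changed = sha256_hex("hello world!")

print(first == second)   # the same input always gives the same hash
print(first == changed)  # a one-character change gives a different hash
print(len(first))        # SHA-256 output is always 64 hexadecimal characters
```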

Why hashing is important

Hashing was and still is a valuable security mechanism to render data unreadable to the human eye, to prevent its interception by malicious individuals, and to provide a way to validate its integrity. Over the years, hashing algorithms have become more secure and advanced, making it difficult for bad actors to reverse hashed values. Although hashes will always be crackable in principle, the complex mathematical operations behind them, along with the use of salts and nonces, make cracking them far less feasible without massive amounts of computing power.