27 May 2012

What are MD5 and SHA-1 Hashes



You may have seen MD5 hashes listed next to downloads during your internet travels, but what exactly are they? Let’s take a look at what these cryptic strings are and how you can use them to verify your downloads.

clip_image002

 

What Are Hashes and What Are They Used For?

clip_image004

 

Hashes, “digests,” are the products of cryptographic algorithms(in short, they’re a set of instructions used by computers to manipulate data). Many hash functions are designed to produce a fixed-length digest, regardless of the size of the input data. Take a look at the above chart and you’ll see that both “Fox” and “The red fox jumps over the blue dog” yield the same length output.

 

Another factor is complexity. Compare the second example in the above chart to the third, fourth, and fifth. You’ll see that despite a very minor change in the input data, the resulting hashes are all very different from one another. This is a sign of complexity of the algorithm (at least to our non-programmer eyes) and helps make it so that working backwards from the hash to the data is very difficult. Passwords are often stored as hashes because of this reason; it’s easy to take the password during a login attempt and compare it to the stored hash. On the the other hand, if someone has the hash, it’s very difficult to work backwards to the original input. When people try to crack passwords, they usually don’t work backwards, but instead use a dictionary of known hashes (usually of common passwords and key patterns) to compare the stolen ones with.

 

Data Verification

 

clip_image006

MD5, the Message-Digest Algorithm, has been used in multiple types of security-based programs in the past, but it’s also widely employed for another purpose: data verification. These types of algorithms work great to verify your downloads. Imagine, if you will, you’re online trying to grab the latest Ubuntu release from BitTorrent. Some horrible troublemaker starts distributing a version of the .iso you need but with malicious code embedded into it. Not just that, he’s clever, so he makes sure the files are exactly the same size. You would’t know you had the bad file until you tried to boot the CD, and by then, permanent damage could have already occurred!

clip_image008

 

You can run a hash check yourself with any number of tools, and then check it against the posted checksum. If there are any differences at all, you know that the file you have was tampered with, did not complete properly, or something else prevented the data from matching. This way you prevent any damage to your system before you run anything, and you can just re-download the appropriate file.

This comes in handy not just for Linux distros, but for other things like BIOS files, third-party Android ROMs, and router firmwares – all things that could potentially “brick” your devices if the data is tainted. In general, large files have a larger risk of data corruption, so you may want to run your own checksums if your archives are important.

MD5 is no longer considered completely secure, and so people have started to migrate to other commonly used hash algorithms like SHA-1. This last one in particular is used for data verification more and more often so most tools will work with both of these algorithms. SHA1 is a cryptographic hash function. It's result is usually expressed as a 160 bit hex number. SHA1 was developed by the NSA. SHA1 is widely considered the successor to MD5. SHA-1 is the most widely used of the existing SHA hash functions, and is employed in several widely used applications and protocols.

ABOUT THE AUTHOR
AbhiShek SinGh
Founder of 'TheHackingArticles'. Cyber Security Analyst, Cyber Security Researcher, and Software Engineer. Follow 'AbhiShek SinGh' on Facebook , Twitter or Google+ or via Email

Subscribe to stay up to date