Evaluating relative speed of java digest (hashing) algorithms

It’s best practice to encrypt security tokens such as passwords and sessions ids in your database. I was just doing that to the session tokens for a project at work, and I wondered which algorithm to pick if speed was the only consideration.

I cranked up some code that measures the wall-clock that it takes for each algorithm to hash a bunch of strings (10 millions). I found that MD5 is the fastest, and MD2 is the slowest, taking roughly twice the time. I would have gone for MD5 anyway as it is the standard choice, but it’s good to see that it’s quick too.

The usual disclaimers apply: this test is very un-scientific and is only relevant for my specific use case (the tokens I handle are the same length of the random strings in the test) so don’t plan your business on it. That’s pretty quick stuff anyway (15 to 30 microsecond each encryption), so you may want to pick an algorithms for its security features rather then for its execution speed. See the docs for more info.

Output:

Creating 10000000 random strings... Created.
Testing algo MD2...	Completed in 339126 milliseconds
Testing algo MD5...	Completed in 169690 milliseconds
Testing algo SHA-1...	Completed in 200398 milliseconds
Testing algo SHA-256...	Completed in 211560 milliseconds
Testing algo SHA-384...	Completed in 303999 milliseconds
Testing algo SHA-512...	Completed in 316265 milliseconds
Test Complete.

And code:

import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.SecureRandom;

public class CompareHashFunctions {

	private static SecureRandom random = new SecureRandom();

	public static void main(String[] args) throws Exception {

		int runs = 10000000;				

		System.out.print("Creating " + runs + " random strings... ");
		String salt = randomString();
		String[] strings = new String[runs];
		for (int i=0; i<strings.length; i++) {
			strings[i] = randomString();
		}

		System.out.println("Created. ");	

		runTest(salt, strings, "MD2");
		runTest(salt, strings, "MD5");
		runTest(salt, strings, "SHA-1");
		runTest(salt, strings, "SHA-256");
		runTest(salt, strings, "SHA-384");
		runTest(salt, strings, "SHA-512");		

		System.out.println("Test Complete.");
	}

	static void runTest(String salt, String[] strings, String algo) throws Exception {
		System.out.print("Testing algo " + algo + "...\t");
		MessageDigest instance = MessageDigest.getInstance(algo);
		long start = System.nanoTime();
		for (int i=0; i<strings.length; i++) {
			byte[] bytes = (salt + strings[i]).getBytes("UTF-8");
			MessageDigest clone = (MessageDigest)instance.clone();
			new String(clone.digest(bytes));
		}
		long elapsed = (System.nanoTime() - start) / 1000000;
		System.out.println("Completed in " + elapsed +  " milliseconds ");
	}		

	/**
	 * Random string
	 * @return a random string of 25 or 26 chars
	 */
	static String randomString() {
		// 130 bit random integer converted to string in base 32
		return new BigInteger(130, random).toString(32);
	}
}
This entry was posted in Uncategorized. Bookmark the permalink.
  • http://www.blurbco.com/~gork/ John Laur

    Matteo,

    Hello!

    I wanted to drop in to mention (if only for the sake of others who might read this) that for the specific use case of hashing passwords or decrypting a shared key for a stream cypher (passprhase) or similar situations, a salt+hash is actually a poor thing to do. The unfortunate effects of 30 years of fallout caused by seeing UNIX crypt() protect our most sensitive root passwords has unfortunately led many web developers into the trap of believing that salt+good hash is a reasonable storage mechanism. Consider the password breaches caused by recent exploits at Gawker, Sony, et al. — all of this data stored salted and hashed passwords, but by the time news broke and actual data began leaking into the wild, most if not all the stored hashes had been broken.

    Hashing functions are designed with two goals: 1) be as fast as possible and 2) minimize hash collisions. Salting prevents a precomputed (rainbow table) attack, but when a single modern GPU can roll through billions to trillions of possibilities per second in a brute-force attack, the “strength” of the hash is all but irrelevant. “Better” hashes are only needed to reduce the likelyhood of hash collision or eliminate flaws or weaknesses in earlier algorithms.

    So it is in fact the speed of the algorithm which creates and verifies a hash, or rather the lack of it, that makes the most difference with regards to security. So unless one wants to perform a very deep dive into cryptographic theroy, the best approach is usually to code to a standard. The current most widely used method is to use PBKDF2 (RFC 2898) to produce and store a derived key instead of a simple hash. PBKDF2 allows the selection of a few parameters, mainly the salt length, the hash algorithm, and the number of iterations. I believe 64-bit salts, 2000 iterations, and HMAC_SHA256 are around the currently accepted minimums for “strong.” Of course the password requirements themselves must be decently designed to survive a dictionary attack or brute force attach which are still possible (though many orders of magnitude slower) .

    If one wants to experiment, there are some alternatives which can be equally good if not better such as bcrypt and scrypt. But as they have not been subjected to as widespread use or scrutiny, often they are not great choices for projects that need to be autited, understood by many people, or need to be somewhat future-proof or portable.

    No matter what, it is all very interesting reading!

  • Matteo Caprari

    Hi John!
    Oh boy, the things that one doesn’t know!
    Do you have a link to share for an acceptable way of implementing password hashing? I’d like to extend the post with some timings.

    Thanks for commenting :)

  • http://caprazzi.net/posts/java-bytecode-string-concatenation-and-stringbuilder/ Java bytecode, string concatenation and StringBuilder | matteo caprari

    [...] matteo caprari Skip to content Home ← Evaluating relative speed of java digest (hashing) algorithms [...]

  • http://www.blurbco.com/~gork/ John Laur

    Personally I recommend using PBKDF2 with HMAC_SHA512, 10000 iterations, and 8 bytes of salt. There are various libraries for PBKDF2, but most do not allow you to choose a hash other than HMAC_SHA1 (it is both the default and the best choice of the algos in the official standard RFC2898; using HMAC_SHA512 is technically bending the rules). http://www.ietf.org/rfc/rfc2898.txt contains the standard. The only real tuning is the # of iterations which should be as high as your application can tolerate without ever going under 4000 or so.

    So if you are going for interoperability you can still go with HMAC_SHA1 and feel pretty good about it. This is the algorithm that is used to generate DK’s for WPA2, so until you hear WPA2 is broken PBKDF2 with HMAC_SHA1 is still a safe bet.

    To find an implementation example, just google for PBKDF2 + Your Language of Choice. I found a java one here http://www.rtner.de/software/PBKDF2.html and there is a javascript one here: http://anandam.name/pbkdf2/pbkdf2.js.txt if that is your use case.

    Or there are OpenSSL bindings for most languages, and OpenSSL has the function PKCS5_PBKDF2_HMAC_SHA1

    Most of these can be adapted to use a different hashing algorithm. Be sure you are not truncating the output of the hash function if possible as PBKDF2 allows for it to be of arbitrary length. HMAC_SHA1 is 20 bytes; HMAC_SHA512 is 64. Store the PBKDF2 parameters with the salt and hash – obscurity here is unnecessary. I use a format like:

    $PBKDF2$HMACSHA512:10000:mV8wcrQOkeg=$od7tWGZOH4JxwNyXFDPY4DgSYmTbtWO5zgoX1LgufC036F/OuRethvnJ8PaJP56pqUCXV1ClBPHt/fCMroaE1A==

    $$hashfunction:iterations:base64(salt)$base64(hash)

    The above is the resultant enoding for the DK derived from “toomanysecrets” with a full 64 byte hash. Any hash of this format can be stored in a char(128) or similar. This hash takes about 0.15s of CPU time to compute with the implementation I used on decent hardware. This is about 6 hashes/second. By comparison your straight SHA512 was operating at about 32,000 hashes/second.