New secure protocol using 512 bit hashes for encryption/decr

# New secure protocol using 512 bit hashes for encryption/decryption #
By Michael J
7 September 2013
v1.0

Suppose two parties want to communicate securely with each other (Bob and Alice) using a simple messaging system.

# The protocol #

1.    Bob creates a random private key of 512 bits in length e.g. '0a3042ea1ebe234e375f5fe2beecf8149be92274b09b7e2a05820fea6648346a15c970cb477eb854bee91bfbe3459b0c404dff1d67e8a29d1cd83028e56c91ef'

2.    Bob also chooses a random hash function from a group of cryptographic hash functions that all output 512 bits. These could be SHA-2, SHA-3, Grøstl, BLAKE, JH, Whirlpool and Skein hash functions. There are more 512 bit hashes that were discarded as part of the recent NIST competition, not that they had any real flaws, so they could be used as well. More implementation work however.

3.    Bob gives the key and chosen hashing method to Alice (either in person or using some other key exchange method).

4.    Using a program, Bob writes a message to Alice. The plain text message is converted to hexadecimal format (i.e. symbols '0-9' & 'a-f' for a total of 16 symbols). This is then split up into the separate hexadecimal symbols e.g. 'a','f','0','b','2','c','3' etc.

5.    A random nonce is created for each message. This will be 512 bits and sent with the message e.g.
'b71cf88793f5bb3f067d69d36b2cf28aab3cf7dc1bb01b8c42ceb5a17ff30da7e34d6be1e43b38ff76c9099381fd386296455b22e9b139d75789d842e3a85473'. Really a simple message counter would be fine, as that will change the output of the hash, however it would reveal how many messages have been sent so far, therefore a random nonce is better.

6.    A counter is prepended to each hexadecimal symbol in the message. This would start at 0 and increase by 1 for each hexadecimal symbol through to the last hexadecimal symbol in the message (i=0 ... n). For example, '0a','1f','20','3b','42','5c','63'. This is used so that each hexadecimal symbol is encrypted differently in the message. One of the properties of a hash function is that even a small change in the input will change the output dramatically.

7.    To encrypt the message, each hexadecimal symbol in the message is encrypted separately. Bob computes HMAC(key, message). The breakdown of the 'message' will be:

    'message nonce || counter || message hex symbol'. The || indicates concatenation. So effectively for each hex symbol, Bob is computing:

    'HMAC(key, nonce || counter || message hex symbol)'. This will produce 512 bits of output for each hexadecimal symbol in the message. Encryption speed will depend on the size of the message, but will be very fast in comparison to decryption.

8.    All the encrypted 512 bit outputs are concatenated together and this forms the ciphertext. Each encrypted hex symbol in the message will have 512 bits of output.

9.    A message authentication code is computed on the text using HMAC(key, nonce || ciphertext) and sent with the message e.g. 'nonce || ciphertext || MAC'. The message should now be indistinguishable from random data.

10.    Alice receives the encrypted message. Grabs the last 512 bits of the message which is the MAC. Then validates the MAC with her copy of the key and the hash algorithm using HMAC(key, nonce || ciphertext). Incorrect MACs mean the message is discarded immediately.

11.    Alice's program removes the first 512 bits of the sent message which is the nonce and stores it in a variable. The message ciphertext is split up into the separate hashed outputs for each hexadecimal symbol of the message. It does this because it knows each hex symbol was hashed into a fixed length 512 bit hash. Output would be 'hash','hash','hash' ...

12.    For decryption, Alice gets the first 512 bit hash output and tries out each hexadecimal symbol '0-9, a-f' to see if she can get a match with the encrypted hash. In other words she computes 'HMAC(key, message nonce || counter || message hex symbol)'. Once a match is found, this means that the message hex symbol is part of the message, so she keeps that and continues with the next hash so she can keep rebuilding the plaintext. The next hash will increment the counter by 1 and try to find a match with the output again. Once all hexadecimal symbols have been recovered they can be converted back to plaintext. Worst case scenario she has to try 16 times to get a match for each hex symbol. Best case is she finds the match on first try. The checking for each hash should break once a match is found as there is no need to try all 16 hex symbols if a match is found early. This may speed up the decryption process.

It is unfeasible for an attacker to try checking the hashes using the same approach because they don't have the 512 bit key. Also the nonce changes per message so even if the same plaintext was sent twice, the ciphertext would be completely different because of the nonce. Also each hexadecimal symbol in the message will be encrypted differently due to the counter as well, preventing frequency analysis.

#Performance analysis:#

I have done some calculations based on a Intel Core i5-2500 @3.3 Ghz which has 4 CPU cores. The time to create a HMAC is 25 milliseconds using one CPU core. Using multiple cores it would be possible to split up the load and encrypt/decrypt parts of the message in parallel. Based on a message size of 100 plaintext characters (200 hexadecimal symbols) I calculated the following:

Encryption:

    25 milliseconds per hash x 200 hex symbols
    = 4400 milliseconds
    = 4.4 seconds for 1 CPU core
    = 1.1 seconds for 4 CPU cores

Decryption:

    16 hex symbols x 25 milliseconds per hash x 200 hex symbols
    = 80000 milliseconds
    = 80 seconds for 1 CPU core
    = 20 seconds for 4 CPU cores

Using a Core i7 with Hyper-Threading or an Extreme Edition 6 core processor the time to decrypt would be reduced even further, perhaps to 13 seconds or lower. Using multiple server CPU chips, even faster. Some of the newer hashing algorithms also make use of the AES instructions on newer Intel chips which would dramatically increase the speed. Shorter messages are also a lot faster to encrypt and decrypt. Longer messages will be slower to encrypt and decrypt. Even at 20 seconds for decryption it's still feasible to use it for some kind of delayed messaging, e.g. similar to email where you don't need to read a message instantly. With Moore's Law and increasing CPU speeds, this scheme could be much faster in a few years while still remaining completely infeasible to brute force attacks for powerful attackers.

File size:

    200 hex symbols * 512 bits
    = 102400 bits
    = 12800 Bytes
    = 12.5 Kilobytes

As you can see the encrypted output is much longer than the message itself. But this is not really a large amount considering today's networking and storage capabilities. The ciphertext can be discarded once decrypted anyway.

#Security features:#

 - The strength of the protocol relies on the difficulty in computing a pre-image of the HMAC hash of one of the hashed words to find the key. Alternatively an attacker could brute force attempt all the key combinations to get a match. This would be $2^{512}$ for a standard computer using a brute force search or $2^{512}$ using Grover's algorithm on a quantum computer which is quite unfeasible for even the most powerful attackers. HMAC appears to be a secure construct, no pre-image attacks have been found for even the weakest hashes by today's standards HMAC-MD5 or HMAC-SHA1.

 - There is also security in that the attacker does not know which hash function that was used by either party. They will have to try brute forcing using all available 512 bit hash algorithms. This adds a further 8 bits of security (2<sup>3</sup>). It also discourages an attacker from trying to construct a rainbow table for all these algorithms. The 512bit key size makes calculating and storing all possible hash outputs infeasible even with supercomputers.

 - There's no actual 'decryption' going on. It's simply comparing the results of a one-way hash to another. Without the 512 bit key they can't compute the same hash.

 - If the nonce, ciphertext and MAC are concatenated as a single string and sent, then the output is indistinguishable from a random string.

 - The same key can be used for multiple messages, or given to multiple parties to communicate.

 - Potentially any data including binary could be sent within the protocol if it's converted to hexadecimal first.

#Licence:#
Algorithm/protocol is public domain.