hashes.md5

The MD5 algorithm is a hash function that’s commonly used as a checksum to detect data corruption. The algorithm processes a given message in blocks of 512 bits, padding the message as needed. Each block updates a 128-bit state (four 32-bit words) through a total of 64 operations. Note that all values are little-endian, so inputs are converted as needed.
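As a rough illustration of the checksum use case, the sketch below relies on the standard library's `hashlib` rather than this module. The helper name `md5_checksum` and the sample messages are assumptions made for the example.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the hex MD5 digest of data (hypothetical helper, not part of hashes.md5)."""
    # usedforsecurity=False (Python 3.9+) marks this as a non-security use.
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


original = b"The quick brown fox jumps over the lazy dog"
corrupted = b"The quick brown fox jumps over the lazy cog"  # one byte changed

# A single changed byte yields a different digest, which flags the corruption.
print(md5_checksum(original))
print(md5_checksum(original) == md5_checksum(corrupted))
```

Comparing a stored digest with a freshly computed one is enough to detect accidental changes, but not deliberate tampering.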

Although MD5 was used as a cryptographic hash function in the past, it has since been broken (practical collision attacks exist), so it shouldn’t be used for security purposes.

For more info, see https://en.wikipedia.org/wiki/MD5

Functions

get_block_words(→ collections.abc.Generator[list[int]])

Splits bit string into blocks of 512 chars and yields each block as a list of 32-bit words

left_rotate_32(→ int)

Rotate the bits of a given int left by a given amount.

md5_me(→ bytes)

Returns the 32-char MD5 hash of a given message.

not_32(→ int)

Perform bitwise NOT on given int.

preprocess(→ bytes)

Preprocesses the message string: converts it to a bit string and pads it to a multiple of 512 chars.

reformat_hex(→ bytes)

Converts the given non-negative integer to hex string.

sum_32(→ int)

Add two numbers as 32-bit ints.

to_little_endian(→ bytes)

Converts the given string to little-endian in groups of 8 chars.

Module Contents

hashes.md5.get_block_words(bit_string: bytes) → collections.abc.Generator[list[int]]

Splits bit string into blocks of 512 chars and yields each block as a list of 32-bit words

Example: Suppose the input is the following:
bit_string =

    “000000000…0” +  # 0x00 (32 bits, padded to the right)
    “000000010…0” +  # 0x01 (32 bits, padded to the right)
    “000000100…0” +  # 0x02 (32 bits, padded to the right)
    “000000110…0” +  # 0x03 (32 bits, padded to the right)
    …
    “000011110…0”    # 0x0f (32 bits, padded to the right)

Then len(bit_string) == 512, so there’ll be 1 block. The block is split into 32-bit words, and each word is converted to little-endian. The first word is interpreted as 0 in decimal, the second word is interpreted as 1 in decimal, etc.

Thus, block_words == [[0, 1, 2, 3, …, 15]].

Arguments:

bit_string {[bytes]} – [bit string whose length is a multiple of 512]

Raises:

ValueError – [length of bit string isn’t a multiple of 512]

Yields:

a list of 16 32-bit words

>>> test_string = ("".join(format(n << 24, "032b") for n in range(16))
...                  .encode("utf-8"))
>>> list(get_block_words(test_string))
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]]
>>> list(get_block_words(test_string * 4)) == [list(range(16))] * 4
True
>>> list(get_block_words(b"1" * 512)) == [[4294967295] * 16]
True
>>> list(get_block_words(b""))
[]
>>> list(get_block_words(b"1111"))
Traceback (most recent call last):
...
ValueError: Input must have length that's a multiple of 512
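The splitting described above can be sketched as follows. This is a minimal illustration consistent with the doctests, not necessarily the module's actual implementation; the name `get_block_words_sketch` is an assumption.

```python
from collections.abc import Iterator


def get_block_words_sketch(bit_string: bytes) -> Iterator[list[int]]:
    """Yield each 512-char block as sixteen little-endian 32-bit words (illustrative)."""
    if len(bit_string) % 512 != 0:
        raise ValueError("Input must have length that's a multiple of 512")
    for pos in range(0, len(bit_string), 512):
        block = bit_string[pos : pos + 512]
        words = []
        for i in range(0, 512, 32):
            word_bits = block[i : i + 32]
            # Reverse the four 8-char groups to read the word as little-endian.
            little = b"".join(word_bits[j : j + 8] for j in (24, 16, 8, 0))
            words.append(int(little, 2))
        yield words
```

Because this is a generator, the length check only runs once iteration starts, which is why the doctest wraps the call in `list(...)`.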

hashes.md5.left_rotate_32(i: int, shift: int) → int

Rotate the bits of a given int left by a given amount.

Arguments:

i {[int]} – [given int]

shift {[int]} – [shift amount]

Raises:

ValueError – [either given int or shift is negative]

Returns:

i rotated to the left by shift bits

>>> left_rotate_32(1234, 1)
2468
>>> left_rotate_32(1111, 4)
17776
>>> left_rotate_32(2147483648, 1)
1
>>> left_rotate_32(2147483648, 3)
4
>>> left_rotate_32(4294967295, 4)
4294967295
>>> left_rotate_32(1234, 0)
1234
>>> left_rotate_32(0, 0)
0
>>> left_rotate_32(-1, 0)
Traceback (most recent call last):
...
ValueError: Input must be non-negative
>>> left_rotate_32(0, -1)
Traceback (most recent call last):
...
ValueError: Shift must be non-negative
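A 32-bit left rotation can be sketched with two shifts and a mask. The version below is an illustrative assumption (including the name `left_rotate_32_sketch` and the choice to reduce the shift modulo 32); it reproduces the doctest values above.

```python
MASK_32 = 0xFFFFFFFF  # keeps only the low 32 bits


def left_rotate_32_sketch(i: int, shift: int) -> int:
    """Rotate the low 32 bits of i left by shift bits (illustrative)."""
    if i < 0:
        raise ValueError("Input must be non-negative")
    if shift < 0:
        raise ValueError("Shift must be non-negative")
    i &= MASK_32
    shift %= 32  # assumption: rotating by 32 is the same as not rotating
    # Bits shifted out on the left re-enter on the right.
    return ((i << shift) | (i >> (32 - shift))) & MASK_32
```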

hashes.md5.md5_me(message: bytes) → bytes

Returns the 32-char MD5 hash of a given message.

Reference: https://en.wikipedia.org/wiki/MD5#Algorithm

Arguments:

message {[bytes]} – [message to hash]

Returns:

32-char MD5 hash string

>>> md5_me(b"")
b'd41d8cd98f00b204e9800998ecf8427e'
>>> md5_me(b"The quick brown fox jumps over the lazy dog")
b'9e107d9d372bb6826bd81d3542a419d6'
>>> md5_me(b"The quick brown fox jumps over the lazy dog.")
b'e4d909c290d0fb1ca068ffaddf22cbd0'
>>> import hashlib
>>> from string import ascii_letters
>>> msgs = [b"", ascii_letters.encode("utf-8"), "Üñîçø∂é".encode("utf-8"),
...         b"The quick brown fox jumps over the lazy dog."]
>>> all(md5_me(msg) == hashlib.md5(msg).hexdigest().encode("utf-8") for msg in msgs)
True
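For orientation, here is a compact sketch of the whole algorithm following the pseudocode in the Wikipedia reference above. It works on raw bytes with `struct` instead of the bit strings this module uses, so it is a different technique from `md5_me`, and the names `md5_sketch`, `SHIFTS`, and `CONSTANTS` are assumptions for the example.

```python
import math
import struct

MASK_32 = 0xFFFFFFFF
# Per-operation rotation amounts: four rounds of 16 operations each.
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# Additive constants derived from the sine function.
CONSTANTS = [int(abs(math.sin(i + 1)) * 2**32) & MASK_32 for i in range(64)]


def md5_sketch(message: bytes) -> bytes:
    """Return the 32-char hex MD5 digest of message as bytes (illustrative)."""
    # Padding: a 1 bit (0x80), zeros up to 56 mod 64 bytes, then the 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", (len(message) * 8) & 0xFFFFFFFFFFFFFFFF)

    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    for offset in range(0, len(padded), 64):
        # Sixteen little-endian 32-bit words per 512-bit block.
        words = struct.unpack("<16I", padded[offset : offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:
                f, g = (b & c) | (~b & d), i
            elif i < 32:
                f, g = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:
                f, g = c ^ (b | ~d), (7 * i) % 16
            f = (f + a + CONSTANTS[i] + words[g]) & MASK_32
            rotated = ((f << SHIFTS[i]) | (f >> (32 - SHIFTS[i]))) & MASK_32
            a, d, c = d, c, b
            b = (b + rotated) & MASK_32
        # Add this block's result into the running state, modulo 2**32.
        a0, b0 = (a0 + a) & MASK_32, (b0 + b) & MASK_32
        c0, d0 = (c0 + c) & MASK_32, (d0 + d) & MASK_32

    # The digest is the four state words written out in little-endian order.
    return struct.pack("<4I", a0, b0, c0, d0).hex().encode("ascii")
```

Like the doctests above, a sketch of this kind is easiest to validate by comparing it against `hashlib.md5` on a handful of messages.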

hashes.md5.not_32(i: int) → int

Perform bitwise NOT on given int.

Arguments:

i {[int]} – [given int]

Raises:

ValueError – [input is negative]

Returns:

Result of bitwise NOT on i

>>> not_32(34)
4294967261
>>> not_32(1234)
4294966061
>>> not_32(4294966061)
1234
>>> not_32(0)
4294967295
>>> not_32(1)
4294967294
>>> not_32(-1)
Traceback (most recent call last):
...
ValueError: Input must be non-negative
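Python integers are unbounded, so a plain `~i` gives a negative number. One way to get the 32-bit result shown in the doctests is to mask afterwards; this is an illustrative sketch, and the name `not_32_sketch` is an assumption.

```python
def not_32_sketch(i: int) -> int:
    """Return the bitwise NOT of i restricted to 32 bits (illustrative)."""
    if i < 0:
        raise ValueError("Input must be non-negative")
    # ~i equals -i - 1; masking keeps its low 32 bits as an unsigned value.
    return ~i & 0xFFFFFFFF
```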

hashes.md5.preprocess(message: bytes) → bytes

Preprocesses the message string:

  • Convert message to bit string

  • Pad bit string to a multiple of 512 chars:

      • Append a 1

      • Append 0’s until length = 448 (mod 512)

      • Append length of original message (64 chars)

Example: Suppose the input is the following:

message = “a”

The message bit string is “01100001”, which is 8 bits long. Thus, the bit string needs 439 bits of padding so that len(bit_string + “1” + padding) = 448 (mod 512). The message length is “000010000…0” in 64-bit little-endian binary. The combined bit string is then 512 bits long.

Arguments:

message {[bytes]} – [message string]

Returns:

processed bit string padded to a multiple of 512 chars

>>> preprocess(b"a") == (b"01100001" + b"1" +
...                     (b"0" * 439) + b"00001000" + (b"0" * 56))
True
>>> preprocess(b"") == b"1" + (b"0" * 447) + (b"0" * 64)
True
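The padding steps can be sketched as below. This is an illustration under the assumption that the 64-bit length is stored with its eight bytes reversed, which matches the doctests; the name `preprocess_sketch` is hypothetical.

```python
def preprocess_sketch(message: bytes) -> bytes:
    """Convert message to a padded bit string of b"0"/b"1" chars (illustrative)."""
    # One 8-char group per input byte.
    bit_string = "".join(format(byte, "08b") for byte in message)
    # Original length in bits as a 64-char big-endian string, for now.
    length_bits = format(len(bit_string) % 2**64, "064b")

    bit_string += "1"
    # Zero-pad until the length is 448 (mod 512).
    bit_string += "0" * ((448 - len(bit_string)) % 512)

    # Reverse the eight 8-char groups to store the length little-endian.
    little = "".join(length_bits[i : i + 8] for i in range(56, -1, -8))
    return (bit_string + little).encode("utf-8")
```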

hashes.md5.reformat_hex(i: int) → bytes

Converts the given non-negative integer to hex string.

Example: Suppose the input is the following:

i = 1234

The input is 0x000004d2 in hex, so the little-endian hex string is “d2040000”.

Arguments:

i {[int]} – [integer]

Raises:

ValueError – [input is negative]

Returns:

8-char little-endian hex string

>>> reformat_hex(1234)
b'd2040000'
>>> reformat_hex(666)
b'9a020000'
>>> reformat_hex(0)
b'00000000'
>>> reformat_hex(1234567890)
b'd2029649'
>>> reformat_hex(1234567890987654321)
b'b11c6cb1'
>>> reformat_hex(-1)
Traceback (most recent call last):
...
ValueError: Input must be non-negative
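One way to obtain the same little-endian hex strings is `int.to_bytes`. The sketch below is illustrative (the name `reformat_hex_sketch` is an assumption) and keeps only the low 32 bits, as the larger doctest inputs suggest.

```python
def reformat_hex_sketch(i: int) -> bytes:
    """Return the low 32 bits of i as an 8-char little-endian hex string (illustrative)."""
    if i < 0:
        raise ValueError("Input must be non-negative")
    # to_bytes(..., "little") emits the least significant byte first.
    return (i % 2**32).to_bytes(4, "little").hex().encode("utf-8")
```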

hashes.md5.sum_32(a: int, b: int) → int

Add two numbers as 32-bit ints.

Arguments:

a {[int]} – [first given int]

b {[int]} – [second given int]

Returns:

(a + b) as an unsigned 32-bit int

>>> sum_32(1, 1)
2
>>> sum_32(2, 3)
5
>>> sum_32(0, 0)
0
>>> sum_32(-1, -1)
4294967294
>>> sum_32(4294967295, 1)
0
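Addition as unsigned 32-bit ints amounts to reducing the sum modulo 2**32. A minimal sketch (the name `sum_32_sketch` is an assumption):

```python
def sum_32_sketch(a: int, b: int) -> int:
    """Return (a + b) wrapped to an unsigned 32-bit value (illustrative)."""
    # Python's % always returns a non-negative result for a positive modulus,
    # so negative sums wrap around as well.
    return (a + b) % 2**32
```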

hashes.md5.to_little_endian(string_32: bytes) → bytes

Converts the given string to little-endian in groups of 8 chars.

Arguments:

string_32 {[bytes]} – [32-char string]

Raises:

ValueError – [input is not 32 char]

Returns:

32-char little-endian string

>>> to_little_endian(b'1234567890abcdfghijklmnopqrstuvw')
b'pqrstuvwhijklmno90abcdfg12345678'
>>> to_little_endian(b'1234567890')
Traceback (most recent call last):
...
ValueError: Input must be of length 32
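The conversion reverses the order of the four 8-char groups while leaving each group's contents untouched. An illustrative sketch (the name `to_little_endian_sketch` is an assumption):

```python
def to_little_endian_sketch(string_32: bytes) -> bytes:
    """Reverse the four 8-char groups of a 32-char string (illustrative)."""
    if len(string_32) != 32:
        raise ValueError("Input must be of length 32")
    # Take the groups starting at offsets 24, 16, 8, 0, i.e. last group first.
    return b"".join(string_32[i : i + 8] for i in (24, 16, 8, 0))
```

When the input is a 32-char bit string, each 8-char group stands for one byte, so this is a byte-order swap of a 32-bit word.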