Huffman coding and compression – Code

What is Huffman code? Huffman code is a particular type of optimal prefix code. It is commonly used for lossless data compression. The explanation of Huffman coding and compression can be found in wiki.

How to implement Huffman coding and compression? We use frequency-sorted binary tree to encode symbols. Here are the steps:
1. Build a map of (character, frequency) pairs from the input String
2. Build PriorityQueue sorted by the frequency
3. Build a tree based on the sequence of elements in Priority Queue
4. create Huffman code map from the tree.
5. Apply Huffman code to compress message

Amazon Interview Question (CareerCup):
Please note the original question statement is not clear. The question has been modified to make sense here.
Question 1: Implement the Huffman coding as shown in the example below:
1) Given a string AAAAAABBCCDDEEFFFFF, group them according to the number of occurrences: A => 6, B => 2, C => 2, D => 2, E => 2, F => 5
2) Put the string in a tree-like structure:

3. Output the Huffman code.
Java Code:

Output:
{A=11, B=000, C=011, D=001, E=010, F=10}

O Notation:
Time complexity: O(n)
Space complexity: O(n)

Question 2: Implement the Huffman compression algorithm.

Java code:

Output:
{A=10, B=1111, C=1110, D=00, E=110, _=01}
1001001101000010010000111101100011000011001111110000111111011111100110011111110100011000011011111011101001111111000

O Notation:
Time complexity: O(n)
Space complexity: O(n)

Action Items:
Download HuffmanCoding.zip
Merkle-Hellman Knapsack Cryptosystem at github
The complete list of coding interview questions

Comments are closed