Find all subset of string in dictionary – code

Find all subset of string in dictionary is to find the subset of an input string that exists in dictionary. The dictionary contains one million words. For this question, the trick is to find subset, not substring. What’s the difference between substring and subset? The substring is contiguous sequence of characters in the string. While in subset, characters are not ordered nor continuous.

The intuition is to get the combinations of the characters in the input string. Then check whether they exist in the dictionary. The time complexity of combinations is O(n!). If the string has 30 characters. The Time complexity is e32.

How can we do better? One hint from interviewer is to loop through each word in dictionary. Because one million is better than combinations. Another technique we can apply is using sorting (not given by interviewer)! Sorting is very efficient in solving string problems. “Find first non repeated character”, “Group Anagrams” are the examples.

First we sort the characters in the input string. Then we loop through each words in dictionary. We sort the word, and check whether the sorted word is substring in the sorted string. Now let’s take a look at time complexity of this solution. Sort string of 30 characters is O(s^2), which is 900. It can be ignored compared to a million. O(e6) is still much better than O(e32)!

Facebook Interview Question:
Find all words [A-Z] in a dictionary (about 1M words) that are made of a subset (in any order) of the chars in the input parameter [A-Z].
ex: input “ACRAT” (10 to 20 chars, up to 30 worst case)
matching words: “A”, “CAR”, “ACA”, “CART”, “RAC”
non-matching words: “BAR”, “AAA”
The dictionary doesn’t change but the function will be called extensively.

Java Code:

Output:
[A, CAR, RAC, ACA, CART]

O Notation:
Time complexity: O(d)
Space complexity: O(d)
d is dictionary length (1 million)


Download FindAllSubsetWords.java
Java Coding Interview Pocket Book (2nd Edition)

Comments are closed