Find all subset of string in dictionary – code

Find all subset of string in dictionary is to find the subset of an input string that exists in dictionary. The dictionary contains one million words. For this question, the trick is to find subset, not substring. The difference between substring and subset is: The substring is contiguous sequence of characters in the string. While in subset, characters are not ordered nor continuous.

The intuition is to get the combinations of the characters in the input string. Then check whether they exist in the dictionary. The time complexity of combinations is O(n!). If the string has 30 characters. The Time complexity is O(e^32). To improve, a hint (from interviewer) is to loop through each word in dictionary. Because one million is better than combinations.

To solve the problem, sorting plays the key role. We sort the characters in the string, and characters of the words in the dictionary. If the substring of the word exists in the sorted input string, that means the subset exists in the dictionary. The time complexity of the solution is O(d), d is the number of words in dictionary, so it is O(e^6). Sorting string of 30 characters is O(slogn) ~ O(s^2) (depend what methods you use do sort ). It is small compared to a million. It can be ignored. The time complexity O(e^6) is much better than O(e^32)!

Facebook Interview Question:
Find all words [A-Z] in a dictionary (about 1M words) that are made of a subset (in any order) of the chars in the input parameter [A-Z].
ex: input “ACRAT” (10 to 20 chars, up to 30 worst case)
matching words: “A”, “CAR”, “ACA”, “CART”, “RAC”
non-matching words: “BAR”, “AAA”
The dictionary doesn’t change but the function will be called extensively.

Java

JavaScript

Python

Output:
[A, CAR, RAC, ACA, CART]

O Notation:
Time complexity: O(d)
Space complexity: O(d)
d is dictionary length (1 million)


Download FindAllSubsetWords.java
Download FindAllSubsetWords.js
Download FindAllSubsetWords.py

Comments are closed