Monday, July 24, 2017

Boggle - find words in matrix of letters

July 24, 2017

Plan to study the algorithm Boggle coding blog. The blog link is here written by Aviad Ezra.

There are a lot of thing for Julia to learn through those blog and code sample. Let us read words from Aviad Ezra together in the following:

Learn: Cracking the interview, solve all 189 questions.

Practice: Nothing beats mock interviews. It will boost your confidence and you’ll learn a ton from having someone watching you and listening to your explanations while you solve coding problems. You can pair with a friend or use one of the free peer-to-peer mock interviewing platforms. You don’t need to sacrifice your first interviews just to get hands-on practice.

Think fast: Developers that can think on their feet do much better in coding interviews. Practice in as many competitive programming sessions as you can.

Before you lost hope, keep in mind that there's a finite number of algorithms topics and only a handful of data structures that are used in the interviews.

Those words from  Aviad Ezra are so encouraging, Julia wrote her own blog about "Can mocking make difference?". First time she learns to work with people, share with people, learn from people on small things, like presenting her face in the centre through the video, ask good questions, give good hint, form a quick friendship etc.

In the following, Julia uses Leetcode 212 to help herself review Trie data structure and learn how to apply same algorithm as Boggle classified in Leetcode.



Leetcode 212 - word search II


Review Leetcode 212 last practice using Trie. The link is here.

It is so surprising that learning Trie and practice depth first search is most fun activities in the summer of 2017. Julia has to warm up the algorithm, play with one test case with Trie class, and also make the code more readable. Here is the warmup C# code using Trie data structure, recursive function calls. The solution is optimal and also pass online judge.

It is also a good idea to discuss why Trie is needed for better time complexity.

Brute force solution is to go over each word in the dictionary, and then try to find it in the matrix. The total number of search using DFS is m (words) * rows (matrix) * columns (matrix).

The brute force solution has timeout issue through Leetcode online judge. We like to review why the time complexity should be better.

Let us go over dictionary with 4 words, "aaa", "aaaa","aaab", "aaac". If we pre-process the dictionary and store all words in a Trie, and then we do not have to go over each words to search matrix. We only need to go over each element in matrix as start char in a word, search the trie to find match words using recursive function. The number of search using DFS is rows (matrix) * columns (matrix).

Second advantage related to the above 4 words ( "aaa", "aaaa","aaab", "aaac") is taking advantage of Trie data structure. The space complexity of Trie is better compared to hashset or hashtable. The same prefix "aaa" is only repeated once in the Trie.

This hard level algorithm applying Trie will be Julia's most favorite algorithm in the month of July 2017.

One of practice in 2017 is prefix neighbor on hackerrank, Julia asked the code review on stackexchange.com.


July 26, 2017

Trie against Hashset


Related to the test with a dictionary "aaa","aaaa","aaab","aaac","aaad", let us work together to talk about the difference.

For example, if the dictionary is saved in hashset, then go over each word in the dictionary, try to find word in matrix.
For example, "aaab", the first three letters has to be compared and they are the same first, then the last letter will be checked. Same will be applied to another 4 words. In total, the prefix "aaa" will be compared exactly 5 times in order to find those 5 words.

Can we do better? Just compare the prefix "aaa" once? Cetainly we can. We can save the words in a trie instead of hashset.

a
|
a
|
a
|\  \  \
| \  \  \
|  \  \  \
a  b  c  d



Trie efficiency talk 



It takes some time to get comfortable to design a Trie. So far, Julia has written a Trie implementation more than 4 times in C# last 3 years. Given the fact that she has worked on computer science more than 20 years, if she writes a algorithm every day, then she will write 60,000 algorithms. It is smart to focus on one simple thing, write code for algorithm and data structure, no matter life goes up and down, make it a hobby.

How to design a Trie so that the space complexity is minimum?
For example, the dictionary has the following words, "aaa","aaaa","aaaaa","aaaaaa".
How to store them in Trie efficiently?
a
|
a
|
a   word: "aaa"
|
a   word: "aaaa"
|
a   word: "aaaaa"
|
a   word: "aaaaaa"

The above diagram shows that those four words are saved in a Trie. How many char 'a' are saved in Trie, only 6, not 3 + 4 + 5 + 6 = 18 chars. Four words are saved along the trie nodes.

Every node can represent a word, it does not have to be a leaf node.

It is a smart decision to write down some test cases, so next time when Julia comes cross an algorithm, she can quickly relate to the test cases and figure out the advantage of using Trie.

No comments:

Post a Comment