C# Trie design challenge and DFS algorithm practice in June 2018
June 16, 2020
It is challenge task to write the depth first search algorithm using Trie data structure. It took me several hours to make it work. I failed three test cases, and then I like to write down my lessons here.
First it is the design of depth first search. The checking of our-of-index-range of the matrix, visited node, child node is not in Trie data structure object are those three important checkings for a complete test.
Second one is to understand Trie data structure. Trie is a recursive data structure, so it is important to identify current node matrix[row][col] with function argument TrieNode parentNode relationship. The design is to pass parentNode which is not null, so the child node may not be in Trie data structure object.
Third one is to write TrieNode Build function to construct a Trie tree object. It is not easy task, so please work on "a", one char word, and it should be two level, parent node is dummy head, and dummy head's child is 'a' instead.
What else can I write down to make this TrieNode class crafting more error-prone?
- I change for(int i = 0; i < words.Length; i++) to foreach loop to make code clean, no need to use i to do any checking.
- Dummy head is important, and also understand that the call is passed using parent node;
- Statement: public TrieNode[] nodes = new TrieNode[26]; each nodes[i] is still null pointer.
- Remove "iterate.word = word;" outside statement "foreach (char c in word)" in TrieNode class API defintion.
- GetByIndex API of TrieNode should check pointer null first.
Template of DFS
- Design a DFS function to search a matrix, please understand DFS and also backtracking.
- Function name runDFS is good enough, TrieNode parentNode argument should be clearly defined, since current char may not be in Trie object.
- Put all edge cases in one statement, do not define explanation variable "var visit = board[row][col]" before checking index range and also board's content has not been visited yet.
- One word in dictionary can be a prefix in another word in dictionary, so it should continue to search. Termination should not put after the word is added to found string.
- Think about time complexity using Trie compared to HashSet, and practice more often Trie to solve problems.
- Discuss termination of DFS with interviewer or think about more test cases. It is safe to continue to search if a word is found, let it fall through the checking of index-out-range or char is not in the tree.
Template of TrieNode definition of Build
- Trie is a recursive data structure, tree data structure;
- Trie should start from a dummy hand for easy reference;
- Child node is one of element of TrieNode[26] array
- Child node is still null pointer even TrieNode[26] is called using statement " ... = new TrieNode();"
- Clean up TrieNode Build API
Special case: binary tree
To expedite the process to craft a TrieNode class Build API, treat special case a binary tree and see how to make it work; left and right child map to index 0 and 1. If left child exist, then do nothing except going to left child for next iteration. If left child does not exist, then create one first. Go over each word in dictionary, build a binary tree to include all words, assuming that all chars are 'a' or 'b' in those words. Try to go over the following test case 3, build a trie by hand first, and then talk about how to write a trie version with size of 26 array.
For each word in dictionary, go through char by char, and then start from dummy head of Trie object, and then if child node is created then continue, otherwise create the tree object.
Test case Trie
I failed the test case 3, so I like to share this test case.
One word in dictionary can be a prefix in another word in dictionary, so it should continue to search. Depth first search algorithm's termination should not put after the word is added to found string.
In the above test case, "aaa" is the prefix of "aaaa".
Practice talk
I could not perform very well on Amazon code screen this March. The reason is that debugger is disabled, and my design is not relax to allow mistake happen. I write something to remind myself, focus on design when debugging, think about modifying design to make it more robust.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace _212_word_search_warmup_June_16_20
{
class Program
{
static void Main(string[] args)
{
//RunTestcase1();
//RunTestcase2();
RunTestcase3();
}
public static void RunTestcase1()
{
var board = new char[4][];
board[0] = "oaan".ToCharArray();
board[1] = "etae".ToCharArray();
board[2] = "ihkr".ToCharArray();
board[3] = "iflv".ToCharArray();
var found = FindWords(board, new string[] { "oath", "pea", "eat", "rain" });
}
public static void RunTestcase2()
{
var board = new char[1][];
board[0] = "a".ToCharArray();
var found = FindWords(board, new string[] { "b"});
}
public static void RunTestcase3()
{
//[["a","b"],["a","a"]]
// ["aba","baa","bab","aaab","aaa","aaaa","aaba"]
var board = new char[2][];
board[0] = "ab".ToCharArray();
board[1] = "aa".ToCharArray();
var found = FindWords(board, new string[] { "aba", "baa", "bab", "aaab", "aaa", "aaaa", "aaba" });
// ["aba","aaa","baa","aaba"]
// ["aaa","aaab","aaba","aba","baa"] expected - mine is missing: "aaab"
}
/// <summary>
/// design Trie class
/// </summary>
class TrieNode
{
public TrieNode[] nodes = new TrieNode[26];
private string word;
/// <summary>
/// Design Trie class
/// Please use the following test cases:
/// ["a"]
/// </summary>
/// <param name="words"></param>
/// <returns></returns>
public TrieNode Build(string[] words)
{
if (words == null)
{
return new TrieNode();
}
var original = new TrieNode(); // dummy head
var length = words.Length; // ["oath",""]
foreach (var word in words)
{
var iterate = original; // start from root char again
foreach (char c in word)
{
int index = c - 'a';
if (iterate.nodes[index] == null) // caught by online judge
{
iterate.nodes[index] = new TrieNode();
}
// go to next trie node
iterate = iterate.nodes[index];
}
iterate.word = word;
}
return original;
}
public bool IsWord()
{
return word != null;
}
public string GetWord()
{
return word;
}
public bool IsUnitialized()
{
return nodes == null;
}
public void Initialize()
{
nodes = new TrieNode[26];
}
public TrieNode GetByIndex(int index)
{
if (nodes[index] == null)
return null;
return nodes[index];
}
}
/// <summary>
/// Code review on June 16, 2020
/// warmup practice
/// 1. Design backtracking, walk through the example "oath" to explain to myself
/// how 'o','a','t','h' is marked as '#', after the word "oath" is added to the hashset,
/// all matrix's positions have original chars back.
/// 2. I did not fully understand how to apply backtracking back in June 2019, so I wrote my experience
/// on Lowest common ancestor showcase on Leetocode discuss
/// 3. This is perfect time for me to master the backtracking in DFS.
/// Facts:
/// Sometimes one word is another word's prefix in dictionary.
/// So do not stop search if the word is added to found hashset.
/// </summary>
/// <param name="board"></param>
/// <param name="words"></param>
/// <returns></returns>
public static IList<string> FindWords(char[][] board, string[] words)
{
if (board == null || words == null)
{
return new List<string>();
}
// build a trie
var trie = new TrieNode();
trie = trie.Build(words); // caught by debugger
var rows = board.Length;
var columns = board[0].Length;
var set = new HashSet<string>();
for (int row = 0; row < rows; row++)
{
for (int col = 0; col < columns; col++)
{
runDFS(board, row, col, trie, set);
}
}
return set.ToList();
}
/// <summary>
/// Design requirement:
/// TrieNode
/// Make sure that parentNode is not null all the time
/// current node is one of parentNode's child
/// </summary>
/// <param name="board"></param>
/// <param name="words"></param>
/// <param name="found"></param>
private static void runDFS(char[][] board, int row, int col, TrieNode parentNode, HashSet<string> found)
{
var rows = board.Length;
var columns = board[0].Length;
var visitedChar = '#';
// out of range or visited already
// or current node trie[visit - 'a'] is not in Trie
if (row < 0 || row >= rows || col < 0 || col >= columns ||
board[row][col] == visitedChar ||
parentNode.GetByIndex(board[row][col]-'a') == null
)
{
return;
}
var visit = board[row][col];
var child = parentNode.GetByIndex(board[row][col] - 'a');
// work on trie
if (child.IsWord())
{
found.Add(child.GetWord());
//return; // June 16, 2020 case 3: sometimes a word is another word's prefix in dictionary. So continue to search.
}
// Design DFS with backtracking
var current = board[row][col];
// mark visited
board[row][col] = visitedChar;
// Four directions - clockwise, top, right, down, left
runDFS(board, row - 1, col, child, found);
runDFS(board, row, col + 1, child, found);
runDFS(board, row + 1, col, child, found);
runDFS(board, row, col - 1, child, found);
//backtracking
board[row][col] = current;
}
}
}