Introduction
It is my system design study and research. I like to go over this design called design typehead suggestion in next 2 - 3 hours. I like to take some notes as well.
Case study
I like to write down some thinking process to help myself to understand better about system design.Typeahead suggestion - what is meaning of typeahead suggestion? It is like giving out hint when you types the word. In other words, it tries to predict the query based on the characters the user has entered and gives a list of suggestions to complete the query.
Functional requirement: As the user types in their query, the service should suggest 10 terms starting with whatever the user has typed.
Non-function requirements: The suggestion should appear in real-time. Within 200ms.
Basic system design and algorithm
The service should suggest next terms that will match the given prefix.
Data structure - Trie
How to find top suggestion? given prefix, how can we find the top 10 terms for the given prefix?
One simple solution can be to store the count of searches that terminated at each node.
How much time will it take to traverse its sub-tree?
How to store top suggestions with each node?
Let me think about it, and also understand the ideas in the ...
We can store top 10 suggestions at each node that we can return to the user.
We can optimize our storage by storing only references of the terminal nodes rather than storing the entire phrase.
How would we build this trie?
Build our trie bottom up - post order traversal
Each parent node will recursively call all the child nodes to calculate their top suggestions and their counts. Parent node will combine top suggestions from all of their children to determine their top suggestions.
How to update the trie?
We can have a Map-Reduce (MR) set-up to process all the logging data periodically say every hour.
No comments:
Post a Comment