April 28, 2016
Previous blog on binary search on Bear and Gene algorithm:
Come back to the problem on binary search solution, figure out the design:
Problem statement:
https://www.hackerrank.com/contests/hourrank-6/challenges/bear-and-steady-gene
A gene is represented as a string of length (where is divisible by ), composed of the letters , , , and . It is considered to be steady if each of the four letters occurs exactly times. For example, and are both steady genes.
Study code using binary search:
https://gist.github.com/jianminchen/d01faa03ca9b06696db3
https://gist.github.com/jianminchen/395eb9e76fe19cc9338f
Comment:
Binary search is better than linear search using two pointers.
Brute force O(n^2) -> linear search O(n) -> Binary search O(logn) <- Not True!
Java code:
https://gist.github.com/jianminchen/d01faa03ca9b06696db3
C# code
https://gist.github.com/jianminchen/76dffc51880a80279f25
First, go through two test cases: "GTTCCAAA" and "GAAATTCC"
1. "GTTCCAAA",
Binary search is doing this way:
Biggest value of search string length is 8.
First, divide range from 0 to 8 into half, 4
Find first string with length 4,
one is "GTTC", one is "CAAA",
and then, remove count of first half, rest string (two parts, seperated) "CAAA", since A is repeated 3 times, not in the range;
instead of stopping, wrongly conclude that 4 is not high value, go through all the possible substring with length 4, by sliding window of search string: "GTTC" forward from left to right, but keep the same window size
"GTTC" -> "TTCC"->"TCCA"->stop here, since "TCCA" is removed from counting of GENES="ACGT", fit into the requirement.
Next, low=0, high=4, mid = 2,
Because both test cases with one 'A' to replace, Julia figured out through the debugging:
The slide window of fixed length technique,
How to slide?
Which direction to slide?
Kind of clever in design.
Time complexity analysis: length search using binary search, n - length of string, O(log n) times; each search for length m, go over each string once, since using calculated counting array, only do add one/ remove one at a time, one char only does the work once. So, it is O(n) on this.
Total time complexity: O(n logn)
Conclusion: this binary search is not better than linear search is previous blog (IV).
No comments:
Post a Comment