Saturday, March 5, 2016

HackerRank: Bear and Steady Gene algorithm (III)

March 5, 2016

Problem statement: 


A gene is represented as a string of length  (where  is divisible by ), composed of the letters , and . It is considered to be steady if each of the four letters occurs exactly  times. For example,  and are both steady genes.


Julia's 1st practice:

Code is here

Score 15 out of 50, there are 2 run time error, failed a few of test cases.

The algorithm ends up in time complexity O(n2), close to brute force solution.

Code study 

C# code implementation to study:

Study code is here

Work on two pointers, sliding window, so time complexity is O(n).

Let us go over two pointers algorithm here using example:

Test case string "GAAATAAA", gene string "ACTG". 
Go over C# program a few lines of statements here:

Line 40
var a = ReadToken().Select(ch=>"ACTG".IndexOf(ch)).ToArray();

so the array a = [3, 0, 0, 0, 2, 0, 0, 0] is representing the sample string "GAAATAAA", since 'G' is the index of 3 in the string "ACTG", 'A' is 0, 'C' is 1, 'T' is 2. 

Line 44 - 45:
for(int i = 0; i < 4; i++)
   target[i] = Math.Max(0, a.Count(aa => aa == i) - n / 4); 

So the Target[] = [4, 0, 0, 0]. In other words, 4 of A is needed, but C, T, G do not matter.

Line 47 - 51:
if (target.All(t => t == 0))

If all element in Target is 0, then return 0. Learn to use Array.All method. 

Left, right two pointer (l, r), starting from 0

Declare a sum array

Start a loop, loop on r
Add a[r]’s char into sum array,

r ++, r moves to next one

Inside the first loop, another loop for r, continue to move forward

if sums do not reach target 4 of them >= target
The test case "GAAATAAA", find substring: "GAAATA". 

Inside the first loop, another loop for left pointer – l

condition: l  <  r

If sums array is bigger than target array, then remove left pointer char count,

then move left pointer to next one, then the substring is reaching the target,


and compare to existing smallest length

"AAATA" is the smallest one.

This algorithm is much easy to follow than the one in the blog - bear and steady gene.   


C# code
 implementation to study, code is here

No comments:

Post a Comment