March 5, 2016
Problem
statement:
A gene is
represented as a string of length (where is
divisible by ),
composed of the letters , , , and . It is
considered to be steady if each of the four letters occurs exactly times.
For example, and are
both steady genes.
Practice
Julia's 1st practice:
Code is here.
Score 15
out of 50, there are 2 run time error, failed a few of test cases.
The
algorithm ends up in time complexity O(n2), close to brute force solution.
Code study
Work on
two pointers, sliding window, so time complexity is O(n).
Let us go
over two pointers algorithm here using example:
Test case string "GAAATAAA", gene string
"ACTG".
Go over C# program a few lines of statements here:
Line 40:
var a = ReadToken().Select(ch=>"ACTG".IndexOf(ch)).ToArray();
Go over C# program a few lines of statements here:
Line 40:
var a = ReadToken().Select(ch=>"ACTG".IndexOf(ch)).ToArray();
so the array a =
[3, 0, 0, 0, 2, 0, 0, 0] is representing the sample string "GAAATAAA", since 'G' is the index of 3 in the string "ACTG", 'A' is 0, 'C' is 1, 'T' is 2.
Line 44 - 45:
for(int i = 0; i < 4; i++)
target[i] = Math.Max(0, a.Count(aa => aa == i) - n / 4);
So the Target[] = [4, 0, 0, 0]. In other words, 4 of A is needed, but C, T, G do not
matter.
Line 47 - 51:
if (target.All(t => t == 0))
{
Write(0);
return;
}
If all
element in Target is 0, then return 0. Learn to use Array.All method.
Left, right two pointer (l, r), starting from 0
Declare a sum array
Start a loop, loop on r
Add
a[r]’s char into sum array,
r ++, r
moves to next one
Inside
the first loop, another loop for r, continue to move forward
if sums do
not reach target 4 of them >= target
The test case "GAAATAAA", find substring: "GAAATA".
Inside the first loop, another loop for left pointer – l
condition:
l < r
If sums
array is bigger than target array, then remove left pointer char count,
then move
left pointer to next one, then the substring is reaching the target,
"GAAATA" => "AAATA"
and compare to existing smallest length
"AAATA" is
the smallest one.
This algorithm is much easy to follow than the one in the blog - bear and steady gene.
Reference:
C# code implementation to study, code is here.
No comments:
Post a Comment