Wednesday, April 27, 2016

HackerRank: Bear Steady Gene (II) - Better Code

March 4, 2016


  Problem statement:

https://www.hackerrank.com/contests/hourrank-6/challenges/bear-and-steady-gene  

A gene is represented as a string of length  (where  is divisible by ), composed of the letters , and . It is considered to be steady if each of the four letters occurs exactly  times. For example,  and are both steady genes.

Julia's 1st practice:

https://gist.github.com/jianminchen/80723bae951328a690bb

score 15 out of 50, there are 2 run time error, failed a few of test cases.

The algorithm ends up in time complexity O(n^2), close to brute force solution - O(n^2)

C# code implementation to study:

Readable code, with some analysis.

https://gist.github.com/jianminchen/fae9142eff9a6f9643fc

Work on two pointers, sliding window, so time complexity is O(2n) = O(n).

Let us go over two pointers algorithm here using example:
GAAATAAA,
A - count of A, denoted as cA = 6, n/4 = 2, so we have to change 4 of A to other thing.
A - 6 -2  = 4 , 4 of change
C - 0 - 2 = -2
T - 1 -2  = -1
G - 1-2  = -1
So, at least minimum is 4, but a substring containing 4 of A, shortest one is AAAA. But "AAAA" is not a substring of "GAAATAAA"

G  A A A T A A A
0
start
end
Let us find the substring starting from 0, but will include all 4 of A, and then, rest of string will not include any char of "ACGT" more than n/4.
GAAATA, string length is 6.
start - 0, index of 0
end - A, index of 5

continue to move start to next one, 1, then, G is moved out from substring, adjust count of chars.
AAATA can be the substring, so the length is min (6, 5) = 5. Do not need to move end pointer.

next step, start = 2, missing one of A, and then, end has to move to next one until the string fits into requirement.

Every thing should be covered in 20 minutes, problem reading, the design of algorithm, the coding.

So, Julia had second practice, and the code scores 50 out of 50 this time.

https://gist.github.com/jianminchen/61dfe437f82edb9793fc

April 27, 2016
Change C# program to make variables more meaningful, matching the design:
code -> GENES <- global string array
function name matches design of two pointers moving.

Improvement 1: 
Third practice: 
https://gist.github.com/jianminchen/b8263048c297473319c23836e9468c14

Improvement 2: 
Fourth practice:
searchStrArray -> searchStrNumbers, more meaningful.
https://gist.github.com/jianminchen/b23c4f606a101b9aeec71eff3268db32

Comment: (April 27, 2016)
Spend some time to read "clean code", get some ideas to write better code. 


No comments:

Post a Comment