Saturday, April 30, 2016

coursera course: Algorithm

April 30, 2016

Come cross this course, and then, Julia likes to use it more often in the study of algorithm:

https://class.coursera.org/algo-003/lecture

Take notes the videos watched:

April 30, 2016 20 minutes:
https://class.coursera.org/algo-003/lecture/52

May 1, 2016 1 -2 hours
graph

Leetcode 210: course schedule

April 30, 2016

Read the problem and analysis, read python code 10 - 20 minutes.

http://www.tangjikai.com/algorithms/leetcode-207-208-course-schedule-i-ii

https://github.com/jianminchen/LeetCode-Java-Solutions/blob/master/210.course-schedule-ii.java

Understand the problem and solution quickly through this blog: (10 minutes to read)

http://www.voidcn.com/blog/qq508618087/article/p-5039267.html

Another 10 minutes to read this blog:

http://www.cnblogs.com/grandyang/p/4504793.html

Spent 20 minutes to watch the video:
https://class.coursera.org/algo-003/lecture

Take some notes:
1. Sink Vertex - concept
2. To compute topological ordering:
- Let v be a sink vertex of G
- set f(v) = n
- recurse on G-{v}

why does it work? when v is assigned to position i

Topological Sort vis DFS
DFS-Loop (graph G)
- mark all nodes unexplored
- current_label = n [to keep track of ordering]
- for each vertex v in set G:
- if v not yet explored [ in some previous DFS call]
- DFS( G, v)

DFS( graph G, start vertex s)
- mark s explored
- for every edge (s, v):
- if v not yet explored
- DFS (G, v)
- Set f(s) = current_label
- current_label --

Read an article: (1:04pm - 1:14pm)
http://www.geeksforgeeks.org/topological-sorting/

Fun part later:

play with visual presentation through the link:
https://www.cs.usfca.edu/~galles/visualization/TopoSortDFS.html

C# code for Leetcode 210:

https://gist.github.com/jianminchen/5873fc014e806d626807917959e05ea2

update C# code to add the example in the comment, and then, update code to match the example:
https://gist.github.com/jianminchen/85478d1bed0faf4410b76a7af3a48fc3

What do you learn today? And what is idea to solve this course schedule problem?

1. First, Julia learned how to do the topological sort, here is an example with its diagram:

The image should be the following: directed edge 0->1, which means that 0 is prerequisite course.
Double check with geekforgeeks article:

Confused since it is the first time to draw the diagram.

So, always start from nodes with 0 indegree (node 0, node 4).
So, the ordering can be 0, 1, 2, 3, 4; or 4, 0, 1, 2, 3

2. So, Julia also practices to do some counting for vertex.
vertex 0: indegree 0, but dependency list: 1, 2, 3
vertex 1: indegree 1, but dependency list: empty set
...
So, Julia learns to manage the graph using indegree array for the graph, and dependency list for each vertex.

3. Julia also learns the easy way to do toplogical sorting using the above graph, and also make the code easy to read, once you remember the graph, you can read the code in 5 minutes.

So, I updated the version of C# code to match this test case:

add explanation variable - line 51:
int[] tmpDep = new int { prerequisites[i] , prerequisites[i] };
tmpDep - 0 is the index, similar to the above diagram node 0, add dependency list: 1, 2, 3
tmpDep - 1 is the index, similar to the above diagram node 1.

4.  Now, understanding one example. Ready to talk about idea of problem solving.

The idea to solve this course schedule problem is first for each vertex to get indegree value and also dependency list.

Do not get confused with dependency.

In above diagram, the directed graph,  course 1 should take after course 0, so course 1 is depending on course 0; course 1 has indegree 1 related from course 0. course 0 has no indegree.  If course 2, 3 also has to take after course 0, so course 0 has dependency list: 1, 2, 3. If course 0 is dequeue, then, course 0's dependency list: 1, 2, 3, 3 courses' indegree value has to be decremented by one.

Secondly, to find all nodes in the graph with indegree's value 0; put them in the queue {4, 0}, and then, dequeue one by one, add to the result list; and then, each node in the dependency list will decrease the indegree value by 1; and then, if any node is found with 0 indegree value, then, it will be added to the queue as well.

In other words, here are steps:
1. Add nodes with indegree value 0 to the queue;
2. if queue is not empty, dequeue one in the front,
check dependency list one by one,
decrement one on indegree value,
if the node's indegree value is 0, then add to the queue

statistics:
Time spent: 5+ hours

April 30, 2016

Introduction

It is time for me to practice the course schedule algorithm. The algorithm is using topological sort. I like to go over two blogs and then apply those two blogs's learning, write my own practice.

Blog study

I plan to study the code later, here is the Java code's link. I like to read the blog about this problem. And also I like to write down some study notes here. My C# code is here.

Leetcode 200: Number of Island

April 29, 2016

problem statement:

http://blog.csdn.net/ljiabin/article/details/44975717   (DFS, recurisve, the code is very readable)

https://github.com/jianminchen/LeetCode-Java-Solutions/blob/master/200.number-of-islands.java

Thursday, April 28, 2016

What are the best programming blogs?

April 28, 2016

Julia likes to do some research on best programming blogs, so she starts to read the article:

https://www.quora.com/What-are-the-best-programming-blogs

She may find something to share - blogs to read.

http://www.hanselman.com/blog/GreatestHits.aspx

Look into the blog later:

http://sportprogramming.blogspot.ca/2014/07/getting-started-with-sport-of.html

April 28, 2016

One of Julia's new favorite is to read JavaScript coding submissions on HackerRank, she loves to read how people solve problem using JavaScript. How creative it is, help her expand her thinking power in problem solving, also help her to expedite on JavaScript learning.

You never know, sometimes, best players write code using JavaScript on HackerRank only; if you do not read it, you miss great chance to learn a few things in problem solving.

However, back in 2014, 2015, Julia learned JavaScript by book reading.

Julia spent over 40 hours to read "Testable JavaScript" in 2015.

She also likes to blog her favorite book: "Testable JavaScript" here. She likes a few things about unit test, how to set up test case through book reading.

Will come back.

Come back to the problem on binary search solution, figure out the design:

Problem statement:

A gene is represented as a string of length  (where  is divisible by ), composed of the letters , and . It is considered to be steady if each of the four letters occurs exactly  times. For example,  and are both steady genes.

Study code using binary search:

https://gist.github.com/jianminchen/d01faa03ca9b06696db3

https://gist.github.com/jianminchen/395eb9e76fe19cc9338f

Comment:
Binary search is better than linear search using two pointers.

Brute force O(n^2) -> linear search O(n) -> Binary search O(logn) <- Not True!

Java code:
https://gist.github.com/jianminchen/d01faa03ca9b06696db3
C# code
https://gist.github.com/jianminchen/76dffc51880a80279f25

First, go through two test cases: "GTTCCAAA" and "GAAATTCC"

1. "GTTCCAAA",

Binary search is doing this way:
Biggest value of search string length is 8.
First, divide range from 0 to 8 into half, 4
Find first string with length 4,
one is "GTTC", one is "CAAA",
and then, remove count of first half, rest string (two parts, seperated) "CAAA", since A is repeated 3 times, not in the range;
instead of stopping, wrongly conclude that 4 is not high value, go through all the possible substring with length 4, by sliding window of search string: "GTTC" forward from left to right, but keep the same window size
"GTTC" -> "TTCC"->"TCCA"->stop here, since "TCCA" is removed from counting of GENES="ACGT", fit into the requirement.

Next, low=0, high=4, mid = 2,

Because both test cases with one 'A' to replace, Julia figured out through the debugging:

The slide window of fixed length technique,
How to slide?
Which direction to slide?

Kind of clever in design.
Time complexity analysis: length search using binary search, n - length of string, O(log n) times; each search for length m, go over each string once, since using calculated counting array, only do add one/ remove one at a time, one char only does the work once. So, it is O(n) on this.

Total time complexity: O(n logn)
Conclusion: this binary search is not better than linear search is previous blog (IV).

Wednesday, April 27, 2016

The art of readable code - book review second time

April 27, 2016

Julia spent one month to go over the book, use it to do some code review in 2014. So, write down things learned from her favorite book:

chapter 5: Knowing what to comment
chapter 6: Making comments precise and compact
chapter 7: Make control flow easy to read <- 9 out of 10
chapter 8:  Break Down giant expression  <- 10 out of 10
chapter 10: Extracting unrelated subproblems <- Favorite (8/ 1-10)
chapter 11: One task a time <- Should follow the idea more closely

Later, compare to "Clean Code" book, write a few words to share.

http://a-nickels-worth.blogspot.ca/2016/04/a-guide-to-naming-variables.html

April 27, 2016

Julia was recommended to read this book "Clean Code". So, plan to read the book in next 2 weeks.

Put some notes together on this blog.

http://goo.gl/hHBZPS

Short slides about the book: clean code

Nov. 15, 2016
Spent 1+ hour to read the chapter 3:
Chapter 3 - Functions
small
do one thing
one level of Abstraction per function

Julia likes to google and work on those 3 key concepts.

Keyword:
one level of Abstraction per function

http://principles-wiki.net/principles:single_level_of_abstraction

March 4, 2016

Problem statement:

A gene is represented as a string of length  (where  is divisible by ), composed of the letters , and . It is considered to be steady if each of the four letters occurs exactly  times. For example,  and are both steady genes.

Julia's 1st practice:

https://gist.github.com/jianminchen/80723bae951328a690bb

score 15 out of 50, there are 2 run time error, failed a few of test cases.

The algorithm ends up in time complexity O(n^2), close to brute force solution - O(n^2)

C# code implementation to study:

https://gist.github.com/jianminchen/fae9142eff9a6f9643fc

Work on two pointers, sliding window, so time complexity is O(2n) = O(n).

Let us go over two pointers algorithm here using example:
GAAATAAA,
A - count of A, denoted as cA = 6, n/4 = 2, so we have to change 4 of A to other thing.
A - 6 -2  = 4 , 4 of change
C - 0 - 2 = -2
T - 1 -2  = -1
G - 1-2  = -1
So, at least minimum is 4, but a substring containing 4 of A, shortest one is AAAA. But "AAAA" is not a substring of "GAAATAAA"

G  A A A T A A A
0
start
end
Let us find the substring starting from 0, but will include all 4 of A, and then, rest of string will not include any char of "ACGT" more than n/4.
GAAATA, string length is 6.
start - 0, index of 0
end - A, index of 5

continue to move start to next one, 1, then, G is moved out from substring, adjust count of chars.
AAATA can be the substring, so the length is min (6, 5) = 5. Do not need to move end pointer.

next step, start = 2, missing one of A, and then, end has to move to next one until the string fits into requirement.

Every thing should be covered in 20 minutes, problem reading, the design of algorithm, the coding.

So, Julia had second practice, and the code scores 50 out of 50 this time.

https://gist.github.com/jianminchen/61dfe437f82edb9793fc

April 27, 2016
Change C# program to make variables more meaningful, matching the design:
code -> GENES <- global string array
function name matches design of two pointers moving.

Improvement 1:
Third practice:
https://gist.github.com/jianminchen/b8263048c297473319c23836e9468c14

Improvement 2:
Fourth practice:
searchStrArray -> searchStrNumbers, more meaningful.
https://gist.github.com/jianminchen/b23c4f606a101b9aeec71eff3268db32

Comment: (April 27, 2016)
Spend some time to read "clean code", get some ideas to write better code.