Julia's coding blog - Practice makes perfect

From January 2015, she started to practice leetcode questions; she trains herself to stay focus, develops "muscle" memory when she practices those questions one by one. 2015年初, Julia开始参与做Leetcode, 开通自己第一个博客. 刷Leet code的题目, 她看了很多的代码, 每个人那学一点, 也开通Github, 发表自己的代码, 尝试写自己的一些体会. She learns from her favorite sports – tennis, 10,000 serves practice builds up good memory for a great serve. Just keep going. Hard work beats talent when talent fails to work hard.

Wednesday, January 31, 2018

Leetcode 726: Number of Atoms

January 31, 2018

Introduction

It is the last day of January 2018, I like to celebrate the first month of 2018 using one hard level algorithm. I found one today by reading a Chinese algorithm blog, the link is here. The algorithm is hard level one on leetcode called Number of Atoms. I know that it will take some time for me to understand the algorithm with hard level.

Algorithm study

Plan to study the Chinese blog and write down some notes. One thing I like to do is to read the chinese blog, and go over each word, try to put together and make it more readable first. And then write down a few things to look into, highlight them, and plan to go over it again later.

三种方法，第一种采用递归，而后两种采用栈式结构。

方法一：递归

首先编写一个解析器Parse，这个解析器应当能统计出从当前字符串位置开始，直到遇到一个右括号“）”或读到字符串末尾，然后连带上右括号“）”后面可能跟上的数字，这可做为一段整体。

用一个counter来记录这一段各个元素的个数。最后返回counter。

1. 首先初始化counter
2. 在解析器循环中：
1. 当遇到左括号“（”时，就递归调用Parse；并把Parse子程序中元素统计结果加入当前的counter当中。
2. 如果不遇到左括号，那么会发现一定是大写字母，也就是一个原子名称的开头，那么就统计这个元素的个数。
3. 循环结束后，检查右括号后面是否跟有数字，如果有的话要对当前的counter进行相应倍乘。
4. 最后返回counter。
5. Parse可以写成一个递归函数，然后用主函数做驱动函数。

方法二：采用栈式结构

和第一种方法类似，不采用递归函数，采用栈来进行操作。（其实函数的递归和栈本就具有极大的相似性）
先初始化counter栈，然后仍旧需要一层循环，直到读完所有字符。

循环中：
1. 当遇到左括号“（” 时一层新的counter入栈。
2. 遇到右括号“）”时栈顶counter弹出，并检查后面是否跟有数字，如果有的话对当前counter进行相应倍乘，并把结果累计到新露出的栈顶。
3. 遇到大写字母，就统计这个元素的个数，计入counter，类似方法一。

方法三：同方法二，但是不手动解析字符串，而是采用正则表达式

规则表达式Regular Expression，通常被用来检索、替换那些符合某个模式的文本.

把字符串分成一个一个独立的块，这里采用的表达式为："([A-Z][a-z])(\d)|(()|())(\d)"。这里面的块分别表示：可能带有数字的元素如“Fe3”、“H”等；左括号“（”；可能带有数字的右括号“）”、“）3”等。
然后在循环中对正则表达式中的每块进行各自对应的处理。

复杂度分析

无论哪种方法，理论的复杂度相同。但是这里推荐使用第三种方法，因为栈式结构比递归的写法更加简洁, 还有第二个原因, 正则表达式极大地简化匹配操作。

时间复杂度

O(N^2)

这里N为字符串的长度，遍历一遍字符串需要O(N)的时间，但是当遇到左括号时，会对子式的元素向外累加，这就有可能达到O(N)的时间，而总字符串长度为O(N)，所以左括号的个数为O(N)（注意大O记号的含义是“不超过”，这里的意思是不超过线性界，而不是等于线性界）。故所花时间为O(N^2)，

这里给出一种时间比较逼近N^2的表达式：A(B(C(D(E(F3)4)5)6)7)。

注意：正则表达式的时间界不容易判断，这里正则表达式没有回溯，所以总的时间复杂度不会超过O(N^2)，这与正则表达式的实现方法有关。

空间复杂度

O(N)，这里无论是栈还是Map，空间复杂度不会超过O(N)。