Tuesday, April 26, 2016

Leetcode 146: LRU Cache

April 26, 2016

Design a Least Recently Used (LRU) Cache

https://github.com/jianminchen/LeetCode-Java-Solutions/blob/master/146.lru-cache.java

C# implementation with one test case:
https://gist.github.com/jianminchen/3dc7da7e0465819e6aa8c10c71901723

Question and answer:
1. What did you learn through the practice?
Use a dummy head and a dummy tail to help maintain the doubly linked list that backs the cache.

2. Do not forget to move a node to the last position (most recently used) whenever it is visited.

3. Talk about API design:
AddToLast,
set,
get - the get method is confusing on its own, since it also repositions the visited node in the linked list (the cache); it is better to add another function for get() to call, for example getKeyAndThenMoveToLast.
Use a Dictionary to hold the key/value pairs.

4. When a node is added to the last position, the two dummy nodes in the doubly linked list mean no temporary variable is needed to set up the new connections. Just be patient with the dummy tail's prev pointer: copy it, break the old link, and set the new one.
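A minimal sketch of that splice, assuming fields named dummyTail and a node type ListNode (the names are illustrative, not necessarily the ones in the gist):

private void AddToLast(ListNode node)
{
    // With two dummy nodes, dummyTail.prev is never null, so no extra checks or
    // temporary variables are needed: copy the old link, then rewire both sides.
    node.prev = dummyTail.prev;
    node.next = dummyTail;
    dummyTail.prev.next = node;
    dummyTail.prev = node;
}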

May 17, 2016
Julia loves this article about cache system design; once she did some code review on a basic LRU, she started to enjoy the computer science - she just loves the engineering and the good ideas for solving problems.
http://goo.gl/pL5ee4

classical reader/writer problem / locks / multiple shards / commit logs / memcached

June 2, 2016

5. How to design an LRU cache? Which data structures, and why?
Copied from the blog: http://blog.gainlo.co/index.php/2016/05/17/design-a-cache-system/?utm_source=email&utm_medium=email&utm_campaign=email

LRU

One of the most common cache systems is LRU (least recently used). In fact, another common interview question is to discuss data structures and design of an LRU cache. Let’s start with this approach.
The way an LRU cache works is quite simple. When the client requests resource A, the following happens:
  • If A exists in the cache, we just return it immediately.
  • If not and the cache has spare storage slots, we fetch resource A, return it to the client, and insert A into the cache.
  • If the cache is full, we kick out the resource that is least recently used and replace it with resource A.
The strategy here is to maximize the chance that the requested resource already exists in the cache. So how can we implement a simple LRU?
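A hedged sketch of that three-case flow (FetchResource and EvictLeastRecentlyUsed are hypothetical placeholders for the backend fetch and the eviction step; the recency bookkeeping itself is covered in the next section):

string Request(string id, Dictionary<string, string> cache, int capacity)
{
    string value;
    if (cache.TryGetValue(id, out value))
        return value;                      // case 1: A is in the cache, return immediately

    value = FetchResource(id);             // cache miss: fetch resource A
    if (cache.Count >= capacity)
        EvictLeastRecentlyUsed(cache);     // case 3: cache full, kick out the LRU entry

    cache[id] = value;                     // case 2/3: insert A into the cache
    return value;
}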

LRU design

An LRU cache should support the operations lookup, insert, and delete. Apparently, in order to achieve fast lookup, we need to use a hash table. By the same token, if we want to make insert/delete fast, something like a linked list should come to mind. Since we need to locate the least recently used item efficiently, we need something ordered, like a queue, stack, or sorted array.
To combine all these analyses, we can use a queue implemented with a doubly linked list to store all the resources. Also, a hash table with the resource identifier as key and the address of the corresponding queue node as value is needed.
Here's how it works. When resource A is requested, we check the hash table to see if A exists in the cache. If it exists, we can immediately locate the corresponding queue node and return the resource. If not, we add A into the cache. If there is enough space, we just add A to the end of the queue and update the hash table. Otherwise, we need to delete the least recently used entry. To do that, we can simply remove the head of the queue and the corresponding entry from the hash table.
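One concrete way to wire this up in C# is to let the dictionary store the queue node itself, so both lookup and repositioning are O(1). A small hedged sketch using the built-in LinkedList<T> (the hand-rolled doubly linked list with dummy nodes comes later in this post):

using System.Collections.Generic;

// The queue (front = least recently used) holds resource ids in access order;
// the map points each id at its queue node so repositioning is O(1).
var queue = new LinkedList<string>();
var map = new Dictionary<string, LinkedListNode<string>>();

// Insert resource "A" at the most-recently-used end and remember its node.
map["A"] = queue.AddLast("A");

// On a later hit for "A": unlink its node and re-append it at the back.
LinkedListNode<string> hit = map["A"];
queue.Remove(hit);
queue.AddLast(hit);

// On eviction: remove the front node and its hash-table entry together.
LinkedListNode<string> lru = queue.First;
map.Remove(lru.Value);
queue.RemoveFirst();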

Eviction policy

When the cache is full, we need to remove existing items to make room for new resources. In fact, deleting the least recently used item is just one of the most common approaches. So are there other ways to do it?
As mentioned above, the strategy is to maximize the chance that the requested resource exists in the cache. I'll briefly mention several approaches here:
  • Random Replacement (RR) – As the term suggests, we just randomly delete an entry.
  • Least frequently used (LFU) – We keep a count of how frequently each item is requested and delete the one used least frequently (a naive sketch follows below).
  • W-TinyLFU – I'd also like to mention this modern eviction policy. In a nutshell, the problem with LFU is that sometimes an item was only used frequently in the past, but LFU will still keep it around for a long while. W-TinyLFU solves this problem by calculating frequency within a time window. It also has various storage optimizations.
Skip concurrency and distributed cache in this design. 
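For concreteness, here is a naive sketch of the LFU idea mentioned above (the counter dictionary and the O(n) victim scan are only to illustrate the policy; they are not how a production LFU or W-TinyLFU cache is built):

using System.Collections.Generic;
using System.Linq;

// Count accesses per key; evict the key with the smallest count.
var counts = new Dictionary<string, int>();

void RecordAccess(string key) =>
    counts[key] = counts.TryGetValue(key, out int c) ? c + 1 : 1;

string PickVictim() =>
    counts.OrderBy(kv => kv.Value).First().Key;   // least frequently used key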

Next, talk about the coding part - data structures and algorithm:
C# implementation.
https://gist.github.com/jianminchen/3dc7da7e0465819e6aa8c10c71901723

int capacity - specifies the size of the cache; the cache should have a limited size, since resources are limited, especially for high-speed access. Source code on line 24.

int size - current size of the cache - tracks the current size to determine whether eviction is needed. Source code on line 24.

Design a doubly linked list, so a ListNode class is defined with two pointers, prev and next.
Source code from line 10 to line 22.

Every entry has a key and a value. Use int for both to simplify the coding. Source code, line 12:
line 12   public int key, val;

Also, we need to add two more variables, dummy head and dummy tail, to help maintain the doubly linked list. Source code on line 25.

Also, we need to be able to find a key in O(1) since this is a cache. Extra space is used to maintain a hash map, using the Dictionary class in C#:
line 27    Dictionary<int, ListNode>
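For example, a hit can be checked in O(1) with the map (a tiny sketch; map is the field listed below, and the move-to-last step is shown in the full sketch further down):

ListNode node;
if (map.TryGetValue(key, out node))
{
    // Hit: because the map stores the list node itself (not just the value),
    // the node can also be moved to the tail of the list in O(1).
}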

Let us count how many variables there are inside the class LRUCache:
6 variables - memorize the variable count (6) and try to recall it.

private int capacity, size; 
private ListNode dummyHead, dummyTail; 
private Dictionary<int, ListNode> map; 

ListNode class as a node in a doubly linked list:
 public int key, value;
 public ListNode prev, next;

4 variables.
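Putting the pieces together, here is a minimal, hedged sketch of the whole class (method names such as Get, Set, AddToLast, and MoveToLast are my own choices and may differ from the gist):

using System.Collections.Generic;

// A Dictionary for O(1) lookup plus a doubly linked list with dummy head/tail
// for O(1) reordering and eviction; tail side = most recently used.
public class LRUCache
{
    private class ListNode
    {
        public int key, value;
        public ListNode prev, next;
        public ListNode(int key, int value) { this.key = key; this.value = value; }
    }

    private int capacity, size;
    private ListNode dummyHead, dummyTail;
    private Dictionary<int, ListNode> map;

    public LRUCache(int capacity)
    {
        this.capacity = capacity;
        size = 0;
        dummyHead = new ListNode(0, 0);
        dummyTail = new ListNode(0, 0);
        dummyHead.next = dummyTail;      // empty list: the two dummies point at each other
        dummyTail.prev = dummyHead;
        map = new Dictionary<int, ListNode>();
    }

    public int Get(int key)
    {
        ListNode node;
        if (!map.TryGetValue(key, out node))
            return -1;                   // not in the cache
        MoveToLast(node);                // visited: reposition as most recently used
        return node.value;
    }

    public void Set(int key, int value)
    {
        ListNode node;
        if (map.TryGetValue(key, out node))
        {
            node.value = value;          // update in place and refresh its position
            MoveToLast(node);
            return;
        }
        if (size == capacity)
        {
            // Evict the least recently used entry: the first real node after dummyHead.
            ListNode lru = dummyHead.next;
            Remove(lru);
            map.Remove(lru.key);
            size--;
        }
        node = new ListNode(key, value);
        AddToLast(node);
        map[key] = node;
        size++;
    }

    private void Remove(ListNode node)
    {
        node.prev.next = node.next;
        node.next.prev = node.prev;
    }

    private void AddToLast(ListNode node)
    {
        node.prev = dummyTail.prev;
        node.next = dummyTail;
        dummyTail.prev.next = node;
        dummyTail.prev = node;
    }

    private void MoveToLast(ListNode node)
    {
        Remove(node);
        AddToLast(node);
    }
}

Usage follows LeetCode 146: with capacity 2, Set(1, 1), Set(2, 2), Get(1) returns 1, then Set(3, 3) evicts key 2 because it is the least recently used, so Get(2) returns -1.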

June 2, 2016
Warm up with the algorithm (a lot of hurdles; just read the source code):
https://gist.github.com/jianminchen/207e20632f7837285ee9b913b0d34f3a

Second warm-up of the algorithm:
https://gist.github.com/jianminchen/3bd8cdab7a31662d402c62fff9c0b597



