Sunday, April 24, 2022

Leetcode discuss: 295. Find Median from Data Stream

 April 24, 2022

Here is the link. 

C# | Minimum heap and maximum heap | SortedSet<Tuple<int, int>>

April 22, 2022
Maximum heap and minimum heap design
It is important to come out best time complexity using two heaps, one is maximum heap, one is minimum heap. So it is O(1) time complexity to find median value. That is most efficient way to find median value in a stream.

Prepare a check list

  1. Using C# Tuple<int, int> to build a minimum and maximum heap. The first one in Tuple is value, and the second one is the counter variable name value, which is helpful to keep duplicate value and make it unique since counter variable has identity value to increment one always.
  2. Design two heaps, one is maximum heap, one is minimum heap; The small half numbers are stored in maximum heap, using C# SortedSet<Tuple<int, int>>, called smallHalf;
  3. If the total count of numbers are even, then smallHalf and bigHalf has same number of integers; otherwise I choose to keep extra one into smallHalf always. It is better to change smallHalf to smallHalfPlusOne to remind myself, so it is easy for me to code, for example, add extra number to smallHalfPlusOne, get medium number from smallHalfPlusOne if total count is odd. In other words, easy to figure out, easy to avoid mistakes in the code.
  4. Variable naming: smallHalfPlusOne, bigHalf variables names are more meaningful compared to setLow, setHigh. Design minimum heap and maximum heap using C# SortedSet<Tuple<int, int>>.
  5. First step is to check which heap to add the incoming integer, next step is to move one number to another heap to maintain two heap's count's difference is at most one, and also if it is not equal, smallHalfPlusOne should have an extra integer.

Warmup for Meta onsite in May 2022
I chose to work on this algorithm to prepare Meta onsite in May 2022. I try to figure out what to learn from this practice. I wrote down my detail - 30 days to meta onsite here Day 18 - I chose to go over 15 algorithms from Stefan Pochmann.

The following C# code pass online judge.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace _295_Find_median_of_stream
{
    class Program
    {
        static void Main(string[] args)
        {
        }

        public class MedianFinder {

   private int counter = 0;

        private SortedSet<Tuple<int, int>> smallHalfPlusOne = new SortedSet<Tuple<int, int>>();
        private SortedSet<Tuple<int, int>> bigHalf = new SortedSet<Tuple<int, int>>();        

        public void AddNum(int num)
        {
             var newNum = new Tuple<int, int>( num, counter++ );            
            
            // Deal with empty minimum heap or maximum heap case 
            if ( smallHalfPlusOne.Count == 0 || newNum.Item1 < smallHalfPlusOne.Max.Item1)
            {
                smallHalfPlusOne.Add(newNum);
            }
            else
            {
                bigHalf.Add(newNum);
            }

            // There is more than one numbers in smallHalfPlusOne
            while (smallHalfPlusOne.Count > bigHalf.Count + 1)
            {
                bigHalf.Add(smallHalfPlusOne.Max);
                smallHalfPlusOne.Remove(smallHalfPlusOne.Max);
            }
            
            // bigHalf has more numbers than smallHalfPlus 
            while(bigHalf.Count > smallHalfPlusOne.Count)
            {
                // move the minimum number from setHigh to setLow. 
                smallHalfPlusOne.Add(bigHalf.Min);
                bigHalf.Remove(bigHalf.Min);                
            }                     
        }

        /// <summary>
        /// if minimum heap and maximum heap have same size, then medium is to get the average of those two values 
        /// </summary>
        /// <returns></returns>
        public double FindMedian()
        {
            if (smallHalfPlusOne.Count == 0)
            {
                return 0;
            }

            if (smallHalfPlusOne.Count == bigHalf.Count)
            {
                return (smallHalfPlusOne.Max.Item1 + bigHalf.Min.Item1) / 2d;
            }
            else
            {
                return smallHalfPlusOne.Max.Item1;
            }
        }
}


No comments:

Post a Comment