Wednesday, October 21, 2020

Heap + greedy algorithm: 358 Rearrange string k distance

 Here is the gist. 


Case study 

Work on string "aaadbbcc" and distance k = 2. 

First of all, count characters in the string and save them into an array with size 256. 

Next it is to work on greedy algorithm design, the character with most occurrences will be selected first, and then continue to iterate all other characters. One of ideas is to use maximum heap, so that 'a' with occurrence 3 will be selected first, and it will be removed from maximum heap as well. 

One thing is to design k distance away. I stumbled on this requirement for long time, after I studied the code written in Java and then wrote one using C#. It becomes very clear that it can be done quick and easy. Add a double linked list using C# LinkedList acting as a Queue, when the size is k, then first character in the queue should be added back to maximum heap. 

I also think about maximum heap, should I limit the heap size to k, or I can add all characters in the heap instead. Actually most of important is to add character with occurrence back to maximum heap. Since total distinct characters are 256, it is no problem to load all of them in heap one time at the beginning. 


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace _358_Heap_LinkedList
{
class Program
{
static void Main(string[] args)
{
var s = "aaadbbcc";
var result = rearrangeString(s, 2);
}
/// <summary>
/// Oct. 21, 2020
/// code study
/// https://protegejj.gitbook.io/algorithm-practice/leetcode/heap/358-rearrange-string-k-distance-apart
/// First count character and it's occurrence. Declare array size 256.
/// Design a maximum heap and put all the character's into maximum heap, define sorting rules.
/// Design an extra buffer for removed characters using double linked list, if the size of list is k,
/// then the head of list is ready to join maximum heap again to compete for another round of removal.
/// Data structures used for the design using C#:
/// 1. Tuple<char, int> - char, count
/// 2. Maximum heap - SortedSet
/// 3. Buffer - using LinkedList<Tuple<char, int>>
///
/// </summary>
/// <param name="s"></param>
/// <param name="k"></param>
/// <returns></returns>
public static string rearrangeString(string s, int k)
{
if (k == 1)
{
return s;
}
var count = new int[256];
foreach (var item in s)
{
count[item - 'a']++;
}
// Create a sorted set using the ByOccurrences comparer.
var maxHeap = new SortedSet<Tuple<char, int>>(new ByOccurrences());
for (int i = 0; i < 256; i++)
{
if (count[i] != 0)
{
maxHeap.Add(new Tuple<char, int>((char)('a'+i), count[i]));
}
}
var buffer = new LinkedList<Tuple<char, int>>();
var result = new StringBuilder();
while(maxHeap.Count > 0)
{
var maximum = maxHeap.Last();
maxHeap.Remove(maximum); // time complexity O(1)?
result.Append(maximum.Item1);
var number = maximum.Item2 -1;
buffer.AddLast(new Tuple<char, int>(maximum.Item1, number));
// add a character back to maximum heap
if (buffer.Count == k)
{
var first = buffer.First; // LinkedListNode
buffer.RemoveFirst();
if (first.Value.Item2 > 0) // check LinkedListNode.Value
{
maxHeap.Add(first.Value);
}
}
}
return result.Length == s.Length ? result.ToString() : "";
}
// https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.sortedset-1?view=netcore-3.1
// Defines a comparer to create a sorted set
// that is sorted by the file extensions.
public class ByOccurrences : IComparer<Tuple<char, int>>
{
public int Compare(Tuple<char, int> x, Tuple<char, int> y)
{
var valueX = x.Item2;
var valueY = y.Item2;
if (valueX != valueY)
{ // ascending order by number - maximum heap
return valueX - valueY;
}
else
{ // lexicographic descending order by character - maximum heap
return (y.Item1 - x.Item1);
}
}
}
}
}

No comments:

Post a Comment