This post is part 3 of 4 in a Series
This post addresses an overview and technical implementation of the heap data structure (in Ruby), which will be used as an important part of Prim’s algorithm for finding the MST, which is used in the demo application.
Parts 1 and 2 of the 4 part tutorial laid the groundwork for building the UI, the graph classes, and Google Maps integration found in the application. Now it is time to explore the data structures specific to the purpose of this application: Prim’s algorithm for finding the Minimum Spanning Tree. Before going into the implementation details, there is one more component to implement that serves as a foundational piece for this algorithm. That is the implementation of a Min-Heap data structure. This data structure will allow us to achieve better time complexity with respect to the algorithm.
Before going into my explanation below, please see Brian Storti's resources he has made available HERE. Much of this post draws from his examples. I used many of his methods in my example as well.
What is a Min-Heap?
A min-heap is a binary tree data structure in which the data of each node is less than or equal to the data of that node’s children and the tree is complete. This requires an example to more clearly see the details:
In this example, all of the nodes fit the above properties described. The Min-Heap can be implemented with an array data structure. The min-heap will have the following properties in its array implementation:
- The children of an element at a given index i will always be in 2i and 2i + 1.
- The parent of a node will be at the index i/2.
- All child nodes are larger than their parent
- The smallest element of the min-heap is at the root
What are the Real-World Applications of a Min-Heap / Max-Heap?
Before going into additional details, operations, and analysis of a min-heap it is helpful to understand the importance and application of this data structure. Example applications all have to do with a ‘Priority Queue’:
- Taking Emergency Room patients in order of the severity of their medical condition
- Triaging bugs based on their impact to the business
- Completing new features based on their ROI/time value to the business
- Scheduling tasks based on importance
- Catering to customers based on the highest business value
Min-Heap Operations and Time Complexity
The reason for using a Min-Heap in the context of this application has to do with the time complexity of the associated operations. There are several operations which are used, which are explained below. For additional details and examples, please refer to any of the resources mentioned above or online.
# min_heap.rb class MinHeap attr_accessor :elements, :element_position_map # initialize an empty array with nil at the 0 index (to make math easier) def initialize @elements = [nil] @element_position_map = Hash.new end end
First the public methods of Min-Heap:
Adding an element to a min-heap puts an element in the last position of the array implementation. But that alone does not guarantee that the heap properties are kept intact (the properties of each child note being greater than the parent). So to ensure the ordering is preserved a helper ‘sift-up’ method is called recursively until the properties are all satisfied – meaning that the element at a given position is large than its parent. The sift up method exchanges elements within the array, each time passing back the parent index of the starting node’s index.
Every time this method is called, we advance up the tree (towards the root) one level. Because this a full binary tree, this advancement effectively reduces the number of nodes by 2. As a result of this, the overall time complexity of this operation is the depth of the binary tree O(log(n)).
# override add to array method - preserve the min heap property def <<(element) @elements << element @element_position_map[element.node_data] = @elements.size - 1 sift_up(@elements.size - 1) end
The extract min method removes the element at the first position of the array. Once the element is removed, the array must again be shifted to meet the min-heap properties. Therefore this method exchanges the element in the first position with that in the last position. The element is then popped off the array and the helper method ‘sift_down’ is called. Much like the previously mentioned ‘sift_up’ helper method, this helper method recursively exchanges elements in the array until they all satisfy the heap properties.
The specifics of the sift_down method require checking both children of an element. The children can be found at index * 2 and index * 2 + 1. The ‘exchange_element’ method is called in each sift_down for the smaller of the two children.
Just like the ‘sift_up’ method, sifting down puts us down one level in the binary tree, and effectively reduces the problem size by a factor of two. The overall time complexity of the extract min method is O(log(n)).
def extract_min # exchange the minimum element with the last one in the list exchange(1, @elements.size - 1) # remove the last element min_element = @elements.pop @element_position_map.delete(min_element.node_data) # make sure the tree is ordered - call the helper method to sift down the new root node into appropriate position sift_down(1) # return the min element return min_element end
There are four methods which are all O(1) constant time. These include count, peek_min, contains_element, and elements.
Lastly, an additional method I added is delete_element. This method is required in prim’s algorithm. Searching in a min-heap is not efficient. While the elements all confirm to the min-heap properties, they are not ordered like a binary search tree (where searching for an element is O(log(n)). Therefore, additional bookkeeping is needed to ensure that the delete_element method does not have to search the entire array, which would be O(n) time complexity.
To implement this method, the min-heap contains an @element_position_map Hash which stores the position of each element. When the delete element method is called, an element is passed in as the parameter and the index of that element can be found as a constant time lookup from the hash. From there, the method is the same as the extract_min method and the sift down method is invoked with the returned index. The exchange_element helper method ensures that element positions are updated with any swaps. Like the earlier Graph class, because a Hash is used for position mapping, duplicate elements are not permitted.
def delete_element(element) element_position = @element_position_map[element.node_data] unless element_position.nil? # exchange the minimum element with the last one in the list exchange(element_position, @elements.size - 1) # remove the last element min_element = @elements.pop @element_position_map.delete(min_element.node_data) # make sure the tree is ordered - call the helper method to sift down the new root node into appropriate position sift_down(element_position) return min_element end end
The min-heap implements a binary tree in an array and allows for O(log(n) time complexity for extract_min, delete_element, and add_element methods. This data structure is used in many real world applications and is an important component of prim’s MST algorithm, which is explored in the next post.
The source code for the above can be seen HERE
Unit tests for the implementation can be seen HERE