Post Series Structure
This post is the start of a four-part series covering the buildout of an application which showcases an example of creating and editing graphs and calculating their minimum spanning trees in a visual manner.
Part 1 of this post outlines the demo application used throughout this tutorial to showcase graphs, minimum spanning trees, and the integration with the Google Maps Distance Matrix API. This post covers high level examples of graphs, and the real world applications of a Minimum Spanning Tree (MST).
Part 3 of this post addresses the buildout of a Minimum Heap data structure which is a key component of Prim’s MST algorithm.
Part 4 addresses the technical details of Prim’s MST algorithm, its implementation within the overall application, and the time complexity analysis.
The example application can be seen HERE (please note that the initial page load may take up to ten seconds as it is hosted on Heroku’s free plan).
All source code for this application lives HERE
Getting Started with the Graph
The purpose of this demo application is to demonstrate a visual and practical use of the MST algorithm. Before going into application and implementation details, it is helpful to understand graphs (and a MST) at a high level. For a visual example context, and the one used in this application, a graph can be thought of as a map. Each location or place is a point - technically referred to as a Node. Nodes of the graph are connected to one another as Edges. Edges can be ‘directed’ or ‘undirected’. Directed edges form a ‘one-way’ connection between two nodes A --> B whereas undirected edges are a ‘round-trip’ connection between two nodes A <--> B.
Many real world applications model data into graphs. For example, Facebook and LinkedIn can model users into a graph. And real world airports could be modeled as a node within a graph when Kayak goes about finding routes to recommend. Pages of a website can also be treated as nodes of a graph. Our example treats map locations entered by a user as the nodes of a graph.
Edges and Edge Weights
Just as Nodes represent the points of a graph, the edges represent the connections between the nodes. In the example of Facebook or Linked, friendship or first degree connections are represented by edges connecting users.
In a quick map based example, let’s take five cities – Seattle, Los Angeles, Boston, St. Louis, and Miami. These are the Nodes of our new Graph. Any connections between them represent our Edges. In the application here, the driving distance between each city is the edge weight. With the Google Maps API, this is expressed in meters (and uses the driving distance by shortest route). So the edge from St. Louis to Seattle would be 3,344,055 meters. For purposes of this application, the map is implemented as an undirected graph, meaning that driving from St. Louis to Seattle and driving from Seattle to St. Louis are the same. As a result, there are (N * (N – 1) / 2) undirected edges connecting each of the nodes. This allows us to use Prim’s Minimum Spanning Tree algorithm discussed below in detail.
Before moving on, it is important to note that even though this example uses an undirected graph and considers driving between two cities as the same, this is not technically equivalent. Driving from St. Louis to Seattle is 3,344,055 meters, but the reverse route is only 3,342,228 meters – a difference of almost 2km. This example application could be implemented as a directed graph with each driving route being two directed edges. For plotting purposes visually, and implementation of Prim’s algorithm, I have decided to take the first distance discovered between the two points and use it interchangeably.
A more explicit example of a directed graph would be the same map of cities but with flight prices between each city as the edge weights. This would have to be a directed graph as flying from St. Louis to Seattle could be $200, whereas the reverse could be $500 depending on the situation. There is no ‘switching the graph to be undirected’ for this use case.
Minimum Spanning Tree & Prim’s Algorithm
Now that we have our undirected graph of locations – the nodes - and their edge weights – the distance between them – we can calculate the minimum spanning tree. The MST is defined as the route that connects all nodes of the graph with the minimum possible total sum of edge weights. In the example of the four cities above, the MST would be the edges of Seattle <--> Los Angeles, Los Angeles <--> St. Louis, St. Louis <--> Boston, and St. Louis <--> Miami.
The commercial applications of this algorithm can address question such as:
- What is the least amount of equipment needed to connect a network of power stations?
- What is the least amount of telephone/internet cable needed to connect office locations of a company?
This application uses locations on a map to visually highlight the MST and its application to a map. The next three posts will walk through the build out of the example application, its integration with the Google Maps API, and the implementation of Prim’s algorithm to find the MST of a given graph.