Proposal

Title


Parallel four-color Map Solver
Teammates: Chenfei Lou, Yuqing Qiu

URL


Project Home Page

Summary


We are going to implement a parallel map solver for the four-color problem using openMP and perform a detailed analysis of performance characteristics. We will measure time and memory costs on different region maps and compare the performance between different scheduling algorithms.

Background


The four-color problem derives from four-color theorem, which states that any map is colorable with 4 different colors, such that after coloring, any two adjacent countries have different colors. There is a heuristic algorithm for it called welsh-powell algorithm. However, it does not guarantee to succeed. When it fails, we need to use backtracking to do a brute-force search.

There is space for acceleration in both parts. For the heuristic algorithm, it constructs a graph and then sorts the vertices based on their valence, which is time-consuming when the graph is large. We can partition the graph into grids, sort vertices in each sub-graph in parallel, then merge them to a global sorted list. For the backtracking algorithm, it follows a recursive pattern and is compute-intensive. We plan to use fork-join tools such as openMP for realizing parallelsim.

Challenge


  1. The heuristic algorithm is intrinsically difficult to parallelize due to high dependencies. It consists of two phases, sorting and coloring. The second phase is a traversal over all nodes where the decision in each iteration depends on the result in all previous iterations. Such sequential dependencies prevents itself from being parallelized. The first sorting phase has less strict dependencies among its steps, thus giving more opportunities to parallelize (e.g. merge sort algorithm), but an efficient algorithm that minimizes overhead and maximizes memory locality is still non-intuitive to design.

  2. Though backtracking algorithm follows recursive execution, therefore being suitable for fork-join parallelization patterns, there is one more complexity -- the program does not need to wait all the tasks to be completed, instead it may stop as soon as one thread finds a valid solution. Therefore we need to design a way that could stop all other running/waiting tasks once a solution is found, which is not intuitive to implement.

Resources


We refer to the material here, which explains the mechanism of welsh-powell algorithm in details. For implementation, we will start from the code base for an existing map solver. It is an interactive platform that prompts users to draw a map by themselves and the backend will execute the algorithm to color the map with four colors. We will benefit from its implementation of the algorithm, which includes both the welsh-powell algorithm and the backtracking. For testing, We will utilize the Pittsburgh Supercomputer Cluster (PSC) for collecting runtime statistics.

Goals and Deliverables


Plan to Achieve

For the heuristic algorithm, we aim to speedup the sorting part by 2x considering the possibility of false sharing, load imbalance and communication overhead. For the backtracking algorithm, we aim to speedup by 3x because the tasks are independent, thus creating more space for parallelism. However, it is restricted by the scheduling overhead. We plan to exploit different scheduling algorithms in openMP, including static, dynamic and guided. We try to find the best one based on the the tradeoff between workload balance and synchronization overhead.

Poster Session

For the poster session, we plan to demo with an four-color map animation and present it with interaction. We will present users with a white board and ask them to draw a map. Then, we will color the map with our algorithm and show it to users. We will run both original sequential algorithm and our optimized algorithm to collect runtime statistics. Finally, we will show users with graphs displaying the speedup we achieve.

Hope to Achieve

We hope to achieve higher speedup (3x for heuristic algorithm and 5x for backtracking algorithm) if we are ahead of schedule. Besides, we hope to seek for space in accelerating the coloring phase of the heuristic algorithm. Since the iterations in this phase are dependent are each other, this poses a great challenge for us.

Platform


We will use Pittsburgh Supercomputer Cluster (PSC) for evaluation because it provides large number of CPUs, which helps us achieve high parallelism. We decide to use OpenMP because it support shared address space and enable us to test different scheduling algorithms easily. We plan to write code in C++ because it has more compatibility with openMP.

Schedule


Week Plan
Week 1 (3/27 - 4/2) Research on the topic and write proposal.
Week 2 (4/3 - 4/9) Implement a sequential version and successfully run on PSC. Design how to speedup the sorting phase in heuristic algorithm.
Week 3 (4/10 - 4/16) Implement the parallelized version of sorting in heuristic algorithm and optimize it with different workload mapping algorithms.
Week 4 (4/17 - 4/23) Implement the parallelized version of the backtracking algorithm. Write milestone report.
Week 5 (4/24 - 4/30) Optimize the parallelized backtracking implementation with different scheduling algorithms. Generate test cases to evaluate the one with best tradeoff.
Week 6 (5/1 - 5/5) Write final report and prepare for the poster session.