Difference between revisions of "MergeSort"

From CSE231 Wiki
Jump to navigation Jump to search
Line 33: Line 33:
  
 
{{Tip|Being able to perform operations like merge sort combine in parallel often comes down to the question "How can I figure out where to put the current item?" }}
 
{{Tip|Being able to perform operations like merge sort combine in parallel often comes down to the question "How can I figure out where to put the current item?" }}
 
 
<spoiler show="spoiler" hide="spoiler">You can divide one side (choose the longer of the two) and then binary search for the value in the other side.</spoiler>
 
<spoiler show="spoiler" hide="spoiler">You can divide one side (choose the longer of the two) and then binary search for the value in the other side.</spoiler>
  

Revision as of 04:06, 29 January 2018

Motivation

Sorting is a problem that is well solved by divide and conquer algorithms. Merge sort is a elegant example which can be parallelized in a straight-forward manner.

Finally, parallelization of the combine step, while not trivial, is possible (and left as an optional fun exercise for those so inclined).

Background

In computer science, merge sort refers to a sorting algorithm which splits an array of data into continuously smaller halves until the arrays reach a size of 1, after which the elements are continuously compared and merged into a sorted array. For more information on how this process works, visit this page. If you took CSE 247, this algorithm should look extremely familiar to you.

Visualization

If you've never seen mergesort before, it may be helpful to take a look at this page which gives a visual demonstration of the algorithm in practice.

Utilities

Both the sequential and parallel merge sorts will be passed a Combiner. When you are done with the divide and conquer phases, invoke combiner.combineRange(data, min, mid, maxExclusive) to merge the two sorted sub-problem results.

Code to Implement

You will need to implement merge sort sequentially and in parallel, but you will need to do both implementations recursively. The kernel method should call itself using recursion, but each public mergesort method should only call its kernel once to do the work.

class: MergeSort.java Java.png
methods: sequentialMergeSort
sequentialMergeSortKernel
parallelMergeSort
parallelMergeSortKernel
package: sort.studio.merge
source folder: student/src/main/java

Sequential Implementation

The only thing you will need to edit is the kernel method. As this is a sequential implementation, there is no need to call an async or finish. Using the description of merge sort provided to you in the background, try to consider what the base case would be. Then, further consider how to call the method recursively. When the array reaches the base case, you will need to complete the “merging” aspect of merge sort. As mentioned above, there is a method in the utils class that should help with this.

method: sequentialMergeSort Sequential.svg (sequential implementation only)

method: sequentialMergeSortKernel Sequential.svg (sequential implementation only)

Parallel Implementation

This approach should behave much like the sequential implementation, but it should also include async/finish calls. Think carefully about where to place them to maximize concurrency.

When the length of the range has fallen below threshold convert to sorting the array sequentially.

method: parallelMergeSort Parallel.svg (parallel implementation required)

method: parallelMergeSortKernel Parallel.svg (parallel implementation required)

Optional Fun Parallel Combiner

class: ParallelCombiner.java Java.png
methods: parallelCombine
package: sort.fun.merge
source folder: student/src/main/java

You can divide and conquer the combine step in merge sort. The work will remain while the critical path length will be improved from down to .

Sadly, you cannot simply divide both sides down the middle and merge the lower halves together and the upper halves together (as the data will likely not line up perfectly).

Circle-information.svg Tip:Being able to perform operations like merge sort combine in parallel often comes down to the question "How can I figure out where to put the current item?"

You can divide one side (choose the longer of the two) and then binary search for the value in the other side.

Testing Your Solution

Correctness

class: MergeSortTestSuite.java Junit.png
package: sort.studio.merge
source folder: testing/src/test/java

Optional Fun Parallel Combiner Correctness

class: ParallelMergeSortParallelCombinerTestSuite.java Junit.png
package: sort.fun.merge
source folder: testing/src/test/java

Performance

class: SortTiming.java Noun Project stopwatch icon 386232 cc.svg
package: sort
source folder: src/main/java