Difference between revisions of "MapReduce Frameworks Lab"

From CSE231 Wiki
  
 
In this method, you will take the array of lists you previously created and accumulate the key/value pairs in those lists into a newly defined map. To do this, you must use the Collector provided to you. More specifically, obtain the accumulator from the Collector by calling the <code>accumulator()</code> method, and have it accept each value into the mutable container stored in the map under that value's key. You probably noticed that the method must return a map of <K, A>, which differs from the <K, V> generics fed into the method. The framework is designed this way because the values produced by the mapping stage are collected into a mutable container (of type A) before reaching the finish/reduce stage. If a key has no associated container in the map yet, obtain a new one from the Collector's supplier, which you can access with the <code>supplier()</code> method.
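The supplier/accumulator interplay described above can be sketched roughly as follows. This is a minimal illustration, not the lab's solution: it assumes the lab's Collector behaves like the standard <code>java.util.stream.Collector</code>, and the class name, the helper's signature, the varargs parameter (standing in for the array of lists), and the summing collector in <code>demo()</code> are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Supplier;
import java.util.stream.Collector;

class AccumulateAllSketch {
    // Hypothetical helper: folds every key/value pair from every list into one Map<K, A>.
    @SafeVarargs
    static <K, V, A> Map<K, A> accumulateAll(Collector<V, A, ?> collector,
                                             List<Map.Entry<K, V>>... lists) {
        Supplier<A> supplier = collector.supplier();
        BiConsumer<A, V> accumulator = collector.accumulator();
        Map<K, A> result = new HashMap<>();
        for (List<Map.Entry<K, V>> list : lists) {
            for (Map.Entry<K, V> entry : list) {
                // Ask the supplier for a fresh container the first time a key is seen.
                A container = result.computeIfAbsent(entry.getKey(), k -> supplier.get());
                // The accumulator mutates the container in place.
                accumulator.accept(container, entry.getValue());
            }
        }
        return result;
    }

    // Hypothetical demo: sums integers per key, using a one-element int[] as the container (A).
    static Map<String, int[]> demo() {
        Collector<Integer, int[], Integer> sum = Collector.of(
                () -> new int[1],
                (a, v) -> a[0] += v,
                (l, r) -> { l[0] += r[0]; return l; },
                a -> a[0]);
        List<Map.Entry<String, Integer>> first = List.of(Map.entry("a", 1), Map.entry("b", 2));
        List<Map.Entry<String, Integer>> second = List.of(Map.entry("a", 3));
        return accumulateAll(sum, first, second);
    }

    public static void main(String[] args) {
        Map<String, int[]> result = demo();
        System.out.println("a -> " + result.get("a")[0]); // a -> 4
        System.out.println("b -> " + result.get("b")[0]); // b -> 2
    }
}
```

Note that the map's values are containers of type A (here <code>int[]</code>), not finished results of type R; turning each container into its final value is left to the finish stage.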
 
<!--
 
Hint: Look into the <code>compute()</code> method for maps.
 
-->
 
  
 
[[File:Bottlenecked accumulate all.png|400px]] [https://docs.google.com/presentation/d/1Gtpj6_5_8imUMccpwxmfrOxKSED791OKl4lZR6G2Bm4/pub?start=false&loop=false&delayms=3000&slide=id.g7ea283730e_0_165 slide]

Revision as of 08:23, 3 March 2022

Bottleneck MapReduce Framework

Matrix MapReduce Framework


combineAndFinishAll

method: Map<K, R> combineAndFinishAll(Map<K, A>[][] input) (parallel implementation required)

In this stage, you will take the matrix you just completed and combine all of its separate rows down into a single array of maps. Afterward, you will convert this combined array of maps into one final map. This method should run in parallel.

As mentioned previously, you should go straight down each column of the matrix to access the same bucket across the different slices you created in the mapAndAccumulateAll step. For all of the maps in a column, go through each entry and combine it into that column's map in the single combined row. You will need the Collector’s finisher again, and you will also need its combiner, which you can access with the combiner() method. Although the combine step differs from the bottlenecked framework, the finish step should mirror what you did previously.

Hint: You can use the provided MultiWrapMap class to return the final row as a valid output. You should also combine before you finish.
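The column-wise combine followed by finish can be sketched roughly as follows. This is only an illustration under several assumptions: the lab's Collector is taken to behave like the standard <code>java.util.stream.Collector</code>, each key is assumed to land in exactly one column, a parallel stream stands in for whatever parallel constructs the course provides, and a plain HashMap stands in for MultiWrapMap. The class name, helper signature, and summing collector are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CombineAndFinishAllSketch {
    // Hypothetical helper: matrix[row][col] is the map for one slice (row) and one bucket (column).
    static <K, V, A, R> Map<K, R> combineAndFinishAll(Map<K, A>[][] matrix,
                                                      Collector<V, A, R> collector) {
        BinaryOperator<A> combiner = collector.combiner();
        Function<A, R> finisher = collector.finisher();
        int bucketCount = matrix[0].length;

        // Each column is reduced independently, so the columns can be processed in parallel.
        List<Map<K, R>> finishedBuckets = IntStream.range(0, bucketCount).parallel().mapToObj(col -> {
            Map<K, A> combined = new HashMap<>();
            for (Map<K, A>[] row : matrix) {
                // merge() stores the container if the key is new, otherwise combines the two.
                row[col].forEach((key, container) -> combined.merge(key, container, combiner));
            }
            // Combine first, then finish each combined container.
            Map<K, R> finished = new HashMap<>();
            combined.forEach((key, container) -> finished.put(key, finisher.apply(container)));
            return finished;
        }).collect(Collectors.toList());

        // Stand-in for MultiWrapMap: copy the per-column results into one map.
        Map<K, R> result = new HashMap<>();
        finishedBuckets.forEach(result::putAll);
        return result;
    }

    // Hypothetical demo: a 2x2 matrix of per-key partial sums held in int[] containers.
    static Map<String, Integer> demo() {
        Collector<Integer, int[], Integer> sum = Collector.of(
                () -> new int[1],
                (a, v) -> a[0] += v,
                (l, r) -> { l[0] += r[0]; return l; },
                a -> a[0]);
        @SuppressWarnings("unchecked")
        Map<String, int[]>[][] matrix = new Map[2][2];
        matrix[0][0] = Map.of("a", new int[] {1});
        matrix[0][1] = Map.of("b", new int[] {2});
        matrix[1][0] = Map.of("a", new int[] {3});
        matrix[1][1] = Map.of();
        return combineAndFinishAll(matrix, sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> result = demo();
        System.out.println("a -> " + result.get("a")); // a -> 4
        System.out.println("b -> " + result.get("b")); // b -> 2
    }
}
```

Because no two columns share a key under the stated assumption, the per-column tasks never write to the same map, which is what makes the parallel version safe without locking.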

[Figure: Matrix combine finish all] (slide)

Testing Your Solution

Correctness

There is a top-level test suite composed of sub-suites, each of which can be run separately when you want to focus on one part of the assignment.

class: FrameworksLabTestSuite.java
package: mapreduce.framework.lab
source folder: testing/src/test/java

Bottlenecked

class: BottleneckedFrameworkTestSuite.java
package: mapreduce.framework.lab.bottlenecked
source folder: testing/src/test/java

MapAll

class: BottleneckedFrameworkTestSuite.java
package: mapreduce.framework.lab.bottlenecked
source folder: testing/src/test/java

AccumulateAll

class: BottleneckedAccumulateAllTestSuite.java
package: mapreduce.framework.lab.bottlenecked
source folder: testing/src/test/java

FinishAll

class: BottleneckedFinishAllTestSuite.java
package: mapreduce.framework.lab.bottlenecked
source folder: testing/src/test/java

Holistic

class: BottleneckedHolisticTestSuite.java
package: mapreduce.framework.lab.bottlenecked
source folder: testing/src/test/java

Matrix

class: MatrixFrameworkTestSuite.java
package: mapreduce.framework.lab.matrix
source folder: testing/src/test/java

MapAccumulateAll

class: MatrixMapAccumulateAllTestSuite.java
package: mapreduce.framework.lab.matrix
source folder: testing/src/test/java

CombineFinishAll

class: MatrixCombineFinishAllTestSuite.java
package: mapreduce.framework.lab.matrix
source folder: testing/src/test/java

Holistic

class: MatrixHolisticTestSuite.java
package: mapreduce.framework.lab.matrix
source folder: testing/src/test/java

Rubric

As always, please make sure to cite your work appropriately.

Total points: 100

Bottlenecked framework subtotal: 40

  • Correct mapAll (10)
  • Correct accumulateAll (20)
  • Correct finishAll (10)

Matrix framework subtotal: 60

  • Correct mapAndAccumulateAll (30)
  • Correct combineAndFinishAll (30)

-->