Difference between revisions of "MapReduce Reducer Assignment"
Revision as of 17:08, 23 February 2023
credit for this assignment: Finn Voichick and Dennis Cosgrove
Motivation
interface AccumulatorCombinerReducer<V,A,R> is fundamental to the MapReduce assignments. It provides four of the five methods that clients need to make good use of interface MapReduceFramework<E,K,V,A,R>.
First, we will implement abstract class ClassicReducer<V,R> which will provide the basic MapReduce functionality used by the PageRank algorithm which made Google's original search engine so effective.
Then, we will implement two AccumulatorCombinerReducers which can serve to sum mutable containers of Integers. Thus, they will pair well with the previous Mappers exercise.
- class SummingIntClassicReducer will extend ClassicReducer.
- class SummingIntEfficientAccumulatorCombinerReducer will use a far more efficient mutable container to get the job done.
Background
The Java streams framework uses interface Collector<T,A,R> to let clients customize four operations: create, accumulate, combine, and reduce. Collector<T,A,R> adds an extra level of indirection to allow a hyper-lambdafication style: rather than performing each operation directly, its methods return a function object (Supplier, BiConsumer, BinaryOperator, Function) that performs it. While there is nothing fundamentally wrong with this approach, it was eventually deemed more trouble than it was worth for CSE 231s. To learn more about Collector and its connection to AccumulatorCombinerReducer, check out this Rosetta Stone.
interface AccumulatorCombinerReducer<V,A,R>
methods
createMutableContainer a.k.a. supplier get
We use createMutableContainer() to create a new, empty mutable container. For classic MapReduce this would be a List<V>.
rosetta stone: container = collector.supplier().get()
container = reducer.createMutableContainer()
accumulate a.k.a. accumulator accept
We use accumulate(container,item) to accumulate a value into a container. For classic MapReduce this would add the item to the list.
rosetta stone: collector.accumulator().accept(container,item)
reducer.accumulate(container,item)
combine a.k.a. combiner apply
We use combine(containerA,containerB) to combine two accumulators. You may combine containerB into containerA or containerA into containerB. Just return whichever is the combined result.
rosetta stone: collector.combiner().apply(containerA,containerB)
reducer.combine(containerA,containerB)
reduce a.k.a. finisher apply
We use reduce(container) to reduce an accumulator to its final result of type R.
rosetta stone: r = collector.finisher().apply(container)
r = reducer.reduce(container)
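The rosetta stone lines above can be gathered into one sketch. The interface shape below is inferred from the four method descriptions in this section and is not copied from the course code. The RosettaStone class, its toCollector helper, and the concatenating example reducer are hypothetical names used only for illustration.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

// Assumed shape of the interface, inferred from the method descriptions above.
interface AccumulatorCombinerReducer<V, A, R> {
    A createMutableContainer();             // collector.supplier().get()
    void accumulate(A container, V item);   // collector.accumulator().accept(container, item)
    A combine(A containerA, A containerB);  // collector.combiner().apply(containerA, containerB)
    R reduce(A container);                  // collector.finisher().apply(container)
}

class RosettaStone {
    // Hypothetical helper: wraps a reducer as a java.util.stream.Collector,
    // pairing each reducer method with its Collector counterpart.
    static <V, A, R> Collector<V, A, R> toCollector(AccumulatorCombinerReducer<V, A, R> reducer) {
        return Collector.of(
                reducer::createMutableContainer,
                reducer::accumulate,
                reducer::combine,
                reducer::reduce);
    }

    // Hypothetical example reducer: concatenates Strings in a StringBuilder.
    static AccumulatorCombinerReducer<String, StringBuilder, String> concatenating() {
        return new AccumulatorCombinerReducer<String, StringBuilder, String>() {
            @Override
            public StringBuilder createMutableContainer() {
                return new StringBuilder();
            }

            @Override
            public void accumulate(StringBuilder container, String item) {
                container.append(item);
            }

            @Override
            public StringBuilder combine(StringBuilder containerA, StringBuilder containerB) {
                // Merge B into A and return the merged container.
                containerA.append(containerB);
                return containerA;
            }

            @Override
            public String reduce(StringBuilder container) {
                return container.toString();
            }
        };
    }

    // Runs the example reducer through a sequential stream.
    static String demo() {
        return Stream.of("a", "b", "c").collect(toCollector(concatenating()));
    }
}
```

The adapter shows why the two APIs line up one to one: Collector hands back a function object for each step, while the reducer performs the step directly.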
Code To Use
Code To Implement
ListAccumulatingReducer
The classic MapReduce Collector will collect all of the emitted values in a List.
class: | ListAccumulatingReducer.java | |
methods: | createMutableContainer accumulate combine | |
package: | mapreduce.apps.reducer.listaccumulating.exercise | |
source folder: | student/src/main/java |
method: List<V> createMutableContainer()
(sequential implementation only)
method: void accumulate(List<V> container, V item)
(sequential implementation only)
method: List<V> combine(List<V> containerA, List<V> containerB)
(sequential implementation only)
IntSumListAccumulatingReducer
class: | IntSumListAccumulatingReducer.java | |
methods: | reduce | |
package: | mapreduce.apps.reducer.listaccumulating.intsum.exercise | |
source folder: | student/src/main/java |
method: public Integer reduce(List<Integer> container)
(sequential implementation only)
The reduce method is passed a list of integers which it should simply sum up and return.
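As a rough illustration of the classic approach, the sketch below strings the four steps together for Integer values. ClassicIntSumSketch is a hypothetical stand-in that mirrors the signatures listed above; it does not extend the course's ListAccumulatingReducer, and the choice of ArrayList as the container is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in mirroring the method signatures listed above.
class ClassicIntSumSketch {
    // Each container starts as an empty list (ArrayList is an assumption).
    List<Integer> createMutableContainer() {
        return new ArrayList<>();
    }

    // Classic MapReduce simply remembers every emitted value.
    void accumulate(List<Integer> container, Integer item) {
        container.add(item);
    }

    // Merge B into A and return the merged container.
    List<Integer> combine(List<Integer> containerA, List<Integer> containerB) {
        containerA.addAll(containerB);
        return containerA;
    }

    // All of the summing work is deferred until the reduce step.
    Integer reduce(List<Integer> container) {
        int sum = 0;
        for (int value : container) {
            sum += value;
        }
        return sum;
    }
}
```

Note that the container grows by one element per emitted value, so a word seen a million times holds a million 1s until reduce is called.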
IntSumEfficientReducer
MapReduce apps like Word Count offer glaring opportunities to optimize the classic MapReduce approach of appending all of the 1s to a List and adding them up later. In this section of the studio you will use MutableInt to simply add the values as they come in.
class: | IntSumEfficientReducer.java | |
methods: | createMutableContainer accumulate combine reduce | |
package: | mapreduce.apps.reducer.efficient.intsum.exercise | |
source folder: | student/src/main/java |
method: MutableInt createMutableContainer()
(sequential implementation only)
method: void accumulate(MutableInt container, Integer item)
(sequential implementation only)
method: MutableInt combine(MutableInt containerA, MutableInt containerB)
(sequential implementation only)
method: Integer reduce(MutableInt container)
(sequential implementation only)
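The running-total idea can be sketched as follows. To keep the example self-contained, IntBox is a hypothetical stand-in for MutableInt, and EfficientIntSumSketch is a hypothetical class that does not implement the course interface. Here combine returns the merged container, as the description of combine in the interface section asks, and reduce is assumed to return an Integer.

```java
// Hypothetical stand-in for MutableInt: a single mutable int starting at zero.
final class IntBox {
    int value;
}

// Hypothetical sketch of the efficient approach; not the course's class.
class EfficientIntSumSketch {
    // The container is one running total rather than a list of values.
    IntBox createMutableContainer() {
        return new IntBox();
    }

    // Add each value as it arrives instead of storing it.
    void accumulate(IntBox container, Integer item) {
        container.value += item;
    }

    // Fold B's total into A and return the merged container.
    IntBox combine(IntBox containerA, IntBox containerB) {
        containerA.value += containerB.value;
        return containerA;
    }

    // The sum is already computed, so reduce only unwraps it.
    Integer reduce(IntBox container) {
        return container.value;
    }
}
```

Compared with the list-based version, memory use per key stays constant no matter how many values are emitted.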
Testing Your Solution
Note: in order to effectively assess this assignment, the testing leverages the WordCountMapper. Be sure to correctly complete that assignment first.
class: | _AccumulatorCombinerReducerTestSuite.java | |
package: | mapreduce.apps | |
source folder: | testing/src/test/java |