MapReduce Reducer Assignment
Latest revision as of 21:16, 1 April 2024
credit for this assignment: Finn Voichick and Dennis Cosgrove
Motivation
interface AccumulatorCombinerReducer<V,A,R> is fundamental to the MapReduce assignments. It provides four of the five methods that clients need in order to make good use of interface MapReduceFramework<E,K,V,A,R>.
First, we will implement abstract class ClassicReducer<V,R>, which provides the basic MapReduce functionality used by the PageRank algorithm that made Google's original search engine so effective.
Then, we will implement two AccumulatorCombinerReducers which can serve to sum mutable containers of Integers. Thus, they will pair well with the previous Mappers exercise.
- class SummingIntClassicReducer will extend ClassicReducer.
- class SummingIntEfficientAccumulatorCombinerReducer will use a far more efficient mutable container to get the job done.
Background
The Java streams framework uses interface Collector<T,A,R> to allow clients to customize the four methods to create, accumulate, combine, and reduce. interface Collector<T,A,R> chose to add an extra level of indirection to allow a hyper-lambdafication style. While there is nothing fundamentally wrong with this approach, it was eventually deemed more trouble than it was worth for CSE 231s. To learn more about Collector and its connection to AccumulatorCombinerReducer, check out this Rosetta Stone.
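As a rough illustration of the four Collector roles mentioned above, the following sketch builds a Collector with Collector.of that concatenates Strings into a StringBuilder. The class name CollectorSketch and the methods concatenating and concat are made up for this example and are not part of the course code.

```java
import java.util.List;
import java.util.stream.Collector;

public class CollectorSketch {
    // Hypothetical example: a Collector whose mutable container is a StringBuilder.
    static Collector<String, StringBuilder, String> concatenating() {
        return Collector.of(
                StringBuilder::new,        // supplier: create the mutable container
                (sb, s) -> sb.append(s),   // accumulator: add one item to the container
                (a, b) -> a.append(b),     // combiner: merge container b into container a
                StringBuilder::toString);  // finisher: reduce the container to the result
    }

    static String concat(List<String> words) {
        return words.stream().collect(concatenating());
    }

    public static void main(String[] args) {
        System.out.println(concat(List.of("map", "reduce")));
    }
}
```

Note that the Collector hands back functions (a Supplier, a BiConsumer, a BinaryOperator, and a Function) rather than exposing the four operations as methods directly. That is the extra level of indirection described above.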
interface AccumulatorCombinerReducer<V,A,R>
public interface AccumulatorCombinerReducer<V, A, R> {
    A createMutableContainer();
    void accumulate(A container, V item);
    void combine(A containerA, A containerB);
    R reduce(A container);
}
createMutableContainer()
We use createMutableContainer() to create a new mutable container. For classic MapReduce this would be a List<V>.
accumulate(container,item)
We use accumulate(container,item) to accumulate a value into the container. For classic MapReduce this would add an item to a list.
combine(containerA,containerB)
We use combine(containerA,containerB) to combine two accumulated mutable containers. You must combine containerB into containerA.
reduce(container)
We use reduce(container) to produce the distilled value of the container.
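To see how the four methods fit together, here is a minimal sketch that is deliberately unrelated to the exercises: a reducer that counts distinct Strings using a Set as its mutable container. DistinctCountReducer and the run helper are hypothetical names invented for this illustration. The run method only approximates what a framework might do and is not the actual MapReduceFramework.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistinctCountSketch {
    // Copied from the interface definition on this page so the sketch is self-contained.
    interface AccumulatorCombinerReducer<V, A, R> {
        A createMutableContainer();
        void accumulate(A container, V item);
        void combine(A containerA, A containerB);
        R reduce(A container);
    }

    // Hypothetical reducer: the container is a Set, the result is its size.
    static class DistinctCountReducer
            implements AccumulatorCombinerReducer<String, Set<String>, Integer> {
        @Override
        public Set<String> createMutableContainer() {
            return new HashSet<>();
        }

        @Override
        public void accumulate(Set<String> container, String item) {
            container.add(item);
        }

        @Override
        public void combine(Set<String> containerA, Set<String> containerB) {
            containerA.addAll(containerB); // B is merged into A; B is left untouched
        }

        @Override
        public Integer reduce(Set<String> container) {
            return container.size();
        }
    }

    // Assumed driver: accumulate two halves separately, combine them, then reduce.
    static <V, A, R> R run(AccumulatorCombinerReducer<V, A, R> acr,
                           List<V> firstHalf, List<V> secondHalf) {
        A a = acr.createMutableContainer();
        for (V v : firstHalf) {
            acr.accumulate(a, v);
        }
        A b = acr.createMutableContainer();
        for (V v : secondHalf) {
            acr.accumulate(b, v);
        }
        acr.combine(a, b);
        return acr.reduce(a);
    }

    public static void main(String[] args) {
        System.out.println(run(new DistinctCountReducer(),
                List.of("a", "b", "a"), List.of("b", "c")));
    }
}
```

The driver keeps two containers on purpose: in a parallel setting each task could accumulate into its own container, and combine would only be needed when the partial results meet.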
Code To Use
- interface List
- class ArrayList
- class LinkedList
- class MutableInt
Code To Implement
ClassicReducer
The classic MapReduce approach always uses a List of Vs as its mutable container. This abstract class will prove useful in this exercise and in future ones.
class: | ClassicReducer.java | |
methods: | createMutableContainer accumulate combine | |
package: | mapreduce.apps.reducer.classic.exercise | |
source folder: | student/src/main/java |
createMutableContainer
method: List<V> createMutableContainer()
(sequential implementation only)
Construct a new instance of a List. ArrayList and LinkedList are two classes that each implement the List interface.
accumulate
method: void accumulate(List<V> container, V item)
(sequential implementation only)
Mutate the specified container by adding the specified item.
combine
method: void combine(List<V> containerA, List<V> containerB)
(sequential implementation only)
Mutate the specified containerA by adding all of the contents of the specified containerB. Do NOT mutate containerB.
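The List methods referred to above can be tried out in isolation. The sketch below uses plain java.util lists, outside of any reducer class, to show that add grows a list by one item and that addAll copies the contents of its argument without changing that argument. The class name ListMutationSketch and the method demo are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListMutationSketch {
    // Returns the two lists after the mutations so they can be inspected.
    static List<List<String>> demo() {
        List<String> a = new ArrayList<>();  // one List implementation
        List<String> b = new LinkedList<>(); // another List implementation

        a.add("x");
        b.add("y");
        b.add("z");

        a.addAll(b); // appends the contents of b to a; b itself is not modified

        return List.of(a, b);
    }

    public static void main(String[] args) {
        List<List<String>> lists = demo();
        System.out.println("a = " + lists.get(0));
        System.out.println("b = " + lists.get(1));
    }
}
```

Because both variables are declared as List, either implementation can be passed wherever a List is expected.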
Classic Summing Int
class: | SummingIntClassicReducer.java | |
methods: | reduce | |
package: | mapreduce.apps.reducer.summingint.classic.exercise | |
source folder: | student/src/main/java |
method: public Integer reduce(List<Integer> container)
(sequential implementation only)
The reduce method is passed a list of integers which it should simply sum up and return.
Efficient Summing Int
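MapReduce apps like Word Count offer a glaring opportunity for optimization: the classic MapReduce approach appends all of the 1s to a List and adds them up later. SummingIntEfficientAccumulatorCombinerReducer will instead use MutableInt to simply add the values as they come in.
class: | SummingIntEfficientAccumulatorCombinerReducer.java | |
methods: | createMutableContainer accumulate combine reduce | |
package: | mapreduce.apps.reducer.summingint.efficient.exercise | |
source folder: | student/src/main/java |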
method: MutableInt createMutableContainer()
(sequential implementation only)
method: void accumulate(MutableInt container, Integer item)
(sequential implementation only)
method: void combine(MutableInt containerA, MutableInt containerB)
(sequential implementation only)
method: reduce(MutableInt container)
(sequential implementation only)
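The motivation for the efficient version can be sketched without any framework code. The example below contrasts storing every 1 in a List and summing later with keeping a single running total. Counter is a hypothetical stand-in written here so the sketch is self-contained. It is not the Apache Commons class, although MutableInt offers comparable add and intValue methods.

```java
import java.util.ArrayList;
import java.util.List;

public class RunningTotalSketch {
    // Hypothetical stand-in for a mutable int holder such as MutableInt.
    static final class Counter {
        private int value;

        void add(int amount) {
            value += amount;
        }

        int intValue() {
            return value;
        }
    }

    // Classic style: store one Integer per occurrence, then sum at the end.
    static int sumViaList(int occurrences) {
        List<Integer> ones = new ArrayList<>();
        for (int i = 0; i < occurrences; i++) {
            ones.add(1);
        }
        int sum = 0;
        for (int one : ones) {
            sum += one;
        }
        return sum;
    }

    // Efficient style: keep only a running total, no per-item storage.
    static int sumViaCounter(int occurrences) {
        Counter total = new Counter();
        for (int i = 0; i < occurrences; i++) {
            total.add(1);
        }
        return total.intValue();
    }

    public static void main(String[] args) {
        System.out.println(sumViaList(1000) + " " + sumViaCounter(1000));
    }
}
```

Both methods return the same total, but the list version needs storage proportional to the number of occurrences, while the counter version needs only a single int.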
Testing Your Solution
Note: in order to effectively assess this assignment, the testing leverages the WordCountMapper. Be sure to correctly complete that assignment first.
class: | _AccumulatorCombinerReducerTestSuite.java | |
package: | mapreduce.apps | |
source folder: | testing/src/test/java |