Difference between revisions of "MapReduce Reducer Assignment"

From CSE231 Wiki
Jump to navigation Jump to search
Line 1: Line 1:
 
credit for this assignment: Finn Voichick and Dennis Cosgrove
 
credit for this assignment: Finn Voichick and Dennis Cosgrove
 
=Motivation=
 
=Motivation=
<code>interface Reducer<T,A,R></code> is fundamental to the MapReduce Frameworks lab.  Your frameworks will be general enough such that MapReduce is just a subset of what you will support.
+
{{AccumulatorCombinerReducerLink}} is fundamental to the MapReduce assignments, providing four of the five methods which allow clients to well use {{MapReduceFrameworkLink}}.
  
In this studio we will build a ListAccumulatingReducer which will implement the MapReduce style of accumulating all of the emitted values per key in a List.
+
First, we will implement {{ClassicReducerLink}} which will provide the basic MapReduce functionality used by the [https://en.wikipedia.org/wiki/PageRank PageRank] algorithm which made [https://www.google.com/ Google]'s original search engine so effective.
  
We will also build a custom IntSumEfficientReducer to demonstrate the desired flexibility of the Reducer interface.
+
Then, we will implement two AccumulatorCombinerReducers which can serve to sum mutable containers of Integers.  {{SummingIntClassicReducerLink}} will extend ClassicReducer.  {{SummingIntEfficientAccumulatorCombinerReducerLink}} will use a far more efficient mutable container to get the job done.
  
 
=Background=
 
=Background=
We have chosen to use a non-lambdafied Reducer<V,A,R> interface versus adopting the standard [https://docs.oracle.com/javase/8/docs/api/java/util/stream/Collector.html interface Collector<T,A,R>] from the standard [http://www.oracle.com/technetwork/articles/java/ma14-java-se-8-streams-2177646.html Java streams framework] as the basis for the MapReduce Frameworks Lab. 
+
The [http://www.oracle.com/technetwork/articles/java/ma14-java-se-8-streams-2177646.html Java streams framework] uses {{CollectorLink}} to allow clients to customize the four methods to create, accumulate, combine, and reduce. {{CollectorLink}} chose to add an extra level of indirection to allow a hyper-lambdafication style. While there is nothing fundamentally wrong with this approach, it was eventually deemed more trouble than it was worth for CSE 231s. To learn more about Collector and its connection to AccumulatorCombinerReducer, check out this [[#Collector_Rosetta_Stone|Rosetta Stone]].
 
 
== CSE 231 Selection: Reducer ==
 
<nowiki>public interface Reducer<V, A, R> {
 
A createMutableContainer();
 
void accumulate(A container, V item);
 
A combine(A containerA, A containerB);
 
R reduce(A container);
 
}</nowiki>
 
 
 
== Java Streams Collector ==
 
  <nowiki>public interface Collector<T, A, R> {
 
 
 
// invoke supplier().get() to create a new mutable container
 
Supplier<A> supplier();
 
 
 
// invoke accumulator().accept(container, item) to add item to a container
 
BiConsumer<A, T> accumulator();
 
 
 
// invoke combiner().apply(containerA, containerB) to combine one container into the other
 
BinaryOperator<A> combiner();
 
 
 
// invoke finisher().apply(container) to reduce a container to its final form
 
Function<A, R> finisher();
 
}</nowiki>
 
 
 
[https://docs.oracle.com/javase/8/docs/api/java/util/stream/Collector.html interface Collector<T,A,R>]
 
: [https://docs.oracle.com/javase/8/docs/api/java/util/function/Supplier.html interface Supplier<T>]
 
: [https://docs.oracle.com/javase/8/docs/api/java/util/function/BiConsumer.html interface BiConsumer<T,U>]
 
: [https://docs.oracle.com/javase/8/docs/api/java/util/function/BinaryOperator.html interface BinaryOperator<T>]
 
: [https://docs.oracle.com/javase/8/docs/api/java/util/function/Function.html interface Function<T,R>]
 
 
 
== Rosetta Stone ==
 
 
 
<nowiki> public static <V, A, R> Collector<V, A, R> toCollector(Reducer<V, A, R> reducer) {
 
return new Collector<V, A, R>() {
 
@Override
 
public Supplier<A> supplier() {
 
return () -> reducer.createMutableContainer();
 
}
 
 
 
@Override
 
public BiConsumer<A, V> accumulator() {
 
return (container, item) -> reducer.accumulate(container, item);
 
}
 
 
 
@Override
 
public BinaryOperator<A> combiner() {
 
return (a, b) -> reducer.combine(a, b);
 
}
 
 
 
@Override
 
public Function<A, R> finisher() {
 
return (container) -> reducer.reduce(container);
 
}
 
 
 
@Override
 
public Set<Characteristics> characteristics() {
 
return reducer.collectorCharacteristics();
 
}
 
};
 
}
 
 
 
public static <V, A, R> Reducer<V, A, R> toReducer(Collector<V, A, R> collector) {
 
return new Reducer<V, A, R>() {
 
@Override
 
public A createMutableContainer() {
 
return collector.supplier().get();
 
}
 
 
 
@Override
 
public void accumulate(A container, V item) {
 
collector.accumulator().accept(container, item);
 
}
 
 
 
@Override
 
public A combine(A containerA, A containerB) {
 
return collector.combiner().apply(containerA, containerB);
 
}
 
 
 
@Override
 
public R reduce(A container) {
 
return collector.finisher().apply(container);
 
}
 
 
 
@Override
 
public Set<Characteristics> collectorCharacteristics() {
 
return collector.characteristics();
 
}
 
};
 
}
 
</nowiki>
 
  
 +
={{AccumulatorCombinerReducerLink}}=
 
==methods==
 
==methods==
 
=== createMutableContainer a.k.a. supplier get===
 
=== createMutableContainer a.k.a. supplier get===
Line 153: Line 63:
 
{{Sequential|void accumulate(MutableInt container, Integer item)}}
 
{{Sequential|void accumulate(MutableInt container, Integer item)}}
  
{{Sequential|MutableInt combine(MutableInt containerA, MutableInt containerB)}}
+
{{Sequential|void combine(MutableInt containerA, MutableInt containerB)}}
  
 
{{Sequential|reduce(MutableInt container)}}
 
{{Sequential|reduce(MutableInt container)}}
  
 
=Testing Your Solution=
 
=Testing Your Solution=
==Correctness==
+
'''Note:''' in order to effectively assess this assignment, the testing leverages the WordCountMapper. Be sure to correctly complete that assignment first.}}
===top level===
 
{{TestSuite|_ReducerExerciseTestSuite|mapreduce}}
 
===sub===
 
{{TestSuite|_ListAccumulatingReducerTestSuite|mapreduce}}
 
 
 
{{TestSuite|_IntSumReduceOnlyTestSuite|mapreduce.apps.reducer.listaccumulating.intsum.exercise}}
 
  
{{TestSuite|_EfficientIntSumTestSuite|mapreduce}}
+
{{TestSuite|_AccumulatorCombinerReducerTestSuite|mapreduce.apps}}

Revision as of 16:58, 23 February 2023

credit for this assignment: Finn Voichick and Dennis Cosgrove

Motivation

interface AccumulatorCombinerReducer<V,A,R> is fundamental to the MapReduce assignments, providing four of the five methods which allow clients to well use interface MapReduceFramework<E,K,V,A,R>.

First, we will implement abstract class ClassicReducer<V,R> which will provide the basic MapReduce functionality used by the PageRank algorithm which made Google's original search engine so effective.

Then, we will implement two AccumulatorCombinerReducers which can serve to sum mutable containers of Integers. class SummingIntClassicReducer will extend ClassicReducer. class SummingIntEfficientAccumulatorCombinerReducer will use a far more efficient mutable container to get the job done.

Background

The Java streams framework uses interface Collector<T,A,R> to allow clients to customize the four methods to create, accumulate, combine, and reduce. interface Collector<T,A,R> chose to add an extra level of indirection to allow a hyper-lambdafication style. While there is nothing fundamentally wrong with this approach, it was eventually deemed more trouble than it was worth for CSE 231s. To learn more about Collector and its connection to AccumulatorCombinerReducer, check out this Rosetta Stone.

interface AccumulatorCombinerReducer<V,A,R>

methods

createMutableContainer a.k.a. supplier get

We use createMutableContainer() to create a new mutable container. For classic map reduce this would be a List<V>.

rosetta stone: container = collector.supplier().get() container = reducer.createMutableContainer()

accumulate a.k.a. accumulator accept

We use accumulate(container,item) to accumulate a value. For classic map reduce this would add an item to a list.

rosetta stone: collector.accumulator().accept(container,item); reducer.accumulate(container,item)

combine a.k.a. combiner apply

We use combine(containerA,containerB) to combine two accumulators. You may combine containerB into containerA or containerA into containerB. Just return whichever is the combined result.

rosetta stone: collector.combiner().apply(containerA,containerB) reducer.combine(containerA,containerB)

reduce a.k.a. finisher apply

We use reduce(container) to reduce an accumulator.

rosetta stone: collector.finisher().apply(container) r = reducer.reduce(container)

Code To Use

class MutableInt

Code To Implement

ListAccumulatingReducer

The classic MapReduce Collector will collect all of the emitted values in a List.

class: ListAccumulatingReducer.java Java.png
methods: createMutableContainer
accumulate
combine
package: mapreduce.apps.reducer.listaccumulating.exercise
source folder: student/src/main/java

method: List<V> createMutableContainer() Sequential.svg (sequential implementation only)

method: void accumulate(List<V> container, V item) Sequential.svg (sequential implementation only)

method: List<V> combine(List<V> containerA, List<V> containerB) Sequential.svg (sequential implementation only)

IntSumListAccumulatingReducer

class: IntSumListAccumulatingReducer.java Java.png
methods: reduce
package: mapreduce.apps.reducer.listaccumulating.intsum.exercise
source folder: student/src/main/java

method: public Integer reduce(List<Integer> container) Sequential.svg (sequential implementation only)

The reduce method is passed a list of integers which it should simply sum up and return.

IntSumEfficientReducer

MapReduce Apps like Word Count offer glaring opportunities to optimize the classic MapReduce append all of the 1s in a List and add them up later. In this section of the studio you will use MutableInt to simply add the values as they come in.

class: IntSumEfficientReducer.java Java.png
methods: createMutableContainer
accumulate
accumulate
reduce
package: mapreduce.apps.reducer.efficient.intsum.exercise
source folder: student/src/main/java

method: MutableInt createMutableContainer() Sequential.svg (sequential implementation only)

method: void accumulate(MutableInt container, Integer item) Sequential.svg (sequential implementation only)

method: void combine(MutableInt containerA, MutableInt containerB) Sequential.svg (sequential implementation only)

method: reduce(MutableInt container) Sequential.svg (sequential implementation only)

Testing Your Solution

Note: in order to effectively assess this assignment, the testing leverages the WordCountMapper. Be sure to correctly complete that assignment first.}}

class: _AccumulatorCombinerReducerTestSuite.java Junit.png
package: mapreduce.apps
source folder: testing/src/test/java