Collector Rosetta Stone
The interface Collector<T,A,R> serves the standard Java streams framework for MapReduce-like tasks with added in-memory processing capability a la Apache Spark.
You will use interface AccumulatorCombinerReducer<V,A,R> for our MapReduce assignments which is almost a one-to-one match with interface Collector<T,A,R> but de-ultra-uber-hyper-mega-super-lambdafied.
Contents
CSE 231 Selection: Reducer
public interface Reducer<V, A, R> { A createMutableContainer(); void accumulate(A container, V item); A combine(A containerA, A containerB); R reduce(A container); }
Java Streams Collector
public interface Collector<T, A, R> { // invoke supplier().get() to create a new mutable container Supplier<A> supplier(); // invoke accumulator().accept(container, item) to add item to a container BiConsumer<A, T> accumulator(); // invoke combiner().apply(containerA, containerB) to combine one container into the other BinaryOperator<A> combiner(); // invoke finisher().apply(container) to reduce a container to its final form Function<A, R> finisher(); }
Rosetta Stone
public static <V, A, R> Collector<V, A, R> toCollector(Reducer<V, A, R> reducer) { return new Collector<V, A, R>() { @Override public Supplier<A> supplier() { return () -> reducer.createMutableContainer(); } @Override public BiConsumer<A, V> accumulator() { return (container, item) -> reducer.accumulate(container, item); } @Override public BinaryOperator<A> combiner() { return (a, b) -> reducer.combine(a, b); } @Override public Function<A, R> finisher() { return (container) -> reducer.reduce(container); } @Override public Set<Characteristics> characteristics() { return reducer.collectorCharacteristics(); } }; } public static <V, A, R> Reducer<V, A, R> toReducer(Collector<V, A, R> collector) { return new Reducer<V, A, R>() { @Override public A createMutableContainer() { return collector.supplier().get(); } @Override public void accumulate(A container, V item) { collector.accumulator().accept(container, item); } @Override public A combine(A containerA, A containerB) { return collector.combiner().apply(containerA, containerB); } @Override public R reduce(A container) { return collector.finisher().apply(container); } @Override public Set<Characteristics> collectorCharacteristics() { return collector.characteristics(); } }; }
methods
createMutableContainer a.k.a. supplier get
We use createMutableContainer() to create a new mutable container. For classic map reduce this would be a List<V>.
rosetta stone: container = collector.supplier().get()
container = reducer.createMutableContainer()
accumulate a.k.a. accumulator accept
We use accumulate(container,item) to accumulate a value. For classic map reduce this would add an item to a list.
rosetta stone: collector.accumulator().accept(container,item);
reducer.accumulate(container,item)
combine a.k.a. combiner apply
We use combine(containerA,containerB) to combine two accumulators. You may combine containerB into containerA or containerA into containerB. Just return whichever is the combined result.
rosetta stone: collector.combiner().apply(containerA,containerB)
reducer.combine(containerA,containerB)
reduce a.k.a. finisher apply
We use reduce(container) to reduce an accumulator.
rosetta stone: collector.finisher().apply(container)
r = reducer.reduce(container)