Collector Rosetta Stone

From CSE231 Wiki

Revision as of 15:52, 23 February 2023

The Collector<T,A,R> interface is what the standard Java streams framework uses for MapReduce-like tasks, with added in-memory processing capability à la Apache Spark.

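You will use the AccumulatorCombinerReducer<V,A,R> interface for our MapReduce assignments. It is almost a one-to-one match with Collector<T,A,R>, but de-ultra-uber-hyper-mega-super-lambdafied: where Collector hands back a function object for each step, AccumulatorCombinerReducer has you implement the step directly as a method.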

CSE 231s: AccumulatorCombinerReducer<V, A, R>

public interface AccumulatorCombinerReducer<V, A, R> {
	A createMutableContainer();

	void accumulate(A container, V item);

	void combine(A containerA, A containerB);

	R reduce(A container);

	default Set<Characteristics> collectorCharacteristics() {
		return EnumSet.noneOf(Characteristics.class);
	}
}

Java Streams: Collector<T, A, R>

public interface Collector<T, A, R> {
    /**
     * A function that creates and returns a new mutable result container.
     *
     * @return a function which returns a new, mutable result container
     */
    Supplier<A> supplier();

    /**
     * A function that folds a value into a mutable result container.
     *
     * @return a function which folds a value into a mutable result container
     */
    BiConsumer<A, T> accumulator();

    /**
     * A function that accepts two partial results and merges them.  The
     * combiner function may fold state from one argument into the other and
     * return that, or may return a new result container.
     *
     * @return a function which combines two partial results into a combined
     * result
     */
    BinaryOperator<A> combiner();

    /**
     * Perform the final transformation from the intermediate accumulation type
     * {@code A} to the final result type {@code R}.
     *
     * <p>If the characteristic {@code IDENTITY_FINISH} is
     * set, this function may be presumed to be an identity transform with an
     * unchecked cast from {@code A} to {@code R}.
     *
     * @return a function which transforms the intermediate result to the final
     * result
     */
    Function<A, R> finisher();

    /**
     * Returns a {@code Set} of {@code Collector.Characteristics} indicating
     * the characteristics of this Collector.  This set should be immutable.
     *
     * @return an immutable set of collector characteristics
     */
    Set<Characteristics> characteristics();
}

Java Streams Collector

public interface Collector<T, A, R> {

	// invoke supplier().get() to create a new mutable container
	Supplier<A> supplier();

	// invoke accumulator().accept(container, item) to add item to a container
	BiConsumer<A, T> accumulator();

	// invoke combiner().apply(containerA, containerB) to merge two containers; the returned container holds the combined result
	BinaryOperator<A> combiner();

	// invoke finisher().apply(container) to reduce a container to its final form
	Function<A, R> finisher();
}
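
To make the four "invoke" comments above concrete, here is a minimal sketch that drives a Collector by hand. The class and method names (ManualCollectDemo, collectInTwoHalves) are hypothetical and not part of the course code; splitting the input into two halves only loosely mimics what a parallel stream might do.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ManualCollectDemo {
    // Fills two containers separately, then merges and finishes them.
    static <T, A, R> R collectInTwoHalves(Collector<T, A, R> collector, List<? extends T> items) {
        int mid = items.size() / 2;

        // supplier().get() creates a fresh mutable container
        A left = collector.supplier().get();
        for (T item : items.subList(0, mid)) {
            // accumulator().accept(container, item) folds one item in
            collector.accumulator().accept(left, item);
        }

        A right = collector.supplier().get();
        for (T item : items.subList(mid, items.size())) {
            collector.accumulator().accept(right, item);
        }

        // Keep the returned container: the combiner may return either argument or a new one.
        A merged = collector.combiner().apply(left, right);

        // finisher().apply(container) converts A into the final R
        return collector.finisher().apply(merged);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(collectInTwoHalves(Collectors.joining(","), words)); // prints a,b,c,d,e
    }
}
```

The same helper works with any Collector, for example Collectors.toList(), because it only relies on the four functions the interface exposes.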

interface Collector<T,A,R>

interface Supplier<T>
interface BiConsumer<T,U>
interface BinaryOperator<T>
interface Function<T,R>

Rosetta Stone

	public static <V, A, R> Collector<V, A, R> toCollector(AccumulatorCombinerReducer<V, A, R> reducer) {
		return new Collector<V, A, R>() {
			@Override
			public Supplier<A> supplier() {
				return () -> reducer.createMutableContainer();
			}

			@Override
			public BiConsumer<A, V> accumulator() {
				return (container, item) -> reducer.accumulate(container, item);
			}

			@Override
			public BinaryOperator<A> combiner() {
				return (a, b) -> {
					// combine returns void; this assumes it leaves the merged result in a
					reducer.combine(a, b);
					return a;
				};
			}

			@Override
			public Function<A, R> finisher() {
				return (container) -> reducer.reduce(container);
			}

			@Override
			public Set<Characteristics> characteristics() {
				return reducer.collectorCharacteristics();
			}
		};
	}

	public static <V, A, R> AccumulatorCombinerReducer<V, A, R> toReducer(Collector<V, A, R> collector) {
		return new AccumulatorCombinerReducer<V, A, R>() {
			@Override
			public A createMutableContainer() {
				return collector.supplier().get();
			}

			@Override
			public void accumulate(A container, V item) {
				collector.accumulator().accept(container, item);
			}

			@Override
			public void combine(A containerA, A containerB) {
				// combine returns void, so this is only faithful for collectors whose
				// combiner folds containerB into containerA; a new container would be lost
				collector.combiner().apply(containerA, containerB);
			}

			@Override
			public R reduce(A container) {
				return collector.finisher().apply(container);
			}

			@Override
			public Set<Characteristics> collectorCharacteristics() {
				return collector.characteristics();
			}
		};
	}
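
As a usage sketch for the translation above, the following self-contained example adapts a small reducer to a Collector and hands it to a stream. Everything here is illustrative: the interface is a local stand-in mirroring the one on this page, TotalLengthReducer and RosettaDemo are hypothetical names, the adapter is written with Collector.of rather than an anonymous class, and combine is assumed to fold containerB into containerA.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

// Local stand-in mirroring the interface shown on this page
// (collectorCharacteristics() is left out to keep the sketch short).
interface AccumulatorCombinerReducer<V, A, R> {
    A createMutableContainer();
    void accumulate(A container, V item);
    void combine(A containerA, A containerB);
    R reduce(A container);
}

// Hypothetical reducer: totals the lengths of the words it is given.
class TotalLengthReducer implements AccumulatorCombinerReducer<String, int[], Integer> {
    @Override
    public int[] createMutableContainer() {
        return new int[1]; // one-element array used as a mutable running total
    }

    @Override
    public void accumulate(int[] container, String item) {
        container[0] += item.length();
    }

    @Override
    public void combine(int[] containerA, int[] containerB) {
        containerA[0] += containerB[0]; // assumption: merged result lives in containerA
    }

    @Override
    public Integer reduce(int[] container) {
        return container[0];
    }
}

class RosettaDemo {
    static <V, A, R> Collector<V, A, R> toCollector(AccumulatorCombinerReducer<V, A, R> reducer) {
        return Collector.of(
                reducer::createMutableContainer,
                reducer::accumulate,
                (a, b) -> {
                    reducer.combine(a, b); // void, so return a explicitly
                    return a;
                },
                reducer::reduce);
    }

    static int totalLength(List<String> words) {
        return words.stream().collect(toCollector(new TotalLengthReducer()));
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("map", "reduce", "stream"))); // prints 15
    }
}
```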

methods

createMutableContainer a.k.a. supplier get

We use createMutableContainer() to create a new, empty mutable container. For classic MapReduce this would be a List<V>.

rosetta stone:
	Collector: container = collector.supplier().get()
	AccumulatorCombinerReducer: container = reducer.createMutableContainer()

accumulate a.k.a. accumulator accept

We use accumulate(container, item) to fold a single item into a container. For classic MapReduce this would add the item to a list.

rosetta stone:
	Collector: collector.accumulator().accept(container, item)
	AccumulatorCombinerReducer: reducer.accumulate(container, item)
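
The classic MapReduce reading of createMutableContainer and accumulate can be sketched as follows: the container is a plain list, accumulate only appends, and all real work waits until reduce. The interface is a local stand-in mirroring the one on this page, and ClassicSumReducer and ClassicSumDemo are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Local stand-in mirroring the interface shown on this page
// (collectorCharacteristics() is left out to keep the sketch short).
interface AccumulatorCombinerReducer<V, A, R> {
    A createMutableContainer();
    void accumulate(A container, V item);
    void combine(A containerA, A containerB);
    R reduce(A container);
}

// Hypothetical classic-style reducer: gathers every value, sums only in reduce.
class ClassicSumReducer implements AccumulatorCombinerReducer<Integer, List<Integer>, Integer> {
    @Override
    public List<Integer> createMutableContainer() {
        return new ArrayList<>();
    }

    @Override
    public void accumulate(List<Integer> container, Integer item) {
        container.add(item); // classic MapReduce: just remember the item
    }

    @Override
    public void combine(List<Integer> containerA, List<Integer> containerB) {
        containerA.addAll(containerB); // assumption: merged result lives in containerA
    }

    @Override
    public Integer reduce(List<Integer> container) {
        int sum = 0;
        for (int value : container) {
            sum += value;
        }
        return sum;
    }
}

class ClassicSumDemo {
    static int sumOf(List<Integer> values) {
        ClassicSumReducer reducer = new ClassicSumReducer();
        List<Integer> container = reducer.createMutableContainer();
        for (Integer value : values) {
            reducer.accumulate(container, value);
        }
        return reducer.reduce(container);
    }

    public static void main(String[] args) {
        System.out.println(sumOf(Arrays.asList(1, 2, 3, 4))); // prints 10
    }
}
```

A container that keeps a running total instead of a list would avoid storing every item; the list version is shown only because it matches the classic MapReduce picture.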

combine a.k.a. combiner apply

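We use combine(containerA, containerB) to merge two containers. Collector's combiner may fold containerB into containerA or containerA into containerB, and it returns whichever container holds the combined result. AccumulatorCombinerReducer's combine returns void, so the combined result must be left in one of the containers passed in.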

rosetta stone:
	Collector: collector.combiner().apply(containerA, containerB)
	AccumulatorCombinerReducer: reducer.combine(containerA, containerB)
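
On the Collector side, the freedom to merge in either direction can be sketched like this: the combiner folds the smaller set into the larger one and returns the larger. The names CombineEitherWayDemo and toSetMergingIntoLarger are hypothetical. A set is used deliberately, since merging in either direction would scramble the order of a list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collector;

class CombineEitherWayDemo {
    // Collects distinct items; the combiner picks the merge direction by size.
    static <T> Collector<T, Set<T>, Set<T>> toSetMergingIntoLarger() {
        return Collector.<T, Set<T>>of(
                () -> new HashSet<>(),
                (set, item) -> set.add(item),
                (a, b) -> {
                    if (a.size() >= b.size()) {
                        a.addAll(b); // fold b into a
                        return a;
                    }
                    b.addAll(a); // fold a into b
                    return b;
                },
                Collector.Characteristics.UNORDERED);
    }

    static Set<Integer> distinct(List<Integer> values) {
        // A parallel stream may build several containers and call the combiner.
        return values.parallelStream().collect(toSetMergingIntoLarger());
    }

    public static void main(String[] args) {
        System.out.println(distinct(Arrays.asList(3, 1, 3, 2, 1, 2, 3)));
    }
}
```

Whichever container the combiner returns is the one the stream keeps using, which is why callers must never ignore the return value.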

reduce a.k.a. finisher apply

We use reduce(container) to turn a fully accumulated container of type A into the final result of type R.

rosetta stone:
	Collector: r = collector.finisher().apply(container)
	AccumulatorCombinerReducer: r = reducer.reduce(container)
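
The finishing step matters most when the container type A differs from the result type R. As a minimal sketch, assuming a hypothetical averaging collector (AverageFinisherDemo and averaging are illustrative names), the container is a two-slot long[] holding sum and count, and the finisher converts it to a Double:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class AverageFinisherDemo {
    // Container: long[] {sum, count}. Result: the average as a Double.
    static Collector<Integer, long[], Double> averaging() {
        return Collector.of(
                () -> new long[2],
                (acc, value) -> {
                    acc[0] += value; // running sum
                    acc[1]++;        // running count
                },
                (a, b) -> {
                    a[0] += b[0];
                    a[1] += b[1];
                    return a;
                },
                // finisher: the only place where A (long[]) becomes R (Double)
                acc -> acc[1] == 0 ? 0.0 : (double) acc[0] / acc[1]);
    }

    static double average(List<Integer> values) {
        return values.stream().collect(averaging());
    }

    public static void main(String[] args) {
        System.out.println(average(Arrays.asList(1, 2, 3, 4))); // prints 2.5
    }
}
```

Returning 0.0 for an empty input is a choice made for this sketch only; a real implementation might prefer an OptionalDouble or an exception.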