Difference between revisions of "MapReduce Mapper Assignment"

From CSE231 Wiki

Revision as of 17:35, 20 February 2018

Motivation

In previous semesters, the MapReduce lab has proven to be the most challenging. We have split things up so that you can first get familiar with how Mappers and Reducers work. We will build a card mapper that matches the spec outlined in the prep video, a simple word counting mapper, an analogous k-mer counting mapper, and an integer sum reducer to wrap things up.

The k-mer counting mapper will prepare us for (and hopefully lessen the burden of) the upcoming Lab 6.

Code To Use

BiConsumer

accept(t,u) (note: closest relative to "emit" from RiceX prep)

Collector

finisher() (note: closest relative to "reduce" from RiceX prep)

Deck (note: Iterable<Card>)

Card

getRank()
getSuit()

Rank

isNumeric()
getNumericValue()

Suit

TextSection

getWords()

Code To Implement

Card Mapper

The specification for this mapper is outlined in the prep video:

Non-numeric cards are considered to be bad data and ignored. Numeric cards should be emitted with their suit as the key and the numeric value as the value.

class: CardMapper.java
methods: map
package: mapreduce.apps.intsum.cards.studio
source folder: student/src/main/java

method: public void map(Deck deck, BiConsumer<Suit, Integer> keyValuePairConsumer) (sequential implementation only)
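
The spec above can be sketched roughly as follows. This is only an illustration under stated assumptions: Suit, Rank, and Card here are hypothetical stand-ins that mirror the method names listed under Code To Use, a plain Iterable<Card> stands in for Deck, and which ranks count as numeric is assumed for the demo. The real course classes may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;

class CardMapperSketch {
    // Hypothetical stand-ins for the course's Suit, Rank, and Card classes.
    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

    enum Rank {
        TWO(2), TEN(10), JACK(0), QUEEN(0); // trimmed; 0 marks a non-numeric rank (assumption)

        private final int numericValue;

        Rank(int numericValue) { this.numericValue = numericValue; }

        boolean isNumeric() { return numericValue > 0; }

        int getNumericValue() { return numericValue; }
    }

    static final class Card {
        private final Rank rank;
        private final Suit suit;

        Card(Rank rank, Suit suit) {
            this.rank = rank;
            this.suit = suit;
        }

        Rank getRank() { return rank; }

        Suit getSuit() { return suit; }
    }

    // Deck is described as Iterable<Card>, so a plain Iterable stands in for it here.
    static void map(Iterable<Card> deck, BiConsumer<Suit, Integer> keyValuePairConsumer) {
        for (Card card : deck) {
            Rank rank = card.getRank();
            if (rank.isNumeric()) {
                // "emit" the suit as the key and the numeric value as the value
                keyValuePairConsumer.accept(card.getSuit(), rank.getNumericValue());
            }
            // non-numeric cards are treated as bad data and skipped
        }
    }

    // Collects every emitted pair as "SUIT=value" so the result is easy to inspect.
    static List<String> demo() {
        List<Card> deck = Arrays.asList(
                new Card(Rank.TWO, Suit.HEARTS),
                new Card(Rank.JACK, Suit.CLUBS),
                new Card(Rank.TEN, Suit.SPADES));
        List<String> emitted = new ArrayList<>();
        map(deck, (suit, value) -> emitted.add(suit + "=" + value));
        return emitted;
    }
}
```

In the demo the jack is dropped and the two numeric cards are emitted in deck order.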

Word Count Mapper

Counting occurrences of words in text is a classic example of MapReduce. We will ignore any zero-length words and convert the remaining words to lower case to get a case-insensitive count. Emitting each lower-cased word as the key with a value of 1 should do the trick here.

class: WordCountMapper.java
methods: map
package: mapreduce.apps.intsum.wordcount.studio
source folder: student/src/main/java

method: public void map(TextSection textSection, BiConsumer<String, Integer> keyValuePairConsumer) (sequential implementation only)
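
A minimal sketch of the word count idea follows. It assumes the words arrive as a List<String>; the actual return type of TextSection.getWords() is not shown on this page, so the list is a stand-in. The demo consumer also sums the emitted 1s into a map purely so the output is visible; in the real framework that summing is the reducer's job.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.BiConsumer;

class WordCountMapperSketch {
    // words is a stand-in for whatever TextSection.getWords() yields (assumption).
    static void map(List<String> words, BiConsumer<String, Integer> keyValuePairConsumer) {
        for (String word : words) {
            if (word.length() > 0) {
                // lower-case the word so "The" and "the" share a key, then emit a count of 1
                keyValuePairConsumer.accept(word.toLowerCase(Locale.ROOT), 1);
            }
        }
    }

    // Demo only: the consumer adds up the emitted 1s, standing in for a reducer.
    static Map<String, Integer> demo() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        map(Arrays.asList("The", "", "the", "Cat"),
                (word, one) -> counts.merge(word, one, Integer::sum));
        return counts;
    }
}
```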

K-mer Count Mapper

K-mer counting is a useful technique in bioinformatics: http://www.csbio.unc.edu/mcmillan/Comp555S17/Lecture02.pdf

Background information on k-mer counting can be found here: https://en.wikipedia.org/wiki/K-mer

This mapper is similar to the word count mapper, except that k-mers overlap with each other while words are separate. For example, the 2-mers of ACGT are AC, CG, and GT.

class: KMerMapper.java
methods: map
package: mapreduce.apps.intsum.kmer.studio
source folder: student/src/main/java

method: public void map(byte[] sequence, BiConsumer<String, Integer> keyValuePairConsumer) (sequential implementation only)
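
The overlapping windows can be sketched as below. Note the assumptions: the map signature above does not take k, so passing k as an extra parameter is a hypothetical simplification for this example, and the bytes are assumed to be ASCII characters such as A, C, G, and T.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

class KMerMapperSketch {
    // k is passed explicitly here for illustration only (hypothetical parameter).
    static void map(byte[] sequence, int k, BiConsumer<String, Integer> keyValuePairConsumer) {
        // Slide a window of length k one position at a time, so consecutive k-mers overlap.
        for (int i = 0; i + k <= sequence.length; i++) {
            String kMer = new String(sequence, i, k, StandardCharsets.US_ASCII);
            keyValuePairConsumer.accept(kMer, 1);
        }
    }

    // Collects the emitted keys in order for a short sample sequence.
    static List<String> demo() {
        byte[] sequence = "ACGAC".getBytes(StandardCharsets.US_ASCII);
        List<String> emitted = new ArrayList<>();
        map(sequence, 2, (kMer, one) -> emitted.add(kMer));
        return emitted;
    }
}
```

A sequence of length n yields n - k + 1 windows, so the five-byte sample with k = 2 emits four k-mers, with AC appearing twice.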

Int Sum Reducer

class: IntegerSumClassicReducer.java
methods: finisher
package: mapreduce.apps.intsum.studio
source folder: student/src/main/java

method: public Function<List<Integer>, Integer> finisher() (sequential implementation only)

The sea of code created by the anonymous inner class can admittedly be a bit intimidating at first. The only part left to implement is the body of apply.

	@Override
	public Function<List<Integer>, Integer> finisher() {
		return new Function<List<Integer>, Integer>() {
			@Override
			public Integer apply(List<Integer> list) {
				throw new NotYetImplementedException();
			}
		};
	}
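
For comparison, the same shape can be written with a lambda instead of an anonymous inner class. The sketch below assumes the finisher is meant to add up the values in the list, as the name IntegerSumClassicReducer suggests; the class name IntSumFinisherSketch is hypothetical and is not part of the course code.

```java
import java.util.List;
import java.util.function.Function;

class IntSumFinisherSketch {
    // Returns a function that sums a list of integers (assumed behavior of the reducer).
    static Function<List<Integer>, Integer> finisher() {
        return list -> {
            int sum = 0;
            for (int value : list) {
                sum += value;
            }
            return sum;
        };
    }
}
```

Calling finisher().apply(list) runs the lambda body, just as it would run the apply method of the anonymous inner class.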

Testing Your Solution

Correctness

class: IntSumStudioTestSuite.java
package: mapreduce
source folder: testing/src/test/java