From CSE231 Wiki
Revision as of 05:24, 3 March 2017 by Kerryli (talk)

Generics are an (almost!) purely compile-time feature added in Java 5. While they don't enable us to do anything we couldn't do before, they definitely make programming in Java much easier. Hopefully this page provides enough of an explanation about generics to get you through the MapReduce assignment, but as always, you should seek out a more in-depth explanation if you are interested. This article is essentially an abridged version of the Wikipedia article, although there are some things the Wikipedia article does not cover.


Consider the following block of code I have shamelessly stolen from Wikipedia:

List v = new ArrayList();
v.add("test"); // a String, which cannot be cast to an Integer
Integer i = (Integer)v.get(0); // run-time error

We would like to use this List to store Strings, but forget this fact a line later and mistakenly try to retrieve an Integer. The code actually compiles, but we will get a java.lang.ClassCastException at run time on the third line when we try to cast the String object to an Integer (String is not a subtype of Integer, so the cast cannot succeed).
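Here is a minimal runnable sketch of that failure, in case you want to see it happen. The class name RawListDemo and the helper castFails are made up for illustration; the exception is caught only so that the program can report what happened instead of crashing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the name is not part of the assignment.
class RawListDemo {
    // Returns true if reading element 0 as an Integer fails at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean castFails() {
        List v = new ArrayList();   // raw type: the compiler only knows about Object
        v.add("test");              // we store a String...
        try {
            Integer i = (Integer) v.get(0);  // ...and mistakenly read it as an Integer
            return false;
        } catch (ClassCastException e) {
            return true;            // the mistake only shows up at run time
        }
    }

    public static void main(String[] args) {
        System.out.println("cast failed: " + castFails());
    }
}
```

The compiler accepts this code (with warnings about raw types), which is exactly the problem: nothing stops us until the program is already running.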

Now, the next piece of code shamelessly stolen (with slight modification):

List<String> v = new ArrayList<String>();
v.add("test");
Integer i = v.get(0); // (type error) compile-time error

Functionally the same code, but instead of blowing up when we try to run it, we get a compiler error when we try to use an Integer variable to refer to a String. This is because the compiler knows that v.get(int) returns a String, rather than an arbitrary Object. This is incredibly useful when we want to write safer code that can be reused for many types.


Not only do generics achieve better type safety, they allow us to shorten our code as well. Consider again a modified version of the first piece of code:

List v = new ArrayList();
v.add("test");
String s = (String)v.get(0); // even though there is no error, we still have to cast!

Because get() on a raw List returns an Object, we still need to manually downcast the result to a String if we want to use the methods and data of the String class. The Java compiler actually compiles code that uses generics into something very similar to the code above, inserting the casts for us. This is known as type erasure, and it preserves backwards compatibility between JVM versions.
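One observable consequence is that the type arguments are gone at run time. The sketch below (ErasureDemo and its method names are invented for this page) checks that a List<String> and a List<Integer> share a single runtime class, and shows a read with no cast in the source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are only for illustration.
class ErasureDemo {
    // True if both lists have the same runtime class, i.e. the
    // type arguments String and Integer were erased by the compiler.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        return strings.getClass() == ints.getClass();
    }

    // No cast appears in the source; the compiler inserts one for us.
    static String firstOf(List<String> v) {
        return v.get(0);
    }

    public static void main(String[] args) {
        List<String> v = new ArrayList<String>();
        v.add("test");
        System.out.println(sameRuntimeClass());
        System.out.println(firstOf(v));
    }
}
```

This is why the first line of this page says generics are "almost" purely a compile-time feature: most of the checking happens before the program ever runs.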

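You may also see expressions such as new ArrayList<>() on the right-hand side of an assignment (the "diamond", available since Java 7). The type argument is inferred from the declared type of the variable, and the result is unambiguous because Java generics are invariant: the only ArrayList that can be assigned to a List<String> variable is an ArrayList<String>.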
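A small sketch of where this saves typing, assuming Java 7 or later. DiamondDemo and build are made-up names; the commented-out line shows the longer spelling that the diamond replaces.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; the names are only for illustration.
class DiamondDemo {
    static Map<String, List<Integer>> build() {
        // Without the diamond, the type arguments are written twice:
        // Map<String, List<Integer>> m = new HashMap<String, List<Integer>>();
        Map<String, List<Integer>> m = new HashMap<>();  // inferred from the left-hand side
        List<Integer> ones = new ArrayList<>();          // inferred as ArrayList<Integer>
        ones.add(1);
        m.put("word", ones);
        return m;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```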

As it relates to the MapReduce Assignment

In your MapReduce assignment, you will come across code that looks something like this:

public interface IndividualMapContext<K, V> extends MapContext<K, V> { ... }

public final class StudentIndividualMapContext<K, V> implements IndividualMapContext<K, V> { ... }

Here, <K, V> declares two generic type parameters, K and V, each of which stands in for the name of a class. While they could have been named anything, we used K and V because these classes represent the types of the keys and values emitted from the Mapper. For the rest of the class/interface definition, K and V are these types, and when we use this interface elsewhere, we will provide classes to serve as these parameters. For example, in WordCount, we want to count the occurrences of each word - when we see a word, we want to emit a key-value pair with the word as the key and a 1 as the value. Therefore, a logical solution would be to use a MapContext<String, Integer>.

Further, these type parameters are also used to parameterize the classes/interfaces from which these two classes derive (the extends MapContext<K, V> part: the parent class is also generic). This may seem confusing at first, but an answer on StackOverflow explains it nicely.
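To make this concrete, here is a rough sketch of the same shape. Everything in it is a stand-in: SimpleMapContext, SimpleIndividualMapContext, CountingContext, and the emit method are invented for this page, and the real MapContext in the assignment may look quite different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a generic parent interface.
interface SimpleMapContext<K, V> {
    void emit(K key, V value);
}

// The child declares its own K and V and passes them up to the parent.
interface SimpleIndividualMapContext<K, V> extends SimpleMapContext<K, V> {
}

// A class may also fix one parameter (V is Integer) and stay generic in the other.
class CountingContext<K> implements SimpleIndividualMapContext<K, Integer> {
    private final Map<K, Integer> totals = new HashMap<>();

    @Override
    public void emit(K key, Integer value) {
        totals.merge(key, value, Integer::sum);  // add value to the running total for key
    }

    Map<K, Integer> totals() {
        return totals;
    }
}

class WordCountSketch {
    // Emits (word, 1) for every space-separated word and returns the totals.
    static Map<String, Integer> count(String text) {
        CountingContext<String> counting = new CountingContext<>();
        SimpleMapContext<String, Integer> context = counting;  // keys are words, values are counts
        for (String word : text.split(" ")) {
            context.emit(word, 1);
        }
        return counting.totals();
    }

    public static void main(String[] args) {
        System.out.println(count("to be or not to be"));
    }
}
```

Because the context is declared with <String, Integer>, a call such as context.emit(1, "word") would be rejected by the compiler rather than failing later.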

Bounded vs. Unbounded

In the above examples, K and V are known as unbounded type parameters. This means that we can use any reference type (a class or interface, but not a primitive such as int) as a type argument when declaring variables and creating objects, and all of the following declarations are valid:

IndividualMapContext<String, Integer> i = ...
IndividualMapContext<Object, Object> i = ...
IndividualMapContext<MyClass1, MyClass2> i = ...

However, this does not mean that we can use an IndividualMapContext<Object, Object> variable to refer to an IndividualMapContext<String, Integer>, even though String and Integer are subtypes of Object. Again, this is because generics in Java are invariant.
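If you are wondering why the language is so strict here, the sketch below uses Java arrays as a contrast (arrays are not otherwise covered on this page). Arrays do allow this kind of assignment, and the price is a run-time failure. InvarianceDemo and its method names are made up, and the rejected generic lines are left as comments because they do not compile.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are only for illustration.
class InvarianceDemo {
    // Arrays are covariant, so the bad store below compiles and fails at run time.
    static boolean arrayStoreFails() {
        Object[] objects = new String[1];     // allowed: String[] is a subtype of Object[]
        try {
            objects[0] = Integer.valueOf(1);  // not a String, rejected only at run time
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // With generics, the same mistake is stopped by the compiler.
    static int genericVersion() {
        List<String> strings = new ArrayList<>();
        // List<Object> objects = strings;   // does not compile: generics are invariant
        // objects.add(Integer.valueOf(1));  // would otherwise sneak an Integer into a List<String>
        strings.add("only strings here");
        return strings.size();
    }

    public static void main(String[] args) {
        System.out.println(arrayStoreFails());
        System.out.println(genericVersion());
    }
}
```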

On the other hand, you may see another type declaration that looks like this: public class SomeClass<T extends Comparable<T>> { ... }. In this case, T would be called a bounded type parameter. Inside the class definition, T is still the name of the type, but we now require that whatever type is supplied for T when we create an instance of SomeClass must implement Comparable of itself. Notice that even though Comparable is an interface, we still use the extends keyword as if it were a class. Imposing this restriction is useful because we now have more information about T - if it implements Comparable<T>, it must have the compareTo(T) method. Consider now the following declarations:

SomeClass<Integer> s = ... // valid because Integer implements Comparable<Integer>
SomeClass<String> s = ... // valid because String implements Comparable<String>
SomeClass<Object> s = ... // invalid because Object does not implement Comparable<Object>
SomeClass<java.sql.Time> s = ... // invalid even though Time implements Comparable<Date>, details below

If we examine the documentation for the Time class, we see that it actually "implements" Comparable<Date> (its parent class java.util.Date implements it, so we don't see an explicit implements Comparable<Date> in the file). However, our original definition requires that T implement Comparable<T>, so it's actually not enough for T to be comparable only to one of its parent types.
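As a sketch of what the bound buys us inside the class, here is a made-up MaxFinder with the same <T extends Comparable<T>> shape as SomeClass. The names MaxFinder, max, and MaxFinderDemo are hypothetical. The call to compareTo compiles only because of the bound.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class with the same bound as SomeClass<T extends Comparable<T>>.
class MaxFinder<T extends Comparable<T>> {
    // Returns the largest element; assumes items is non-empty.
    T max(List<T> items) {
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {  // allowed because T is bounded by Comparable<T>
                best = item;
            }
        }
        return best;
    }
}

class MaxFinderDemo {
    static Integer largestInt() {
        return new MaxFinder<Integer>().max(Arrays.asList(3, 7, 5));
    }

    static String largestString() {
        return new MaxFinder<String>().max(Arrays.asList("pear", "apple"));
    }

    // new MaxFinder<Object>() would not compile:
    // Object does not implement Comparable<Object>.

    public static void main(String[] args) {
        System.out.println(largestInt());
        System.out.println(largestString());
    }
}
```

With an unbounded T, the compiler would only know that item is an Object, and the compareTo call would be rejected.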

While I won't go over this feature on this page, you can declare much more powerful generic classes/interfaces/methods using wildcards; if you want to learn about these, I would recommend the readings below.