String Map K Mer Assignment

From CSE231 Wiki
Revision as of 07:26, 15 November 2022 by Cosgroved (talk | contribs)

Group Assignment

This is a group assignment.

Code To Investigate

Java Util Concurrent

ConcurrentHashMap

StringKMers

The following method should be useful as you build the assignment.

String toString(byte[] sequence, int offset, int k)

	/**
	 * Stores the information from the given sequence into a String. For example, if
	 * you had the sequence, "ACCTGTCAAAA" and you called this method with an offset
	 * of 1 and a k of 4, it would return "CCTG".
	 * 
	 * @param sequence the sequence of nucleobases to draw the bytes from
	 * @param offset   the offset for where to start looking for bytes
	 * @param k        the length of the k-mer to make a String for
	 * @return a String representation of the k-mer at the desired position
	 */
	public static String toString(byte[] sequence, int offset, int k) {
		return new String(sequence, offset, k, StandardCharsets.UTF_8);
	}
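
To see how offset and k interact, the sketch below lists every k-mer in the example sequence from the Javadoc above. A sequence of length n contains n - k + 1 k-mers, so the valid offsets run from 0 through n - k. The class name KMerOffsetsDemo and the helper allKMers are hypothetical and not part of the provided code; the body of toString is copied here only so the example runs on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the assignment's provided code.
class KMerOffsetsDemo {
	// Same body as StringKMers.toString, copied so the demo is self-contained.
	static String toString(byte[] sequence, int offset, int k) {
		return new String(sequence, offset, k, StandardCharsets.UTF_8);
	}

	// Hypothetical helper: collects the k-mer found at every valid offset.
	static List<String> allKMers(byte[] sequence, int k) {
		List<String> kMers = new ArrayList<>();
		// The last valid offset is sequence.length - k.
		for (int offset = 0; offset + k <= sequence.length; offset++) {
			kMers.add(toString(sequence, offset, k));
		}
		return kMers;
	}

	public static void main(String[] args) {
		byte[] sequence = "ACCTGTCAAAA".getBytes(StandardCharsets.UTF_8);
		// prints [ACCT, CCTG, CTGT, TGTC, GTCA, TCAA, CAAA, AAAA]
		System.out.println(allKMers(sequence, 4));
	}
}
```

The 11-byte sequence yields 11 - 4 + 1 = 8 k-mers, and the entry at offset 1 is the "CCTG" from the Javadoc example.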

KMerResults

interface KMerResults

class StringMapKMerResults

Code To Implement

StringHashMapKMerCounter

class: StringHashMapKMerCounter.java
methods: parse
package: kmer.group
source folder: student/src/main/java

parse

method: public StringMapKMerResults parse(List<byte[]> sequences, int k) (sequential implementation only)

In this completely sequential implementation, you will write the parse method. The method takes in a list of byte arrays and a k-mer length. It should return an instance of StringMapKMerResults (which takes in a map), a class provided to you which does exactly what its name suggests.

parse should visit every possible k-mer in every byte array in the list of sequences. A sequence of length n contains n - k + 1 k-mers (none if n is less than k). At each offset, use the StringKMers.toString(sequence, offset, k) method to create the String that serves as the key for the HashMap. The map should use a String as the key (the k-mer) and an Integer as the value (how many times that k-mer has been seen so far). We recommend using the map.compute() method and reviewing how to use lambdas.
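
The counting loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference solution: the class name SequentialKMerCountSketch and the method count are hypothetical, the k-mer String is built inline with the same call that StringKMers.toString makes, and the method returns the raw Map rather than wrapping it in a StringMapKMerResults.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch class; names are illustrative only.
class SequentialKMerCountSketch {
	static Map<String, Integer> count(List<byte[]> sequences, int k) {
		Map<String, Integer> map = new HashMap<>();
		for (byte[] sequence : sequences) {
			// Valid offsets run from 0 through sequence.length - k.
			for (int offset = 0; offset + k <= sequence.length; offset++) {
				String kMer = new String(sequence, offset, k, StandardCharsets.UTF_8);
				// compute passes null as the old value when the key is absent.
				map.compute(kMer, (key, oldCount) -> oldCount == null ? 1 : oldCount + 1);
			}
		}
		return map;
	}

	public static void main(String[] args) {
		List<byte[]> sequences = List.of("ACAC".getBytes(StandardCharsets.UTF_8));
		// With k = 2, "AC" is counted twice and "CA" once.
		System.out.println(count(sequences, 2));
	}
}
```

The lambda passed to compute receives the key and its current value. Returning 1 for a null value and the old count plus one otherwise handles both the first sighting and every later one in a single call.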

StringConcurrentHashMapKMerCounter

class: StringConcurrentHashMapKMerCounter.java
methods: constructor, concurrentMapFactory, createConcurrentMap, parse
package: kmer.group
source folder: student/src/main/java

Note: this class forces a somewhat ridiculous amount of abstraction when it comes to constructing new ConcurrentMaps. The constructor will be passed a Supplier to be used instead of, for example, constructing a ConcurrentHashMap directly. This lets the tests catch errors sooner, which should make debugging easier.

constructor and instance variable

Hang onto the concurrentMapFactory passed to the constructor in an instance variable so you can use it later.

public StringConcurrentHashMapKMerCounter(Supplier<ConcurrentMap<String, Integer>> concurrentMapFactory)

concurrentMapFactory

Return the concurrentMapFactory passed to the constructor.

createConcurrentMap

Use the concurrentMapFactory's get() method to create a new ConcurrentMap.
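
The three pieces above (store the Supplier, return it, call get() on it) fit together roughly as in this sketch. The class name MapFactoryHolderSketch is hypothetical and stands in for the real counter class. Passing ConcurrentHashMap::new as the Supplier is an assumption made for the demonstration, since the tests may pass something else.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical stand-in for the counter class; shows only the factory plumbing.
class MapFactoryHolderSketch {
	// Instance variable that hangs onto the factory for later use.
	private final Supplier<ConcurrentMap<String, Integer>> concurrentMapFactory;

	MapFactoryHolderSketch(Supplier<ConcurrentMap<String, Integer>> concurrentMapFactory) {
		this.concurrentMapFactory = concurrentMapFactory;
	}

	Supplier<ConcurrentMap<String, Integer>> concurrentMapFactory() {
		return concurrentMapFactory;
	}

	ConcurrentMap<String, Integer> createConcurrentMap() {
		// Each call to get() asks the factory for a map.
		return concurrentMapFactory.get();
	}

	public static void main(String[] args) {
		// Assumption: a constructor reference serves as the Supplier here.
		MapFactoryHolderSketch holder = new MapFactoryHolderSketch(ConcurrentHashMap::new);
		ConcurrentMap<String, Integer> map = holder.createConcurrentMap();
		map.put("ACGT", 1);
		System.out.println(map);
	}
}
```

Because the class never names ConcurrentHashMap itself, a caller such as a test can hand in a Supplier that produces any ConcurrentMap implementation.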

parse

method: public StringMapKMerResults parse(List<byte[]> sequences, int k) (parallel implementation required)

This implementation turns your sequential String HashMap implementation into a parallel one. To do so, you will make use of Java’s thread-safe counterpart to HashMap: ConcurrentHashMap. Create the map with createConcurrentMap() rather than constructing a ConcurrentHashMap directly. As before, you will need to complete the parse method, but this time in parallel.
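
As a rough illustration of why a thread-safe map matters here, the sketch below counts k-mers from several sequences at once. It assumes a parallel stream from java.util.stream as the source of parallelism, which may differ from the parallel constructs used in this course, and it constructs the ConcurrentHashMap directly only to stay self-contained. The class name ParallelKMerCountSketch and the method count are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch class; not the reference solution.
class ParallelKMerCountSketch {
	static ConcurrentMap<String, Integer> count(List<byte[]> sequences, int k) {
		ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
		// Assumption: one parallel task per sequence via a parallel stream.
		sequences.parallelStream().forEach(sequence -> {
			for (int offset = 0; offset + k <= sequence.length; offset++) {
				String kMer = new String(sequence, offset, k, StandardCharsets.UTF_8);
				// ConcurrentHashMap.compute runs atomically per key, so
				// increments from different threads are not lost.
				map.compute(kMer, (key, oldCount) -> oldCount == null ? 1 : oldCount + 1);
			}
		});
		return map;
	}

	public static void main(String[] args) {
		byte[] sequence = "ACAC".getBytes(StandardCharsets.UTF_8);
		List<byte[]> sequences = Collections.nCopies(1000, sequence);
		// With k = 2, "AC" is counted 2000 times and "CA" 1000 times.
		System.out.println(count(sequences, 2));
	}
}
```

A separate get followed by put would not be safe in this setting, because two threads could read the same old count and both write back the same new value. Doing the read and the update inside one compute call avoids that race.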

Testing Your Solution

class: __StringKMerTestSuite.java
package: kmer.group
source folder: testing/src/test/java

sequential

class: _StringSequentialKMerTestSuite.java
package: kmer.group
source folder: testing/src/test/java

concurrent

class: _StringConcurrentKMerTestSuite.java
package: kmer.group
source folder: testing/src/test/java