Difference between revisions of "Nucleobase Counting"

From CSE231 Wiki

Revision as of 16:41, 9 February 2018

Motivation

  • get our feet wet with parallel programming
  • gain experience with the async and finish constructs
  • take different approaches to splitting up the work

We will first solve the problem sequentially, then split the work into two tasks, then coarsen the work into n tasks, and finally split the work recursively in divide-and-conquer style.

Background

Bioinformatics

For this assignment, you will be writing sequential and parallel code to count nucleobases in a human X chromosome.

DNA is made up of four nucleobases: cytosine, guanine, adenine, and thymine. A strand of DNA can thus be represented as a string of letters, one per nucleobase, for example “ACCGCATAAAGTCC”. However, DNA sequencing is typically not 100% accurate, so some of the nucleobases are not read with high certainty. These uncertain reads are represented as an “N”, so a sequence might look something like “NCCGCATNAAGTCC”. Your goal is to write code that counts the number of occurrences of a particular nucleobase or of uncertain reads.
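As a rough illustration of the counting task described above, here is a minimal sequential sketch in plain Java. The class name `SequentialNucleobaseCount`, the method name `count`, and the choice of a `byte[]` to hold the sequence are assumptions made for this example; the provided assignment code may represent the chromosome and name its methods differently.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class and method names, chosen only for this sketch.
class SequentialNucleobaseCount {
    // Counts how many entries of the sequence equal the target letter.
    // Passing 'N' as the target counts the uncertain reads.
    static int count(byte[] chromosome, byte target) {
        int count = 0;
        for (byte base : chromosome) {
            if (base == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The example sequence from the text, one ASCII byte per nucleobase.
        byte[] sequence = "NCCGCATNAAGTCC".getBytes(StandardCharsets.US_ASCII);
        System.out.println("C: " + count(sequence, (byte) 'C')); // prints "C: 5"
        System.out.println("N: " + count(sequence, (byte) 'N')); // prints "N: 2"
    }
}
```

A single pass over the array is enough because each position contributes independently to the count, which is also what makes the problem easy to split across tasks later.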

We will be using actual data pulled from the US National Library of Medicine, which is part of the National Institutes of Health. We have already provided the code you need to access the chromosome from the database and to check your work. You must implement one sequential solution and three parallel solutions that count the given bases in these sequences.
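To make the shape of the parallel solutions more concrete, the sketch below shows an n-way split (which reduces to a lower/upper split when the number of tasks is 2) and a divide-and-conquer version. It deliberately uses only standard JDK classes (`Thread`, `ForkJoinPool`, `RecursiveTask`) as a stand-in for the course's async and finish constructs, whose exact API is not shown on this page. All class names, method names, and the `THRESHOLD` value are assumptions for illustration, not the assignment's required signatures.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical names throughout; plain JDK threads stand in for async/finish.
class ParallelNucleobaseCount {
    // Sequential helper over the half-open range [min, maxExclusive).
    static int countRange(byte[] chromosome, byte target, int min, int maxExclusive) {
        int count = 0;
        for (int i = min; i < maxExclusive; i++) {
            if (chromosome[i] == target) {
                count++;
            }
        }
        return count;
    }

    // N-way split: one thread per contiguous slice. numTasks == 2 is the lower/upper split.
    static int countNWay(byte[] chromosome, byte target, int numTasks) {
        int[] partial = new int[numTasks]; // one slot per task, so no two threads share a slot
        Thread[] threads = new Thread[numTasks];
        for (int t = 0; t < numTasks; t++) {
            final int task = t;
            final int min = (int) ((long) chromosome.length * t / numTasks);
            final int max = (int) ((long) chromosome.length * (t + 1) / numTasks);
            threads[t] = new Thread(() -> partial[task] = countRange(chromosome, target, min, max));
            threads[t].start(); // roughly plays the role of "async"
        }
        int total = 0;
        for (int t = 0; t < numTasks; t++) {
            try {
                threads[t].join(); // waiting for every task roughly plays the role of "finish"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for tasks", e);
            }
            total += partial[t];
        }
        return total;
    }

    // Divide and conquer: split the range in half until it is small enough to count directly.
    static class CountTask extends RecursiveTask<Integer> {
        static final int THRESHOLD = 4; // assumed cutoff, kept tiny so the demo actually recurses
        final byte[] chromosome;
        final byte target;
        final int min;
        final int maxExclusive;

        CountTask(byte[] chromosome, byte target, int min, int maxExclusive) {
            this.chromosome = chromosome;
            this.target = target;
            this.min = min;
            this.maxExclusive = maxExclusive;
        }

        @Override
        protected Integer compute() {
            if (maxExclusive - min <= THRESHOLD) {
                return countRange(chromosome, target, min, maxExclusive);
            }
            int mid = min + (maxExclusive - min) / 2;
            CountTask lower = new CountTask(chromosome, target, min, mid);
            CountTask upper = new CountTask(chromosome, target, mid, maxExclusive);
            lower.fork();                      // run the lower half asynchronously
            int upperCount = upper.compute();  // count the upper half in the current thread
            return lower.join() + upperCount;  // wait for the lower half, then combine
        }
    }

    static int countDivideAndConquer(byte[] chromosome, byte target) {
        return ForkJoinPool.commonPool().invoke(new CountTask(chromosome, target, 0, chromosome.length));
    }
}
```

Each task writes its result to its own array slot rather than to a shared counter, which avoids a data race without any locking. In a real run over a whole chromosome, a much larger threshold than the one assumed here would likely be needed for the divide-and-conquer version to pay off.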

For some more optional background on DNA and nucleotide bases, please check out

Parallel Programming

Parallel Constructs

Related Videos