Difference between revisions of "MatrixMultiply"
Line 18: | Line 18: | ||
source: [https://en.wikipedia.org/wiki/Matrix_multiplication#General_definition_of_the_matrix_product Matrix Multiplication on Wikipedia] | source: [https://en.wikipedia.org/wiki/Matrix_multiplication#General_definition_of_the_matrix_product Matrix Multiplication on Wikipedia] | ||
− | == | + | ==SequentialMatrixMultiplier== |
+ | We provide the sequential iterative implementation so you can focus just on becoming familiar with using Habanero's forall loops. | ||
+ | |||
The point of the required portion of this studio is not to struggle with matrix multiplication, but rather to get some experience with the parallel for loop constructs in Habanero. | The point of the required portion of this studio is not to struggle with matrix multiplication, but rather to get some experience with the parallel for loop constructs in Habanero. | ||
Line 147: | Line 149: | ||
What computation can be done in parallel? What computation must be performed sequentially? | What computation can be done in parallel? What computation must be performed sequentially? | ||
− | + | <!-- | |
=Provided Example Implementations= | =Provided Example Implementations= | ||
==SequentialMatrixMultiplier== | ==SequentialMatrixMultiplier== | ||
Line 157: | Line 159: | ||
==Forall2dGroupedMatrixMultiplier== | ==Forall2dGroupedMatrixMultiplier== | ||
Same as above but with forall2d. | Same as above but with forall2d. | ||
− | + | --> | |
=Testing Your Solution= | =Testing Your Solution= | ||
==Correctness== | ==Correctness== |
Revision as of 17:27, 15 February 2018
Contents
Motivation
We gain experience using the parallel for loop constructs.
Background
Matrix multiplication is a simple mathematical operation which we will replicate in this studio. For our purposes, we will only deal with square matrices (same number of rows and columns). However, we will approach this problem with several different parallel constructs and approaches.
For those unfamiliar on how to multiply two matrices, take a look at these overviews:
Math Definition
If is an matrix and is an matrix
for each i=[0..n) and for each j=[0..p)
source: Matrix Multiplication on Wikipedia
SequentialMatrixMultiplier
We provide the sequential iterative implementation so you can focus just on becoming familiar with using Habanero's forall loops.
The point of the required portion of this studio is not to struggle with matrix multiplication, but rather to get some experience with the parallel for loop constructs in Habanero.
Feel free to use the provided sequential implementation in SequentialMatrixMultiplier as a reference:
@Override public double[][] multiply(double[][] a, double[][] b) { double[][] result = MatrixUtils.createMultiplyResultBufferInitializedToZeros(a, b); int n = a.length; int m = a[0].length; int p = b[0].length; for (int i = 0; i < n; i++) { for (int j = 0; j < p; j++) { // NOTE: result is already initialized to 0.0 // result[i][j] = 0.0; for (int k = 0; k < m; k++) { result[i][j] += a[i][k] * b[k][j]; } } } return result; }
Code To Use
Javadocs
- forall(start, endExclusive, body)
- forall(chunked(), start, endExclusive, body)
- forall2d(aMin, aMaxExclusive, bMin, bMaxExclusive, body)
- forall2d(chunked(), aMin, aMaxExclusive, bMin, bMaxExclusive, body)
Habanero Documentation
WARNING: Habanero from Rice is inclusive. So, its loops are specified from 0 to n-1. CSE 231's wrapper is exclusive on max so it is specified 0 to n.
Common Mistakes To Avoid
maxExclusive
CSE 231 is exclusive on max. While we wrapped Habanero we changed forall(0, n-1, body) to forall(0, n, body) for everything.
Code To Implement
There are three methods you will need to implement, all of which are different ways to use parallel for loops to solve the problem. To assist you, the sequential implementation has already been completed for you. We recommend starting from the top and working your way down. There is also an optional recursive implementation and a manual grouping implementation which has been done for you (this is just to demonstrate how chunking works behind the scenes).
ForallForallMatrixMultiplier
In this implementation, you will simply convert the sequential solution into a parallel one using two forall loops.
Forall2dMatrixMultiplier
In this implementation, we will cut down the syntax of the two forall implementation with the use of HJ’s forall2d
method. Functionally, this method serves the purpose of using two forall loops. The input parameters should equal the input parameters of the two separate forall loops. For example,
forall(aMin, aMax, (i) -> { forall(bMin, bMax, (j) -> { doSomething(i,j) }); });
Would appear as
forall2d(aMin, aMax, bMin, bMax, (i, j) -> { doSomething(i,j); });
in the forall2d syntax.
Forall2dChunkedMatrixMultiplier
In this implementation, we will add a minor performance boost to the process by using the forall-chunked construct. Although similar to the traditional forall loop, it increases performance using iteration grouping/chunking. This topic is discussed in detail in this Rice video and explained in the HJ documentation. There is no need to specify anything, allow the runtime to determine the chunking.
NOTE: we contemplated also assigning building a 1D forall chunked version. We deemed this more work that it was worth given that you are already building the 2d version. Just know that forall(chunked(), ...) exists for 1d loops as well.
Use chunking. It is a nice feature.
Optional Fun Divide and Conquer Solutions
In this implementation, you will solve the matrix multiply problem sequentially and in parallel using recursion. Although this class should be able to take in a matrix of any size, try to imagine this as a 2x2 matrix in order to make it easier to solve. Once you solve the sequential method, the parallel method should look very similar with exception of an async/finish block.
In order to obtain the desired result matrix, you will need to recursively call the correct submatrices for each of the four result submatrices. Imagining this as a 2x2 matrix, remember that the dot products of the rows of the first matrix and the columns of the second matrix create the result matrix.
Hint: Each result submatrix should have two recursive calls, for a total of eight recursive calls.
SequentialDivideAndConquerMatrixMultiplier
In class SubMatrix
, method sequentialDivideAndConquerMultiplyKernel
you will find your base case and the sub matrices prepared for you.
private static void sequentialDivideAndConquerMultiplyKernel(SubMatrix a, SubMatrix b, SubMatrix result) { if( result.size == 1 ) { result.values[result.row][result.col] += a.values[a.row][a.col] * b.values[b.row][b.col]; } else { SubMatrix a11 = a.newSub11(); SubMatrix a12 = a.newSub12(); SubMatrix a21 = a.newSub21(); SubMatrix a22 = a.newSub22(); SubMatrix b11 = b.newSub11(); SubMatrix b12 = b.newSub12(); SubMatrix b21 = b.newSub21(); SubMatrix b22 = b.newSub22(); SubMatrix result11 = result.newSub11(); SubMatrix result12 = result.newSub12(); SubMatrix result21 = result.newSub21(); SubMatrix result22 = result.newSub22(); throw new NotYetImplementedException(); } }
You simply need to make the appropriate recursive calls to compute the result on the right:
source: Wikipedia Parallel Matrix Multiplication
ParallelDivideAndConquerMatrixMultiplier
Again, given the following:
source: Wikipedia Parallel Matrix Multiplication
What computation can be done in parallel? What computation must be performed sequentially?
Testing Your Solution
Correctness
class: | MatrixMultiplyTestSuite.java | |
package: | matrixmultiply.studio | |
source folder: | testing/src/test/java |
Optional Fun Divide And Conquer Matrix Multiply Correctness
class: | MatrixMultiplyTestSuite.java | |
package: | matrixmultiply.fun | |
source folder: | testing/src/test/java |
Performance
class: | MatrixMultiplicationTiming.java | |
package: | matrixmultiply.studio | |
source folder: | src/main/java |