Revision as of 11:18, 30 January 2022

Motivation

We gain experience using parallel for-loop constructs.

Background

Matrix multiplication is a simple mathematical operation that we will replicate in this studio. For our purposes, we will deal only with square matrices (same number of rows and columns), but we will approach the problem with several different parallel constructs.

For those unfamiliar with how to multiply two matrices, take a look at these overviews:

If $\mathbf {A}$ is an $n\times m$ matrix and $\mathbf {B}$ is an $m\times p$ matrix, their product $\mathbf {C} =\mathbf {A} \mathbf {B}$ is an $n\times p$ matrix.

For each $i=1,\ldots ,n$ and each $j=1,\ldots ,p$ (in zero-based code, i in [0..n) and j in [0..p)):

(\mathbf {A} \mathbf {B} )_{ij}=\sum _{k=1}^{m}a_{ik}b_{kj}

\mathbf {C} ={\begin{pmatrix}a_{11}b_{11}+\cdots +a_{1m}b_{m1}&a_{11}b_{12}+\cdots +a_{1m}b_{m2}&\cdots &a_{11}b_{1p}+\cdots +a_{1m}b_{mp}\\a_{21}b_{11}+\cdots +a_{2m}b_{m1}&a_{21}b_{12}+\cdots +a_{2m}b_{m2}&\cdots &a_{21}b_{1p}+\cdots +a_{2m}b_{mp}\\\vdots &\vdots &\ddots &\vdots \\a_{n1}b_{11}+\cdots +a_{nm}b_{m1}&a_{n1}b_{12}+\cdots +a_{nm}b_{m2}&\cdots &a_{n1}b_{1p}+\cdots +a_{nm}b_{mp}\\\end{pmatrix}}

source: Matrix Multiplication on Wikipedia
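The summation above maps directly onto three nested loops. The following is a minimal sketch of that mapping, assuming rectangular `double[][]` inputs with compatible dimensions; the class name `TripleLoopMultiplySketch` is hypothetical and this is not the course's SequentialMatrixMultiplier.

```java
import java.util.Arrays;

// Hypothetical illustration of the definition; not the course's demo class.
class TripleLoopMultiplySketch {
    // Multiplies an n-by-m matrix a by an m-by-p matrix b, returning an n-by-p matrix.
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        int m = b.length;
        int p = b[0].length;
        double[][] c = new double[n][p]; // Java zero-initializes every entry
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < p; j++) {
                // c[i][j] is the dot product of row i of a and column j of b.
                for (int k = 0; k < m; k++) {
                    c[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = { { 1, 2 }, { 3, 4 } };
        double[][] b = { { 5, 6 }, { 7, 8 } };
        System.out.println(Arrays.deepToString(multiply(a, b)));
        // prints [[19.0, 22.0], [43.0, 50.0]]
    }
}
```

Note that the loops use zero-based, half-open ranges, while the formula is written with one-based indices.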

Code To Investigate

SequentialMatrixMultiplier

class:	SequentialMatrixMultiplier.java	DEMO:
methods:	multiply
package:	matrixmultiply.demo
source folder:	src//java

SequentialMatrixMultiplierClient

class:	SequentialMatrixMultiplierClient.java	DEMO:
methods:	main
package:	matrixmultiply.client
source folder:	src//java

Demo

The Core Questions

What are the tasks?
What is the data?
Is the data mutable?
If so, how is it shared?

Code To Use

Wiki's reference page

V5

forall(start, endExclusive, body)

forall(chunked(), start, endExclusive, body)

forall2d(aMin, aMaxExclusive, bMin, bMaxExclusive, body)

forall2d(chunked(), aMin, aMaxExclusive, bMin, bMaxExclusive, body)

chunked()

chunked(size)

Common Mistakes To Avoid

Warning: CSE 231 is exclusive on max. When we implemented our own X10/Habanero, we changed forall(0, n-1, body) to forall(0, n, body) for everything, so a loop over n items is written forall(0, n, body).
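To see what an off-by-one on the upper bound costs, here is a small sketch. It uses `java.util.stream.IntStream.range`, which is also exclusive on its upper bound, as a stand-in for the V5 forall (whose import path is not shown on this page). The class and method names are hypothetical.

```java
import java.util.stream.IntStream;

// Hypothetical illustration; uses java.util.stream, not the V5 forall.
class ExclusiveBoundSketch {
    // Marks every index visited by a loop over the half-open range [start, endExclusive).
    static boolean[] visit(int start, int endExclusive, int length) {
        boolean[] visited = new boolean[length];
        // Each index is written by exactly one iteration, so there is no data race.
        IntStream.range(start, endExclusive).parallel().forEach(i -> visited[i] = true);
        return visited;
    }

    public static void main(String[] args) {
        int n = 4;
        System.out.println(visit(0, n, n)[n - 1]);     // prints true
        System.out.println(visit(0, n - 1, n)[n - 1]); // prints false
    }
}
```

Passing n - 1 as an exclusive bound silently skips the last index, which for matrix multiplication would leave the last row or column of the result at zero.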

Code To Implement

There are three methods you will need to implement, all of which are different ways to use parallel for loops to solve the problem. To assist you, the sequential implementation has already been completed for you. We recommend starting from the top and working your way down. There is also an optional recursive implementation, as well as a manual grouping implementation that has been done for you (it is there just to demonstrate how chunking works behind the scenes).

ForallForallMatrixMultiplier

class:	ForallForallMatrixMultiplier.java
methods:	multiply
package:	matrixmultiply.studio
source folder:	student/src/main/java

method: public double[][] multiply(double[][] a, double[][] b) (parallel implementation required)

In this implementation, you will simply convert the sequential solution into a parallel one using two forall loops.
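As a rough picture of the shape of such a loop nest, the sketch below uses nested parallel `IntStream.range` loops from `java.util.stream` as a stand-in for the V5 forall. This is an analogy under that assumption, not the studio solution, and the class name is hypothetical.

```java
import java.util.stream.IntStream;

// Hypothetical analogy using Java streams; the studio itself uses the V5 forall.
class NestedParallelLoopsSketch {
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        int m = b.length;
        int p = b[0].length;
        double[][] c = new double[n][p];
        // Outer parallel loop over rows, inner parallel loop over columns.
        IntStream.range(0, n).parallel().forEach(i ->
            IntStream.range(0, p).parallel().forEach(j -> {
                // The k loop stays sequential: it accumulates into one cell.
                double sum = 0.0;
                for (int k = 0; k < m; k++) {
                    sum += a[i][k] * b[k][j];
                }
                c[i][j] = sum; // each (i, j) task writes a distinct cell
            }));
        return c;
    }

    public static void main(String[] args) {
        double[][] c = multiply(new double[][] { { 1, 2 }, { 3, 4 } },
                                new double[][] { { 5, 6 }, { 7, 8 } });
        System.out.println(c[0][0] + " " + c[1][1]); // prints 19.0 50.0
    }
}
```

In terms of the core questions: the tasks are the (i, j) iterations, the result matrix is shared and mutable, and it is safe here only because no two tasks write the same cell.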

Forall2dMatrixMultiplier

class:	Forall2dMatrixMultiplier.java
methods:	multiply
package:	matrixmultiply.studio
source folder:	student/src/main/java

method: public double[][] multiply(double[][] a, double[][] b) (parallel implementation required)

In this implementation, we will cut down the syntax of the two-forall implementation by using V5’s forall2d method. Functionally, this method serves the same purpose as two nested forall loops. Take a look at the reference page if you have questions on how to use this loop.
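One way to picture a single loop over a 2D index space is to flatten the (i, j) pairs into one range and recover the two indices with division and remainder. The sketch below does this with `java.util.stream.IntStream`. It is only an illustration of the idea, with a hypothetical class name, and makes no claim about how V5 implements forall2d.

```java
import java.util.stream.IntStream;

// Hypothetical illustration of iterating a 2D index space with one parallel loop.
class Flattened2dLoopSketch {
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        int m = b.length;
        int p = b[0].length;
        double[][] c = new double[n][p];
        // One flat range of n * p iterations, each mapped back to a (row, column) pair.
        IntStream.range(0, n * p).parallel().forEach(index -> {
            int i = index / p; // row
            int j = index % p; // column
            double sum = 0.0;
            for (int k = 0; k < m; k++) {
                sum += a[i][k] * b[k][j];
            }
            c[i][j] = sum;
        });
        return c;
    }

    public static void main(String[] args) {
        double[][] a = { { 1, 2, 3 }, { 4, 5, 6 } };
        double[][] b = { { 7, 8 }, { 9, 10 }, { 11, 12 } };
        double[][] c = multiply(a, b);
        System.out.println(c[0][0] + " " + c[1][1]); // prints 58.0 154.0
    }
}
```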

Forall2dChunkedMatrixMultiplier

class:	Forall2dChunkedMatrixMultiplier.java
methods:	multiply
package:	matrixmultiply.studio
source folder:	student/src/main/java

method: public double[][] multiply(double[][] a, double[][] b) (parallel implementation required)

In this implementation, we will add a minor performance boost to the process by using the forall-chunked construct. Although similar to the traditional forall loop, it improves performance by grouping iterations into chunks, so that fewer, larger tasks are created. This topic is discussed in detail in this Rice video and explained in the V5 documentation. There is no need to specify a chunk size; allow the runtime to determine the chunking.

NOTE: we contemplated also assigning a 1D forall chunked version. We deemed this more work than it was worth, given that you are already building the 2D version. Just know that forall(chunked(), ...) exists for 1D loops as well.

Use chunking. It is a nice feature.
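To make the idea of iteration grouping concrete, here is a minimal sketch of manual chunking over rows only. It assumes a caller-supplied chunk size and uses a parallel stream over the chunks as a stand-in for the runtime. The class and its methods are hypothetical and are not the provided manual grouping implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of chunking; not the course's manual grouping class.
class ChunkedRowsSketch {
    // Splits [start, endExclusive) into contiguous half-open chunks {lo, hi}.
    static List<int[]> chunks(int start, int endExclusive, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<int[]> result = new ArrayList<>();
        for (int lo = start; lo < endExclusive; lo += chunkSize) {
            int hi = Math.min(lo + chunkSize, endExclusive);
            result.add(new int[] { lo, hi });
        }
        return result;
    }

    static double[][] multiply(double[][] a, double[][] b, int chunkSize) {
        int n = a.length;
        int m = b.length;
        int p = b[0].length;
        double[][] c = new double[n][p];
        // One parallel task per chunk of rows; rows inside a chunk run sequentially.
        chunks(0, n, chunkSize).parallelStream().forEach(range -> {
            for (int i = range[0]; i < range[1]; i++) {
                for (int j = 0; j < p; j++) {
                    double sum = 0.0;
                    for (int k = 0; k < m; k++) {
                        sum += a[i][k] * b[k][j];
                    }
                    c[i][j] = sum;
                }
            }
        });
        return c;
    }

    public static void main(String[] args) {
        System.out.println(chunks(0, 5, 2).size()); // prints 3
    }
}
```

With 5 rows and a chunk size of 2, the chunks are [0, 2), [2, 4), and [4, 5), so three tasks are created instead of five.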

Optional Divide and Conquer Challenges

In this implementation, you will solve the matrix multiply problem sequentially and in parallel using recursion. Although this class should be able to take in a matrix of any size, try to imagine this as a 2x2 matrix in order to make it easier to solve. Once you solve the sequential method, the parallel method should look very similar, with the exception of an async/finish block.

In order to obtain the desired result matrix, you will need to make recursive calls on the correct submatrices for each of the four result submatrices. Imagining this as a 2x2 matrix, remember that the dot products of the rows of the first matrix and the columns of the second matrix create the result matrix.

Hint: Each result submatrix should have two recursive calls, for a total of eight recursive calls.
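As a worked instance of the hint, take the 2x2 case in which every submatrix is a single number. The values below are chosen only for illustration.

```latex
\begin{pmatrix}1&2\\3&4\end{pmatrix}
\begin{pmatrix}5&6\\7&8\end{pmatrix}
=
\begin{pmatrix}
1\cdot 5+2\cdot 7 & 1\cdot 6+2\cdot 8\\
3\cdot 5+4\cdot 7 & 3\cdot 6+4\cdot 8
\end{pmatrix}
=
\begin{pmatrix}19&22\\43&50\end{pmatrix}
```

Each of the four result entries is the sum of two products, which gives the eight multiplications that correspond to the eight recursive calls.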

OffsetSubMatrix

class:	OffsetSubMatrix.java
methods:	sequentialMultiply parallelMultiply
package:	matrixmultiply.challenge
source folder:	student/src/main/java

sequentialMultiply(a, b)

method: void sequentialMultiply(OffsetSubMatrix a, OffsetSubMatrix b) (sequential implementation only)

In the sequentialMultiply method of class OffsetSubMatrix, you will find the base case and the submatrices prepared for you.

	void sequentialMultiply(OffsetSubMatrix a, OffsetSubMatrix b) {
		if (size == 1) {
			values[row][col] += (a.values[a.row][a.col] * b.values[b.row][b.col]);
		} else {
			OffsetSubMatrix result11 = sub11();
			OffsetSubMatrix result12 = sub12();
			OffsetSubMatrix result21 = sub21();
			OffsetSubMatrix result22 = sub22();

			OffsetSubMatrix a11 = a.sub11();
			OffsetSubMatrix a12 = a.sub12();
			OffsetSubMatrix a21 = a.sub21();
			OffsetSubMatrix a22 = a.sub22();

			OffsetSubMatrix b11 = b.sub11();
			OffsetSubMatrix b12 = b.sub12();
			OffsetSubMatrix b21 = b.sub21();
			OffsetSubMatrix b22 = b.sub22();

			// https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Divide_and_conquer_algorithm
			throw new NotYetImplementedException();

		}
	}

You simply need to make the appropriate recursive calls to compute the result on the right:

{\begin{pmatrix}\mathbf {A} _{11}&\mathbf {A} _{12}\\\mathbf {A} _{21}&\mathbf {A} _{22}\\\end{pmatrix}}{\begin{pmatrix}\mathbf {B} _{11}&\mathbf {B} _{12}\\\mathbf {B} _{21}&\mathbf {B} _{22}\\\end{pmatrix}}={\begin{pmatrix}\mathbf {A} _{11}\mathbf {B} _{11}+\mathbf {A} _{12}\mathbf {B} _{21}&\mathbf {A} _{11}\mathbf {B} _{12}+\mathbf {A} _{12}\mathbf {B} _{22}\\\mathbf {A} _{21}\mathbf {B} _{11}+\mathbf {A} _{22}\mathbf {B} _{21}&\mathbf {A} _{21}\mathbf {B} _{12}+\mathbf {A} _{22}\mathbf {B} _{22}\\\end{pmatrix}}

source: Wikipedia Parallel Matrix Multiplication

parallelMultiply(a, b, isParallelPredicate)

method: void parallelMultiply(OffsetSubMatrix a, OffsetSubMatrix b, IntPredicate isParallelPredicate) (parallel implementation required)

Again, given the following:

{\begin{pmatrix}\mathbf {A} _{11}\mathbf {B} _{11}+\mathbf {A} _{12}\mathbf {B} _{21}&\mathbf {A} _{11}\mathbf {B} _{12}+\mathbf {A} _{12}\mathbf {B} _{22}\\\mathbf {A} _{21}\mathbf {B} _{11}+\mathbf {A} _{22}\mathbf {B} _{21}&\mathbf {A} _{21}\mathbf {B} _{12}+\mathbf {A} _{22}\mathbf {B} _{22}\\\end{pmatrix}}

source: Wikipedia Parallel Matrix Multiplication

What computation can be done in parallel? What computation must be performed sequentially?

Warning: The resulting data (this.values) is mutated and shared.

	void parallelMultiply(OffsetSubMatrix a, OffsetSubMatrix b, IntPredicate isParallelPredicate)
			throws InterruptedException, ExecutionException {
		if (size > 1 && isParallelPredicate.test(size)) {
			OffsetSubMatrix result11 = sub11();
			OffsetSubMatrix result12 = sub12();
			OffsetSubMatrix result21 = sub21();
			OffsetSubMatrix result22 = sub22();

			OffsetSubMatrix a11 = a.sub11();
			OffsetSubMatrix a12 = a.sub12();
			OffsetSubMatrix a21 = a.sub21();
			OffsetSubMatrix a22 = a.sub22();

			OffsetSubMatrix b11 = b.sub11();
			OffsetSubMatrix b12 = b.sub12();
			OffsetSubMatrix b21 = b.sub21();
			OffsetSubMatrix b22 = b.sub22();

			throw new NotYetImplementedException();

		} else {
			sequentialMultiply(a, b);
		}
	}
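The isParallelPredicate parameter decides, based on the current submatrix size, whether to keep spawning parallel work or to fall back to sequentialMultiply. Below is a minimal sketch of how a caller might build such a predicate with `java.util.function.IntPredicate`. The helper `atLeast` and the cutoff of 64 are hypothetical choices for illustration, not values from the course.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration of a size-threshold predicate.
class ParallelThresholdSketch {
    // Returns a predicate that says "go parallel" only for sizes at or above the threshold.
    static IntPredicate atLeast(int threshold) {
        return size -> size >= threshold;
    }

    public static void main(String[] args) {
        IntPredicate isParallelPredicate = atLeast(64); // 64 is an arbitrary cutoff
        System.out.println(isParallelPredicate.test(128)); // prints true
        System.out.println(isParallelPredicate.test(32));  // prints false
    }
}
```

A threshold like this limits task-creation overhead: small submatrices are multiplied sequentially, where spawning tasks would likely cost more than it saves.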

Testing Your Solution

Correctness

class:	MatrixMultiplyTestSuite.java
package:	matrixmultiply.studio
source folder:	testing/src/test/java

Optional Fun Divide And Conquer Matrix Multiply Correctness

class:	DivideAndConquerMatrixMultiplyTestSuite.java
package:	matrixmultiply.challenge
source folder:	testing/src/test/java

Performance

class:	MatrixMultiplicationTiming.java
package:	matrixmultiply.studio
source folder:	src/main/java

