Latest revision as of 00:21, 14 February 2023

Motivation

We gain experience using the parallel for loop constructs.

Background

Matrix multiplication is a simple mathematical operation which we will replicate in this studio. For our purposes, we will only deal with square matrices (same number of rows and columns). However, we will approach this problem with several different parallel constructs and approaches.

For those unfamiliar on how to multiply two matrices, take a look at these overviews:

If $\mathbf {A}$ is an $n\times m$ matrix and $\mathbf {B}$ is an $m\times p$ matrix

for each i=[0..n) and for each j=[0..p)

(\mathbf {A} \mathbf {B} )_{ij}=\sum _{k=1}^{m}A_{ik}B_{kj}

\mathbf {C} ={\begin{pmatrix}a_{11}b_{11}+\cdots +a_{1n}b_{n1}&a_{11}b_{12}+\cdots +a_{1n}b_{n2}&\cdots &a_{11}b_{1p}+\cdots +a_{1n}b_{np}\\a_{21}b_{11}+\cdots +a_{2n}b_{n1}&a_{21}b_{12}+\cdots +a_{2n}b_{n2}&\cdots &a_{21}b_{1p}+\cdots +a_{2n}b_{np}\\\vdots &\vdots &\ddots &\vdots \\a_{m1}b_{11}+\cdots +a_{mn}b_{n1}&a_{m1}b_{12}+\cdots +a_{mn}b_{n2}&\cdots &a_{m1}b_{1p}+\cdots +a_{mn}b_{np}\\\end{pmatrix}}

source: Matrix Multiplication on Wikipedia

Code To Investigate

Demo Video

SequentialMatrixMultiplier

class:	SequentialMatrixMultiplier.java	DEMO:
methods:	multiply
package:	matrixmultiply.demo
source folder:	src/demo/java

SequentialMatrixMultiplierClient

class:	SequentialMatrixMultiplierClient.java	DEMO:
methods:	main
package:	matrixmultiply.client
source folder:	src/demo/java

MatrixMultiplyApp

class:	MatrixMultiplyApp.java	VIZ
package:	matrixmultiply.viz
source folder:	student/src/demo/java

The Core Questions

What are the tasks?
What is the data?
Is the data mutable?
If so, how is it shared?

Code To Implement

There are three methods you will need to implement, all of which are different ways to use parallel for loops to solve the problem. To assist you, the sequential implementation has been implemented in a demo video.

LoopLoopMatrixMultiplier

class:	LoopLoopMatrixMultiplier.java
methods:	multiply
package:	matrixmultiply.exercise
source folder:	student/src/main/java

method: public double[][] multiply(double[][] a, double[][] b) (parallel implementation required)

In this implementation, you will simply convert the sequential solution into a parallel one using two nested parallel fork loops.

Computation Graph

For 3x3 Matrix X 3x3 Matrix:

Loop2dMatrixMultiplier

class:	Loop2dMatrixMultiplier.java
methods:	multiply
package:	matrixmultiply.exercise
source folder:	student/src/main/java

method: public double[][] multiply(double[][] a, double[][] b) (parallel implementation required)

join_void_fork_loop_2d

Computation Graph

For 3x3 Matrix X 3x3 Matrix:

Loop2dAutoCoarsenMatrixMultiplier

class:	Loop2dAutoCoarsenMatrixMultiplier.java
methods:	multiply
package:	matrixmultiply.exercise
source folder:	student/src/main/java

method: public double[][] multiply(double[][] a, double[][] b) (parallel implementation required)

join_void_fork_loop_2d_auto_coarsen

This implementation will look very similar to the previous one, so don't overthink it! The real benefit can be seen in the performance difference between the two based on the coarsening being done behind the scenes.

Extra Credit Challege Divide and Conquer

Divide and Conquer Matrix Multiplication

Testing Your Solution

Correctness

class:	__MatrixMultiplyTestSuite.java
package:	matrixmultiply.studio
source folder:	testing/src/test/java

Performance

class:	MatrixMultiplicationTiming.java
package:	matrixmultiply.performance
source folder:	src/main/java

Investigate the performance difference for your different implementations. When you run MatrixMultiplicationTiming it will put a CSV of the timings into your copy buffer. You can then paste them into a spreadsheet and chart the performance. Feel free to tune the parameters of the test to see the impacts of, for example, different matrix sizes.

Pledge, Acknowledgments, Citations

file:

matrix-multiply-pledge-acknowledgments-citations.txt

More info about the Honor Pledge

@@ Line 31: / Line 31: @@
 <youtube>iEuYiy1Bx2A</youtube>
 ==SequentialMatrixMultiplier==
-{{CodeToInvestigate|SequentialMatrixMultiplier|multiply|matrixmultiply.demo}}
+{{CodeToInvestigate|SequentialMatrixMultiplier|multiply|matrixmultiply.demo|demo}}
 ==SequentialMatrixMultiplierClient==
-{{CodeToInvestigate|SequentialMatrixMultiplierClient|main|matrixmultiply.client}}
+{{CodeToInvestigate|SequentialMatrixMultiplierClient|main|matrixmultiply.client|demo}}
+==MatrixMultiplyApp==
+{{Viz|MatrixMultiplyApp|matrixmultiply.viz|demo}}
+[[File:Martix multiply app 3x5 X 5x4.png|800px]]
 =The Core Questions=
@@ Line 44: / Line 49: @@
 =Code To Implement=
-There are three methods you will need to implement, all of which are different ways to use parallel for loops to solve the problem. To assist you, the sequential implementation has already been completed for you. We recommend starting from the top and working your way down. There is also an optional recursive implementation and a manual grouping implementation which has been done for you (this is just to demonstrate how chunking works behind the scenes).
+There are three methods you will need to implement, all of which are different ways to use parallel for loops to solve the problem. To assist you, the [https://classes.engineering.wustl.edu/cse231/core/index.php?title=MatrixMultiply#SequentialMatrixMultiplier sequential implementation] has been implemented in a [[#Demo_Video|demo video]].
 ==LoopLoopMatrixMultiplier==
@@ Line 51: / Line 56: @@
 {{Parallel|public double[][] multiply(double[][] a, double[][] b)}}
-In this implementation, you will simply convert the sequential solution into a parallel one using two nested parallel fork loops.
+In this implementation, you will simply convert the sequential solution into a parallel one using two nested [https://www.cse.wustl.edu/~dennis.cosgrove/courses/cse231/spring22/apidocs/fj/FJ.html#join_void_fork_loop(int,int,fj.api.TaskIntConsumer) parallel fork loops].
+=== Computation Graph ===
+For 3x3 Matrix X 3x3 Matrix:
+[[File:LoopLoopMatrixMultiplier_Computation_Graph.svg|800px]]
 ==Loop2dMatrixMultiplier==
@@ Line 61: / Line 72: @@
 In this implementation, we will cut down the syntax of the two forall implementation with the use of V5’s <code>forall2d</code> method. Functionally, this method serves the purpose of using two forall loops. [[Reference_Page#Forall_2d|Take a look at the reference page]] if you have questions on how to utilize this loop.
 -->
+[https://www.cse.wustl.edu/~dennis.cosgrove/courses/cse231/spring22/apidocs/fj/FJ.html#join_void_fork_loop_2d(int,int,int,int,fj.api.TaskBiIntConsumer) join_void_fork_loop_2d]
+=== Computation Graph ===
+For 3x3 Matrix X 3x3 Matrix:
+[[File:Loop2dMatrixMultiplier_Computation_Graph.svg|800px]]
 ==Loop2dAutoCoarsenMatrixMultiplier==
@@ Line 67: / Line 86: @@
 {{Parallel|public double[][] multiply(double[][] a, double[][] b)}}
+[https://www.cse.wustl.edu/~dennis.cosgrove/courses/cse231/spring22/apidocs/fj/FJ.html#join_void_fork_loop_2d_auto_coarsen(int,int,int,int,fj.api.TaskBiIntConsumer) join_void_fork_loop_2d_auto_coarsen]
+This implementation will look very similar to the previous one, so don't overthink it! The real benefit can be seen in the performance difference between the two based on the coarsening being done behind the scenes.
 <!--
 In this implementation, we will add a minor performance boost to the process by using the forall-chunked construct. Although similar to the traditional forall loop, it increases performance using iteration grouping/chunking. This topic is discussed in detail in this [https://edge.edx.org/courses/RiceX/COMP322/1T2014R/courseware/a900dd0655384de3b5ef01e508ea09d7/6349730bb2a149a0b33fa23db7afddee/?activate_block_id=i4x%3A%2F%2FRiceX%2FCOMP322%2Fsequential%2F6349730bb2a149a0b33fa23db7afddee Rice video] and explained in the [[Reference_Page#Parallel_Loops|V5 documentation]]. There is no need to specify anything, allow the runtime to determine the chunking.
@@ Line 74: / Line 96: @@
 Use chunking.  It is a nice feature.
 -->
+=Extra Credit Challege Divide and Conquer=
+[[Matrix_Multiply_Divide_and_Conquer_Assignment|Divide and Conquer Matrix Multiplication]]
 =Testing Your Solution=
@@ Line 80: / Line 105: @@
 ==Performance==
-{{Performance|MatrixMultiplicationTiming|matrixmultiply.studio}}
+{{Performance|MatrixMultiplicationTiming|matrixmultiply.performance}}
+Investigate the performance difference for your different implementations.  When you run MatrixMultiplicationTiming it will put a CSV of the timings into your copy buffer.  You can then paste them into a spreadsheet and chart the performance.  Feel free to tune the parameters of the test to see the impacts of, for example, different matrix sizes.
+[[File:Matrix multiply performance.png]]
+=Pledge, Acknowledgments, Citations=
+{{Pledge|matrix-multiply}}

Difference between revisions of "MatrixMultiply"

Latest revision as of 00:21, 14 February 2023

Contents

Motivation

Background

Code To Investigate

Demo Video

SequentialMatrixMultiplier

SequentialMatrixMultiplierClient

MatrixMultiplyApp

The Core Questions

Code To Implement

LoopLoopMatrixMultiplier

Computation Graph

Loop2dMatrixMultiplier

Computation Graph

Loop2dAutoCoarsenMatrixMultiplier

Extra Credit Challege Divide and Conquer

Testing Your Solution

Correctness

Performance

Pledge, Acknowledgments, Citations

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

General

Exercises and Warmups

Tools