Implement a matrix-matrix product of a 1500x900 matrix A with a 900x1200 matrix B of doubles, and measure the time it takes using 1, 2, 3, 4, 5, 6, 7 and 8 threads on a gollum node.
Define Aij = (i+1)*(j+1) and Bij = 1/((double)(i+1)*(double)(j+1)).
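The result matrix C = A*B should be Cij = 900*(double)(i+1)/(double)(j+1), because each of the 900 terms in the sum over the inner index contributes (i+1)/(j+1). You should check that the result is correct for every thread count by comparing the computed A*B with a matrix C holding these values.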
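The setup, product and check described above could be sketched as follows. This is only an illustration under stated assumptions: the matrices are stored row-major in flat arrays, the outer loop is the one that is parallelised, and the names fill_inputs, matmul, count_mismatches and the tolerance parameter rel_tol are hypothetical, not part of the assignment. A relative tolerance is used rather than exact equality because the sum is computed in floating point.

```c
#include <stddef.h>

/* Fill A (n x m) and B (m x p), stored row-major, with
   A[i][k] = (i+1)*(k+1) and B[k][j] = 1/((k+1)*(j+1)). */
void fill_inputs(double *A, double *B, int n, int m, int p)
{
    for (int i = 0; i < n; i++)
        for (int k = 0; k < m; k++)
            A[(size_t)i * m + k] = (double)(i + 1) * (double)(k + 1);

    for (int k = 0; k < m; k++)
        for (int j = 0; j < p; j++)
            B[(size_t)k * p + j] = 1.0 / ((double)(k + 1) * (double)(j + 1));
}

/* C = A*B with the row loop shared among nthreads threads (assumed >= 1).
   If OpenMP is not enabled the pragma is ignored and the loop runs serially. */
void matmul(const double *A, const double *B, double *C,
            int n, int m, int p, int nthreads)
{
    #pragma omp parallel for num_threads(nthreads)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < p; j++) {
            double sum = 0.0;
            for (int k = 0; k < m; k++)
                sum += A[(size_t)i * m + k] * B[(size_t)k * p + j];
            C[(size_t)i * p + j] = sum;
        }
    }
}

/* Count entries of C that differ from m*(i+1)/(j+1) by more than the
   relative tolerance rel_tol; a return value of 0 means the check passed. */
int count_mismatches(const double *C, int n, int m, int p, double rel_tol)
{
    int bad = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < p; j++) {
            double expected = (double)m * (double)(i + 1) / (double)(j + 1);
            double diff = C[(size_t)i * p + j] - expected;
            if (diff < 0.0)
                diff = -diff;
            if (diff > rel_tol * expected)
                bad++;
        }
    }
    return bad;
}
```

For the sizes in this exercise the calls would use n = 1500, m = 900 and p = 1200, once for each thread count from 1 to 8. A tolerance around 1e-10 is a guess at what is comfortable for a 900-term sum, not a prescribed value.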
Modify the code to run the same tests with three loop schedules: static, dynamic with chunk size 1, and guided with chunk size 10.
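One possible way to switch between the three schedules without writing the loop three times is to use schedule(runtime) and pick the schedule with omp_set_schedule before each run. The sketch below assumes that approach; writing schedule(static), schedule(dynamic,1) and schedule(guided,10) directly on three copies of the loop would work equally well. The enum sched_choice and the functions select_schedule and matmul_runtime are hypothetical names used only for illustration.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* The three schedules requested in the exercise. */
enum sched_choice { USE_STATIC, USE_DYNAMIC_1, USE_GUIDED_10 };

/* Set the schedule that schedule(runtime) loops will use.
   A chunk size below 1 asks OpenMP for the default chunk of that kind.
   Without OpenMP this function does nothing. */
void select_schedule(enum sched_choice choice)
{
#ifdef _OPENMP
    switch (choice) {
    case USE_STATIC:    omp_set_schedule(omp_sched_static, 0);   break;
    case USE_DYNAMIC_1: omp_set_schedule(omp_sched_dynamic, 1);  break;
    case USE_GUIDED_10: omp_set_schedule(omp_sched_guided, 10);  break;
    }
#else
    (void)choice;
#endif
}

/* C = A*B (A is n x m, B is m x p, row-major); the distribution of rows
   to threads follows whatever select_schedule() last chose. */
void matmul_runtime(const double *A, const double *B, double *C,
                    int n, int m, int p, int nthreads)
{
    #pragma omp parallel for schedule(runtime) num_threads(nthreads)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < p; j++) {
            double sum = 0.0;
            for (int k = 0; k < m; k++)
                sum += A[(size_t)i * m + k] * B[(size_t)k * p + j];
            C[(size_t)i * p + j] = sum;
        }
    }
}
```

Since every row of this product costs the same amount of work, the schedule changes only how rows are handed out, not the result, so the same correctness check applies to all three variants.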
List the run times and speedups in separate tables, one for each type of scheduling. Comment on the speedup obtained with each scheduling type and draw conclusions as to which is preferable, and in what circumstances.
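For the tables, the speedup on p threads is conventionally taken as the one-thread run time divided by the p-thread run time. A small sketch of the timing and table helpers follows; the names wall_seconds, speedup and print_row and the column layout are assumptions for illustration. Wall-clock time is what matters here, so omp_get_wtime is used when OpenMP is enabled.

```c
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Current time in seconds. With OpenMP this is wall-clock time; the
   fallback uses CPU time, which is only meaningful for serial runs. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / (double)CLOCKS_PER_SEC;
#endif
}

/* Speedup of a run taking tp seconds relative to the 1-thread time t1. */
double speedup(double t1, double tp)
{
    return t1 / tp;
}

/* Print one table row: thread count, run time in seconds, speedup. */
void print_row(int threads, double t1, double tp)
{
    printf("%7d  %12.4f  %8.2f\n", threads, tp, speedup(t1, tp));
}
```

A run would typically be timed by calling wall_seconds() immediately before and after the product and subtracting, then passing the one-thread time and the current time to print_row for each thread count.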
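Note that to implement this program you will need to increase the default stack size for both the original thread and the OMP threads. The three matrices together hold about 34 MB of doubles, which exceeds typical default stack limits if they are declared as local arrays. If you do not increase the stack size, the program will fail with a segmentation fault. To do this you need to execute shell commands similar to: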
ulimit -s unlimited