.. title: PAR Homework 3, due Thu  2021-02-15, noon
.. slug: homework03
.. tags: homework
.. date: 2021-02-07
.. category: 
.. link: 
.. description: 
.. type: text
.. has_math: true

Rules
-----

#. Submit the answers to Gradescope.

#. You may do homework in teams of 2 students.   Create a Gradescope team and make one submission with both names.


Question
--------

#. The goal is to measure whether OpenMP actually makes matrix multiplication faster, with and without SIMD.

#. You may use anything in /parallel-class/openmp that seems useful.

#. Write a C++ program on parallel.ecse that initializes two 100x100 float matrices pseudorandomly and multiplies them.   One possible initialization::

     a[i][j] = i*3 + (j*j)%5;  b[i][j] = i*2 + (j*j)%7;

#. (10 points) Report the elapsed time.  Include the program listing.

#. Add an OpenMP pragma to do the work in parallel.

#. (10 points) Report the elapsed time, varying the number of threads thus: 1, 2, 4, 8, 16, 32, 64.

   What do you conclude?
   
#. (5 points) Repeat that two more times to see how consistent the times are.   

#. (10 points) Modify the pragma to use SIMD.

   Report the elapsed time, varying the number of threads thus: 1, 2, 4, 8, 16, 32, 64.

   What do you conclude?   

#. (10 points) Compile and run your program with two different levels of compiler optimization, ``-O1`` and ``-O3``, reporting the elapsed time for each.   Modify your program so that the optimizer cannot optimize the computation away to nothing, e.g., by printing a few values of the result.

#. (5 points) What do you conclude about everything?

Total: 50 pts.

