Great Deal! Get Instant $10 FREE in Account on First Order + 10% Cashback on Every Order Order Now

hw2 Objectives · Gain and understanding of sustained memory bandwidth · Become familiar with the effect of data size on sustained memory bandwidth · Become familiar with the STREAM benchmark ·...

1 answer below »
hw2
Objectives
· Gain and understanding of sustained memory bandwidth
· Become familiar with the effect of data size on sustained memory bandwidth
· Become familiar with the STREAM benchmark
· Use STREAM to predict performance for Sparse Matrix-Vector product
Tasks Part 1:
Write a program that measures the performance of this operation (in C):
double c[n], a[n];
for (k=0; kfor (i=0; ic[i] = a[i];
See note below
}
This is a simple copy from an a
ay a to an a
ay c. Fill in the following table:
· Time for loop (s)
· Rate (MB/s)
(The note below) Depending on your compiler and the optimization settings, you may need to keep the compiler from eliminating code that it believes unnecessary. Add a call to a routine
void dummy(double *a, double *c) {}
after the copy loop but within the loop over k. Place this routine in a separate file (that keeps the compiler from finding and eliminating it as well).
Submit
Plot the results, using a semilog plot. Turn in both the table and the plot. Also report on which system you ran your test. Submit your code with your table.
    n
    Time for loop (s)
    Rate (MB/s)
    250
    
    
    1000
    
    
    5000
    
    
    10000
    
    
    50000
    
    
    100000
    
    
    500000
    
    
    1000000
    
    
    5000000
    
    
Note
A MB is a Megabyte = 106 bytes (note: 220 bytes is a Mebibyte, written MiB, not a Megabyte). To time the loop, you may use any accurate timer. You may also look at the code in Part 2 and use the same timer used for that code.
When timing the loop, pick nk such that the time measured for both loops is about one second. Report the total time divided by nk as the “Time for loop”. When computing the rate, count each byte written or read as a separate byte. I.e., for n=10 and for 8 byte doubles, this copy loop moves n*8*2=10*8*2=160 bytes.
Part 2:
Download the STREAM benchmark from http:
www.cs.virginia.edu/stream/ and run it. You may use either the C or Fortran version. Make sure that the a
ay size is chosen so that it is at least 4 times the size of the cache on the processor that you are using. Report the results and indicate which version you ran (C or Fortran) in submission.
Part 3:
Using the COPY bandwidth from STREAM (Part 2) to compute a performance expectation for the sparse matrix-vector multiply operation. Assume that the read and write bandwidths are the same. Turn in the formula that you used, the values of the parameters, and the performance expectation in GigaFLOPS.
Think about the following (but you don’t need to turn in):
1. How could you estimate a value of nk given an assumption that memory bandwidth is between 1 and 20 Gigabytes/sec?
2. How could you choose N to efficiently plot the different regions of memory performance?
3. How does the performance expectation in Part 3 compare to (a) the peak performance of your processor and (b) the performance of a sparse matrix-vector multiply (if you want to do this, either write a simple test program or use a li
ary routine that performs a sparse matrix-vector multiply).
Answered Same Day Oct 31, 2022

Solution

Jahir Abbas answered on Nov 01 2022
59 Votes
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here