PR2Solution - CS 6290 High-Performance Computer...

CS 6290: High-Performance Computer Architecture
Spring 2009
Project 2
Due: March 31st (see T-Square)

This project is intended to help you learn more about multi-core execution. You will submit a report for this project. To complete this project, you should know how to set up the simulator and run a simulation. You should also know how to compile a simple application using the cross-compiler. Project 1 has instructions on how to accomplish these tasks. Each part of this project assignment specifies what should be reported to complete that part of the project.

As explained in the course rules, this is an individual project: no collaboration with other students or anyone else is allowed.

Part 1 [25 points]: Running a parallel application

Set up the simulator as explained in Project 1. This time, we will be compiling and running a parallel application called fft from the Splash-2 benchmark suite (which is often used to evaluate shared-memory parallel machines). First, make an fft directory in your ~/sim directory, then copy the source code of the application there:

    cd ~/sim
    mkdir fft
    cd fft
    cp /CS6290/fft.c .

Now compile this application:

    /CS6290/mipsrt/cross-tools/bin/mips-unknown-linux-gnu-gcc -O3 -g -static -fno-delayed-branch -fno-optimize-sibling-calls -msplit-addresses -mabi=32 -march=mips4 -o fft.mipseb fft.c -lm -lpthread

Note that we are using the same options as we did in Project 1, except that we are now telling the compiler to use the math library (-lm) and the pthread library (-lpthread) when linking the application.

Now that we have a MIPS executable file for this application, we can run some simulations. The fft application has two main parameters, -m <size> and -p <nthreads>, where <size> determines the problem size and <nthreads> specifies how many threads to use. This application performs a Fast Fourier Transform on an array that has 2^<size> elements, so be careful to specify <size> correctly: if you add 1 to the <size> parameter,
the simulation time will more than double, so you can accidentally end up with a simulation that runs for days or even months.

You must also edit the configuration file (sesc.conf) and set the “NoMigration” option to false. This option controls whether threads can move from one processor to another. Without migration, two threads might compete for the same processor forever (neither can move to another) while some other processor remains unused.

To complete this part of the project, run the fft application in the simulator using 1, 2, 4, 8, and 16 processors (cores), first with a <size> of 8 and then with a <size> of 14, then answer the following:

The execution times (SimTime from report.pl) are:

             -p 1          -p 2          -p 4          -p 8          -p 16
    -m 8     0.150 msec    0.157 msec    0.161 msec    0.167 msec    0.182 msec
    -m 14    3.090 msec    2.137 msec    1.789 msec    1.656 msec    1.630 msec

A) What is the parallel speedup with 2, 4, 8, and 16 processors with <size> of 8?
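
The speedup asked for in A) can be worked out directly from the SimTime values reported above. The following is a minimal Python sketch, assuming the usual definition of parallel speedup (single-processor time divided by p-processor time). The names PROCS, SIM_TIME_MS, and speedups are illustrative only and are not part of the project or of report.pl.

```python
# SimTime values (msec) copied from the reported results, keyed by -m <size>.
# Each list is ordered to match PROCS.
PROCS = [1, 2, 4, 8, 16]
SIM_TIME_MS = {
    8: [0.150, 0.157, 0.161, 0.167, 0.182],
    14: [3.090, 2.137, 1.789, 1.656, 1.630],
}


def speedups(times):
    """Return T(1)/T(p) for each entry; times[0] must be the 1-processor run.

    Assumption: speedup is defined relative to the -p 1 run of the same <size>.
    """
    base = times[0]
    return [base / t for t in times]


if __name__ == "__main__":
    for size, times in SIM_TIME_MS.items():
        for p, s in zip(PROCS, speedups(times)):
            print(f"-m {size} -p {p}: speedup {s:.3f}")
```

With the -m 8 times, every value for more than one processor comes out below 1, because each of those runs took longer than the single-processor run.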