# hw5 - CS 6143 COMPUTER ARCHITECTURE II HOMEWORK V FALL 2010...


CS 6143 COMPUTER ARCHITECTURE II, FALL 2010
HOMEWORK V
Polytechnic Institute of NYU
Handout No: 10, November 10, 2010
DUE: December 1, 2010

READ:

- Related portions of Chapters 2, 3, 4 and Appendix H of the Hennessy book
- Related portions of Chapters 1, 2, 3, 4 and 7 of the Jordan book

ASSIGNMENT: There are six problems. Solve all homework and exam problems as shown in class and past exam solutions.

1) Define the following terms related to parallelism and computational methods:

   a) Degree of parallelism

   b) Computational granularity

2) Consider the following high-level language loop:

       for (i = 1; i < 100; i = i + 1) {
           a[i]   = b[i] + c[i];   /* S1 */
           b[i]   = a[i] + d[i];   /* S2 */
           a[i+1] = a[i] + e[i];   /* S3 */
       }

   i) List the dependencies.

   ii) Rewrite the high-level language loop based on your dependence list so that there is loop-level parallelism. That is, all iterations must be independent of each other, so that the loop body statements for all iterations can be performed in parallel. It would then not matter whether iteration 28 or iteration 43 is done first; the result will be correct. Note that you will NOT compile the loop to assembly code.

3) Develop a sequential Binary Search algorithm to search for element “k” in a one-dimensional array “A” whose “n” elements are already ordered. If the search is successful, return the “index” of the element, i.e. A[index] = k. Otherwise, return -1. Specify the time complexity of your algorithm.

4) Consider the sequential Binary Search algorithm from Question 3 above. Convert it to a parallel binary search PRAM algorithm that searches for element “k” in a one-dimensional vector “A” with “n” ordered elements on “p” processors. If a processor finds “k”, it returns “index”, where A[index] = k. Note that “index” is a global variable. If “k” is not found by any processor, then “index” contains -1.

   - Indicate the time complexity of the algorithm. Can it be cost efficient? Explain.
   - Make observations relevant to the execution of your PRAM algorithm, including the data decomposition, load balancing, the communication graph, etc.

5) Develop a cost efficient dot-product PRAM algorithm on two vectors, “A” and “B”, with “p” processors. Store the result in “k”, which is a global variable. To keep each processor’s partial dot-product result, use vector “D” with “p” elements. To store the result in “k”, processor 0 copies D[0] to k.

   - Indicate the time complexity of your algorithm.
   - Make observations relevant to the execution of your PRAM algorithm, including the data decomposition, load balancing, the communication graph, etc.
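As an illustration of the kind of transformation Problem 2 asks for, the following is a minimal sketch of one possible rewrite, not the course's official solution. It assumes integer arrays of length 101 (the loop touches indices 1 through 100); the names `N`, `loop_original`, `loop_rewritten`, and `check_equivalence` are hypothetical. The sketch relies on the observation that each value S3 writes to `a[i+1]` is overwritten by S1 in the next iteration, so only the final write to `a[100]` survives the loop.

```c
#include <assert.h>

#define N 101  /* assumed array length: indices 1..100 are accessed */

/* The loop as given in the handout. */
void loop_original(int a[N], int b[N], const int c[N],
                   const int d[N], const int e[N])
{
    for (int i = 1; i < 100; i = i + 1) {
        a[i]     = b[i] + c[i];   /* S1 */
        b[i]     = a[i] + d[i];   /* S2 */
        a[i + 1] = a[i] + e[i];   /* S3 */
    }
}

/* One possible rewrite: every iteration reads and writes only index i,
 * so the iterations could run in any order. The single surviving
 * S3 write (to a[100]) is peeled out after the loop. */
void loop_rewritten(int a[N], int b[N], const int c[N],
                    const int d[N], const int e[N])
{
    for (int i = 1; i < 100; i = i + 1) {
        int t = b[i] + c[i];      /* private temporary holds the new a[i] */
        a[i] = t;                 /* S1 */
        b[i] = t + d[i];          /* S2 */
    }
    a[100] = a[99] + e[99];       /* last S3, the only one not overwritten */
}

/* Runs both versions on the same arbitrary input and compares results. */
void check_equivalence(void)
{
    int a1[N], b1[N], a2[N], b2[N], c[N], d[N], e[N];
    for (int i = 0; i < N; i++) {
        a1[i] = a2[i] = 7 * i;
        b1[i] = b2[i] = i + 1;
        c[i] = 2 * i;
        d[i] = 3;
        e[i] = i - 7;
    }
    loop_original(a1, b1, c, d, e);
    loop_rewritten(a2, b2, c, d, e);
    for (int i = 0; i < N; i++) {
        assert(a1[i] == a2[i]);
        assert(b1[i] == b2[i]);
    }
}
```

Calling `check_equivalence()` compares the two versions on one arbitrary input; it is a sanity check, not a proof that the dependence analysis is complete.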
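For Problem 3, a sequential binary search might be sketched as follows. This is an illustrative version under stated assumptions, not the solution shown in class: the array holds `int` values in ascending order, indexing is zero-based, and the function name `binary_search` is hypothetical. Each loop iteration halves the remaining range, which gives the usual O(log n) running time.

```c
#include <assert.h>

/* Iterative binary search over A[0..n-1], assumed sorted ascending.
 * Returns an index with A[index] == k, or -1 if k is absent. */
int binary_search(const int A[], int n, int k)
{
    assert(n >= 0);
    int low = 0;
    int high = n - 1;

    while (low <= high) {
        /* Written this way to avoid overflow in low + high. */
        int mid = low + (high - low) / 2;

        if (A[mid] == k) {
            return mid;          /* successful search */
        } else if (A[mid] < k) {
            low = mid + 1;       /* k can only be in the upper half */
        } else {
            high = mid - 1;      /* k can only be in the lower half */
        }
    }
    return -1;                   /* unsuccessful search */
}
```

If the array contains duplicates of `k`, this sketch returns the index of whichever matching element it probes first, which still satisfies A[index] = k.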
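The structure Problem 5 describes (per-processor partial results in “D”, final result taken from D[0]) can be sketched as a sequential simulation in C. This is only an illustration under assumptions that the handout does not state: a block decomposition of the vectors, a tree reduction over D, `double` elements, and the hypothetical function name `pram_dot_product`. Each outer `for` over `j` stands in for a PRAM step in which all processors would act at once. Under these assumptions the parallel time is O(n/p + log p), so the cost p · (n/p + log p) stays O(n) as long as p log p = O(n).

```c
#include <assert.h>

/* Sequential simulation of a p-processor PRAM dot product.
 * D must have room for p elements; the return value plays the role of k. */
double pram_dot_product(const double A[], const double B[],
                        int n, int p, double D[])
{
    assert(n >= 0 && p > 0);

    /* Phase 1: "for all processors j in parallel", each handles one
     * contiguous block of roughly n/p elements (block decomposition). */
    for (int j = 0; j < p; j++) {
        int lo = (int)((long long)j * n / p);
        int hi = (int)((long long)(j + 1) * n / p);
        double sum = 0.0;
        for (int i = lo; i < hi; i++) {
            sum += A[i] * B[i];
        }
        D[j] = sum;
    }

    /* Phase 2: tree reduction over D in ceil(log2 p) steps. In each step,
     * processor j adds the partial sum held stride positions to its right. */
    for (int stride = 1; stride < p; stride *= 2) {
        for (int j = 0; j + stride < p; j += 2 * stride) {
            D[j] += D[j + stride];
        }
    }

    return D[0];  /* processor 0 copies D[0] to k */
}
```

In the reduction phase half of the active processors drop out at every step, which is one of the load-balancing observations the problem asks about.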

## This note was uploaded on 02/02/2011 for the course CS 6143 taught by Professor Hadimioglu during the Fall '10 term at NYU Poly.
