Final Exam Solutions




elements that are zero

(3) Reduction tree:
• In parallel, have each odd-numbered processor Px send its count to its neighbor P(x-1), and have the neighbor compute a subtotal.
• In parallel, have P2 send its count to P0 and P6 send its count to P4; have P0 and P4 compute subtotals.
• Have P4 send its count to P0, and have P0 compute the final total.

(b) How much time will each step take?

(1) We need to send N/8 elements from P0 to each of the other 7 processors. It takes X cycles to send A elements, so this step takes: 7 * X * ceil(N/(8A))

(2) Each processor must compare each of its N/8 elements to zero. As stated in the problem, we do not need to worry about the time to increment the counter. All processors do this step in parallel, so we only need the time for a single processor to do this step: Y * (N/8)

(3) For the reduction tree, the amount of time to compute a subtotal is not specified; full credit was given for either Y or zero cycles. In any event, each of the 3 steps of the reduction tree requires one value to be communicated, which takes X cycles (= X * ceil(1/A)), so the answer is either 3X or 3(X+Y), depending on your assumptions.

[Figure from 9.3, Multiprocessors Connected by a Single Bus: cache state transition diagram with states Invalid (not valid cache block), Shared (clean), and Modified (dirty). a. Cache state...]
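The timing analysis in part (b) and the reduction tree in step (3) can be checked with a short sketch. This is an illustration only, not part of the original solution: the function names are hypothetical, and it assumes 8 processors, that N is divisible by 8, and that the processor count is a power of two.

```python
def distribute_cycles(n, x, a, procs=8):
    """Step (1): P0 sends n/procs elements to each of the other procs-1
    processors; sending up to `a` elements costs `x` cycles.
    Computes (procs-1) * X * ceil(N / (procs * A)) with integer arithmetic."""
    chunk = procs * a
    return (procs - 1) * x * ((n + chunk - 1) // chunk)


def compare_cycles(n, y, procs=8):
    """Step (2): each processor compares its n/procs elements to zero at
    `y` cycles per comparison; all processors work in parallel."""
    return y * (n // procs)


def reduction_cycles(x, subtotal_cost, procs=8):
    """Step (3): log2(procs) tree levels, each sending one value (x cycles)
    plus an assumed subtotal cost (Y or 0, as the solution allows)."""
    levels = (procs - 1).bit_length()  # equals log2(procs) for powers of two
    return levels * (x + subtotal_cost)


def tree_reduce(counts):
    """Simulate the reduction tree: at each level, P(i + stride) sends its
    count to P(i), which adds it to its own. Returns (total, levels)."""
    counts = list(counts)  # copy so the caller's list is not modified
    stride, levels = 1, 0
    while stride < len(counts):
        for i in range(0, len(counts), 2 * stride):
            counts[i] += counts[i + stride]
        stride *= 2
        levels += 1
    return counts[0], levels
```

With made-up values N = 64, X = 10, Y = 2, A = 4, `distribute_cycles(64, 10, 4)` gives 7 * 10 * ceil(64/32) = 140 cycles, `compare_cycles(64, 2)` gives 16 cycles, and `reduction_cycles(10, 0)` and `reduction_cycles(10, 2)` give 30 and 36 cycles, matching 3X and 3(X+Y).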

This note was uploaded on 02/08/2014 for the course CS 351 taught by Dr. Suzanne Rivoire during the Fall '13 term at Sonoma.
