hw2_Naser - CSE721 Winter 2011 Homework 2 Naser Sedaghati...

CSE721 Winter 2011 Homework 2 Naser Sedaghati (200116698)
Problem1: 4.3 from text

1) Run time for case a): The standard all-to-all broadcast on the ring takes $T = (P - 1)(t_s + m t_w)$.

2) Run time for case b): When the hypercube all-to-all algorithm is implemented on the ring, there are $\log P$ steps in total. At step $i$, each message traverses $K = 2^{i-1}$ hops and each node has $L = 2^{\log P - i}$ messages to send out. This assumes that the nodes are labeled from 0 to P-1 and that the links are unidirectional. So, if node x is sending to node y and x > y, the message has to go all the way around the ring to reach y, instead of just going backwards. For x < y, a message from x to y passes through $y - x - 1$ intermediate nodes, while a message from y to x passes through $P - (y - x) - 1$. Since $K L = 2^{i-1} \cdot 2^{\log P - i} = P/2$ at every step, the overall running time is the following summation:

$T = \sum_{i=1}^{\log P} (t_s + K L m t_w) = t_s \log P + \frac{P}{2} m t_w \log P = \log P \, (t_s + \frac{P}{2} m t_w)$

3) Which one of a) or b) is better if m is large? If m is large, the $t_s$ term is no longer significant, so a) and b) can be rewritten as follows:

a) $T = (P - 1) m t_w$
b) $T = \frac{P}{2} m t_w \log P$

These equations show that, when m is large enough, case a) out-performs case b), because case a) is O(P) and case b) is O(P log P).

4) Which one of a) or b) is better if m is very small? If m is very small (i.e. m = 1 word), and assuming $t_s = 100 t_w$, we can rewrite a) and b) as follows:

a) $T = (P - 1) t_s + (P - 1) t_w = 101 (P - 1) t_w$
b) $T = t_s \log P + \frac{P}{2} t_w \log P = \log P \, (100 + P/2) \, t_w$

According to these equations, case a) again out-performs case b) asymptotically, because its running time is O(P) compared with O(P log P) for case b). However, once the constant 100 is taken into account, the condition T(b) > T(a) holds only when P is a very large number (around $2^{200}$). Since this is an unrealistic number of processors, we may assume that case b) will perform better than case a) in practice. The same conclusion follows from ignoring the $t_w$ terms and comparing only the $t_s$ terms, $(P - 1) t_s$ for case a) against $t_s \log P$ for case b).
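The comparison between cases a) and b) can be checked numerically. The following is a minimal Python sketch, not part of the original solution. It assumes P is a power of two, logarithms are base 2, and times are measured in units of t_w (so ts = 100, tw = 1). The function names are hypothetical, and integer arithmetic is used so that very large P is compared exactly.

```python
def ring_time(p, m, ts, tw):
    """Case a): standard ring all-to-all broadcast, (P - 1) * (ts + m * tw)."""
    return (p - 1) * (ts + m * tw)


def hypercube_on_ring_time(p, m, ts, tw):
    """Case b): log2(P) steps, each costing ts + (P / 2) * m * tw.

    Assumes p is a power of two (p >= 2), so log2(p) and p / 2 are integers.
    """
    steps = p.bit_length() - 1  # equals log2(p) when p is a power of two
    return steps * (ts + (p // 2) * m * tw)


def crossover_exponent(m, ts, tw, max_exp=512):
    """Smallest k >= 2 such that case b) is slower than case a) at P = 2**k.

    Returns None if no such k is found up to max_exp.
    """
    for k in range(2, max_exp + 1):
        p = 2 ** k
        if hypercube_on_ring_time(p, m, ts, tw) > ring_time(p, m, ts, tw):
            return k
    return None


if __name__ == "__main__":
    # Small message (m = 1 word), ts = 100 * tw, as assumed in part 4.
    print("P = 1024, case a):", ring_time(1024, 1, 100, 1))
    print("P = 1024, case b):", hypercube_on_ring_time(1024, 1, 100, 1))
    print("small-m crossover exponent:", crossover_exponent(1, 100, 1))
    # Large message: case b) is slower even at small P.
    print("large-m crossover exponent:", crossover_exponent(10**6, 100, 1))
```

Under these assumptions, with m = 1 the search finds that case b) first becomes slower than case a) at k = 202, that is P = 2^202. With m = 10^6 it is already slower at P = 4.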
Problem2: 4.5 from text

Suppose we have 8 processors connected by a complete binary tree, where all the inner nodes are switches and processors appear only at the leaves. The following figure shows such a configuration:

[Figure: complete binary tree with switches at the inner nodes and the 8 processors at the leaves]

For this case, where P = 8, the all-to-all communication algorithm works as follows:

step-1) The following pairs of nodes exchange messages bidirectionally: (0 ↔ 4), (1 ↔ 5), (2 ↔ 6), (3 ↔ 7). Note that, for example, (0 ↔ 4) means that processor 0 sends a message to processor 4 and, at the same time, processor 4 sends a message back to processor 0.

step-2) Now, the following pairs carry on the communication in this way:

This note was uploaded on 03/08/2012 for the course CSE 721 taught by Professor Saday during the Winter '11 term at Ohio State.
