ECE 475/CS 416 Computer Architecture
Hardware Support for Software Approaches
Edward Suh
Computer Systems Laboratory
suh@csl.cornell.edu
ECE 475/CS 416 Computer Architecture, Fall 2007, Prof. Suh

Announcements

Review
- Wide-issue processors require lots of ILP to fill the pipeline
  - Little's law
- How can compilers extract ILP?
  - Basic pipeline scheduling
  - Loop unrolling
  - Software pipelining
  - Trace scheduling
- Dependence analysis is key for scheduling

Dependence Analysis
- In general, how can a compiler determine whether a loop is parallel?
  - It needs to detect whether there are any dependences
- Nearly all algorithms assume array indices are affine
  - An array index is affine if it can be written in the form a*i + b, where a and b are constants and i is the loop index variable
  - For multidimensional arrays, each index must be affine
- Deciding whether there is a dependence between two accesses is the same as deciding whether two affine functions can have the same value for different indices within the bounds of the loop

Dependence Analysis Algorithm
- Assume we store to an element with index a*i + b and load from an element with index c*i + d, where i runs from m to n
- A dependence exists if both of the following hold:
  - there are two iterations j and k such that m <= j, k <= n
  - there is a store to the element with index a*j + b and a later load from the same array element with index c*k + d, i.e. a*j + b = c*k + d
- Unfortunately, in general this cannot be determined at compile time
  - a, b, c, d may not be known (could be in other arrays!)
  - they might be known at compile time but expensive to compute
  - accesses depend on indices of multiple nested loops...

GCD Test
- Luckily, many programs use indices known at compile time
  - a, b, c, d are often constants
- One simple and efficient test is the GCD test
  - GCD is the greatest common divisor (remember 4th grade?)
- Observation: if a loop-carried dependence exists, then GCD(c, a) must divide (d - b)
  - In other words, (d - b)/GCD(c, a) is an integer
- Why?
  - If a*j + b = c*k + d, then a*j - c*k = d - b
  - Therefore (a/GCD(a, c))*j - (c/GCD(a, c))*k = (d - b)/GCD(a, c)
  - The left-hand side is an integer, which implies that GCD(a, c) divides (d - b). Q.E.D.

Example
- Is the following loop parallel?

    for (j = 0; j <= 100; j++) {
        A[2*j+3] = A[2*j] * 5.0;
    }

- First, are the indices affine?
- What are the values of a, b, c, d?
- What is GCD(a, c)? What is (d - b)?
- Does GCD(a, c) divide (d - b)?
- When GCD(a, c) does not divide (d - b), the GCD test is sufficient to guarantee that there is no dependence
- The converse does not hold: the test can report a possible dependence when none actually exists (false positives)
  - Can anyone think of why?...
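The GCD test can be sketched in C as below. This is a minimal illustration, not code from the lecture: the function names gcd, gcd_test_may_depend, and brute_force_depends are hypothetical, and the handling of a = c = 0 is an added assumption for the degenerate case the slides do not discuss. The brute-force routine is included only to cross-check the test on small loops; it looks for any pair of iterations that touch the same element and ignores which access comes first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Greatest common divisor of |x| and |y| using Euclid's algorithm. */
long gcd(long x, long y) {
    x = labs(x);
    y = labs(y);
    while (y != 0) {
        long r = x % y;
        x = y;
        y = r;
    }
    return x;
}

/* GCD test for a store to A[a*i + b] and a load from A[c*i + d].
 * Returns false when a dependence is impossible,
 * true when a dependence may (but need not) exist. */
bool gcd_test_may_depend(long a, long b, long c, long d) {
    long g = gcd(a, c);
    if (g == 0) {
        /* Assumption for the degenerate case a == c == 0:
         * both indices are constants, so compare them directly. */
        return b == d;
    }
    return (d - b) % g == 0;
}

/* Exhaustive check over iterations m..n (hypothetical helper):
 * true if some pair (j, k), including j == k, satisfies
 * a*j + b == c*k + d. Access order is not considered. */
bool brute_force_depends(long a, long b, long c, long d, long m, long n) {
    assert(m <= n);
    for (long j = m; j <= n; j++) {
        for (long k = m; k <= n; k++) {
            if (a * j + b == c * k + d) {
                return true;
            }
        }
    }
    return false;
}
```

For the example loop, the store index is 2*j + 3 and the load index is 2*j, so a = 2, b = 3, c = 2, d = 0. GCD(2, 2) = 2 does not divide d - b = -3, so gcd_test_may_depend(2, 3, 2, 0) returns false and the loop carries no dependence. By contrast, gcd_test_may_depend(2, 0, 2, 300) returns true, yet brute_force_depends(2, 0, 2, 300, 0, 100) returns false: the matching iterations would be 150 apart, outside the loop bounds, which the GCD test never looks at.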
This note was uploaded on 02/19/2008 for the course ECE 4750 taught by Professor Suh during the Fall '07 term at Cornell University (Engineering School).
