ECE 475/CS 416 Computer Architecture - Hardware Support for Software Approaches
Edward Suh, Computer Systems Laboratory, suh@csl.cornell.edu
ECE 475/CS 416 Computer Architecture, Fall 2007, Prof. Suh

Announcements

Review
  - Wide-issue processors require lots of ILP to fill the pipeline
      - Little's law
  - How can compilers extract ILP?
      - Basic pipeline scheduling
      - Loop unrolling
      - Software pipelining
      - Trace scheduling
  - Dependence analysis is a key for scheduling

Dependence Analysis
  - In general, how can a compiler determine if a loop is parallel?
      - need to detect if there are any dependences
  - Nearly all algorithms assume array indices are affine
      - an array index is affine if it can be written in the form a*i + b
      - a and b are constants and i is the loop index variable
      - for multi-dimensional arrays, each index must be affine
  - Deciding whether there is a dependence between two accesses is the same as deciding whether two affine functions can have the same value for different indices within the bounds of the loop

Dependence Analysis Algorithm
  - Assume we store an element with index value a*i + b and load an element with index c*i + d, where i runs from m to n
  - A dependence exists if both of the following hold:
      - there are two iterations j and k such that m <= j, k <= n
      - there is a store to an element with index a*j + b and a later load from the same array element with index c*k + d, i.e. a*j + b = c*k + d
  - Unfortunately, in general this cannot be determined at compile time
      - a, b, c, d may not be known (could be in other arrays!)
      - might be known at compile time but expensive to compute
      - accesses depend on indices of multiple nested loops...

GCD Test
  - Luckily, many programs use indices known at compile time
      - a, b, c, d are often constants
  - One simple and efficient test is the GCD test
      - GCD is the greatest common divisor (remember 4th grade?)
  - Observation: if a loop-carried dependence exists, then GCD(c,a) must divide (d-b)
      - in other words, (d-b)/GCD(c,a) is an integer
  - Why?
      - if a*j + b = c*k + d, then a*j - c*k = d - b
      - therefore (a/GCD(a,c))*j - (c/GCD(a,c))*k = (d-b)/GCD(a,c)
      - the left-hand side is an integer, which implies GCD(a,c) divides (d-b). Q.E.D.

Example
  - Is the following loop parallel?

        for (j = 0; j <= 100; j++) {
            A[2*j+3] = A[2*j] * 5.0;
        }

  - First, are the indices affine?
      - what are the values of a, b, c, d?
      - what is GCD(a,c)? (d-b)?
      - does GCD(a,c) divide d-b?
  - The GCD test is sufficient to guarantee no dependences: if GCD(a,c) does not divide (d-b), no dependence can exist
      - but it is not necessary (there are false positives): the test can pass even though no dependence exists
      - can anyone think of why?...
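The GCD test can be sketched in a few lines of C. This is only an illustrative sketch, not code from the lecture: the function names gcd, gcd_test_may_depend, and exact_same_element are hypothetical, and the store index a*i + b and load index c*i + d follow the slide notation. The second function is a brute-force search over the loop bounds m..n, added here only to compare against the GCD test; it looks for any pair of iterations that touch the same element and ignores which access comes first.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Greatest common divisor of |x| and |y| (Euclid's algorithm). */
static int gcd(int x, int y) {
    x = abs(x);
    y = abs(y);
    while (y != 0) {
        int t = x % y;
        x = y;
        y = t;
    }
    return x;
}

/* GCD test for a store to A[a*i + b] and a load from A[c*i + d].
 * Returns false when a dependence is ruled out, true when one MAY exist.
 * Hypothetical helper: the name and signature are assumptions. */
bool gcd_test_may_depend(int a, int b, int c, int d) {
    int g = gcd(a, c);
    if (g == 0) {
        /* a == c == 0: both indices are constants b and d. */
        return b == d;
    }
    /* A dependence requires GCD(a,c) to divide (d - b). */
    return (d - b) % g == 0;
}

/* Brute-force reference: do any iterations j, k in m..n satisfy
 * a*j + b == c*k + d? Unlike the GCD test, this respects loop bounds. */
bool exact_same_element(int a, int b, int c, int d, int m, int n) {
    for (int j = m; j <= n; j++) {
        for (int k = m; k <= n; k++) {
            if (a * j + b == c * k + d) {
                return true;
            }
        }
    }
    return false;
}
```

For the example loop, a = 2, b = 3, c = 2, d = 0, so GCD(a,c) = 2 does not divide d - b = -3 and gcd_test_may_depend(2, 3, 2, 0) returns false: the store always hits an odd index and the load an even one. One likely source of false positives is that the test ignores the loop bounds. With a store to A[i] and a load from A[i+200] over i = 0..100 (an assumed example, not from the slides), the GCD test reports a possible dependence because 1 divides 200, while exact_same_element(1, 0, 1, 200, 0, 100) finds no pair of iterations in range that touch the same element.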

This note was uploaded on 02/19/2008 for the course ECE 4750 taught by Professor Suh during the Fall '07 term at Cornell University (Engineering School).
