w12Color

# w12Color - Loop Parallelization Techniques and dependence...


## Loop Parallelization Techniques and dependence analysis

• Data-Dependence Analysis
• Dependence-Removing Techniques
• Parallelizing Transformations
• Performance-enhancing Techniques

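## When can we run code in parallel?

• Two regions of code can be run in parallel when no dependences exist across the statements to be run in parallel.

    for (i = 0; i < n; i++) {
        c[i] = a[i] * b[i] + c[i];
    }

    a = b + c
    x = y + z
    u = a + x

In the three-statement example, the first two statements are independent of each other, while the third reads a and x and must follow both.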
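
As a rough illustration of why the `for` loop above has no dependences across iterations, the sketch below runs it in forward and in reverse iteration order. The function names `fma_forward` and `fma_reverse` are hypothetical and not from the slides. Each iteration touches only element `i` of each array, so any iteration order should produce the same `c`.

```c
#include <stddef.h>

/* Forward order: i = 0, 1, ..., n-1 (the loop from the slide). */
void fma_forward(size_t n, const double *a, const double *b, double *c) {
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] * b[i] + c[i];
    }
}

/* Reverse order: i = n-1, ..., 0. Same body, different iteration order. */
void fma_reverse(size_t n, const double *a, const double *b, double *c) {
    for (size_t i = n; i-- > 0; ) {
        c[i] = a[i] * b[i] + c[i];
    }
}
```

Calling both functions on identical copies of the inputs and comparing the two `c` arrays element by element is one simple way to check the claim on sample data; it is a test on one input, not a proof of independence.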

## Some motivating examples

    do i = 1, n
      a(i) = b(i)      ! S1
      c(i) = a(i-1)    ! S2
    end do

Is it legal to
• run the i loop in parallel?
• put S2 first in the loop?

    do i = 1, n
      a(i) = b(i)
    end do
    do i = 1, n
      c(i) = a(i-1)
    end do

Is it legal to
• fuse the two i loops?

We need to determine if, and in what order, two references access the same memory location. We can then determine whether the references might execute in a different order after some transformation.

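## Dependence, an example

    do i = 1, n
      a(i) = b(i)      ! S1
      c(i) = a(i-1)    ! S2
    end do

Memory accesses in each iteration, in execution order:

| Iteration | S1 reads | S1 writes | S2 reads | S2 writes |
|---|---|---|---|---|
| i = 1 | b(1) | a(1) | a(0) | c(1) |
| i = 2 | b(2) | a(2) | a(1) | c(2) |
| i = 3 | b(3) | a(3) | a(2) | c(3) |
| i = 4 | b(4) | a(4) | a(3) | c(4) |
| i = 5 | b(5) | a(5) | a(4) | c(5) |
| i = 6 | b(6) | a(6) | a(5) | c(6) |

The arcs in the original figure (not reproduced here) indicate dependences, i.e. the statement at the head of the arc is somehow dependent on the statement at the tail. In this loop, the value a(i) written by S1 in iteration i is read by S2 in iteration i+1.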
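
The dependences in this example can also be found by brute force. The sketch below, using a hypothetical helper `find_flow_dependences` that is not part of the slides, enumerates every pair of iterations in which S1 writes the element that S2 reads, and records the iteration distance between the write and the read. It is hard-wired to the subscripts `a(i)` and `a(i-1)` of this loop and is not a general dependence test.

```c
/* Enumerate pairs (iw, ir) such that S1 in iteration iw writes a(iw)
 * and S2 in iteration ir reads a(ir-1), with both naming the same element.
 * Returns the number of such pairs. *distance receives ir - iw for the
 * pairs found (0 if there are none). */
int find_flow_dependences(int n, int *distance) {
    int count = 0;
    *distance = 0;
    for (int iw = 1; iw <= n; iw++) {         /* S1 writes a(iw)   */
        for (int ir = 1; ir <= n; ir++) {     /* S2 reads  a(ir-1) */
            if (iw == ir - 1) {               /* same memory location */
                *distance = ir - iw;
                count++;
            }
        }
    }
    return count;
}
```

For n = 6 this finds the five write/read pairs between consecutive iterations, each at distance 1, which matches the table of accesses above.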

## Can this loop be run in parallel?

    do i = 1, n
      a(i) = b(i)      ! S1
      c(i) = a(i-1)    ! S2
    end do

Each iteration i reads b(i), writes a(i), reads a(i-1), and writes c(i), as in the previous example.

Assume one iteration per processor. If for some reason some iterations execute out of lock-step, bad things can happen. In this case, if iteration i=3 runs ahead of iteration i=2, the read of a(2) in i=3 gets an invalid value, because iteration i=2 has not yet written a(2).
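
One way to see the problem without real threads is to run the iterations in a different order. The sketch below is an illustration under assumptions: the names `run_in_order` and `run_reversed` are hypothetical, the arrays hold n + 1 elements so that index 0 stands for a(0), and reverse order stands in for "later iterations running ahead of earlier ones".

```c
#include <stddef.h>

/* Serial order i = 1..n, as in the original loop. */
void run_in_order(size_t n, double *a, const double *b, double *c) {
    for (size_t i = 1; i <= n; i++) {
        a[i] = b[i];       /* S1 */
        c[i] = a[i - 1];   /* S2: reads the value S1 wrote in iteration i-1 */
    }
}

/* Reverse order i = n..1: every iteration runs before its predecessor. */
void run_reversed(size_t n, double *a, const double *b, double *c) {
    for (size_t i = n; i >= 1; i--) {
        a[i] = b[i];       /* S1 */
        c[i] = a[i - 1];   /* S2: a(i-1) has not been written yet (stale) */
    }
}
```

If `a` initially holds a marker value such as -1 in elements 1..n, the reversed run leaves that marker in `c(2)` through `c(n)`, whereas the in-order run copies the values of `b`.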

## Can we change the order of the statements?

Original loop:

    do i = 1, n
      a(i) = b(i)      ! S1
      c(i) = a(i-1)    ! S2
    end do

Reordered loop:

    do i = 1, n
      c(i) = a(i-1)    ! S2
      a(i) = b(i)      ! S1
    end do

Access order before statement reordering:

    i=1: b(1) a(1) a(0) c(1)
    i=2: b(2) a(2) a(1) c(2)
    i=3: b(3) a(3) a(2) c(3)
    i=4: b(4) a(4) a(3) c(4)

Access order after statement reordering:

    i=1: a(0) c(1) b(1) a(1)
    i=2: a(1) c(2) b(2) a(2)
    i=3: a(2) c(3) b(3) a(3)
    i=4: a(3) c(4) b(4) a(4)

There is no problem with a serial execution: in both orders, a(i-1) is read in iteration i, after it was written in iteration i-1.
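
The two statement orders can be compared directly in a serial run. The following sketch uses hypothetical function names `loop_s1_s2` and `loop_s2_s1` and assumes arrays of n + 1 elements with index 0 standing for a(0).

```c
#include <stddef.h>

/* Original order: S1 then S2. */
void loop_s1_s2(size_t n, double *a, const double *b, double *c) {
    for (size_t i = 1; i <= n; i++) {
        a[i] = b[i];       /* S1 */
        c[i] = a[i - 1];   /* S2 */
    }
}

/* Reordered: S2 then S1. a(i-1) was still written in iteration i-1,
 * so S2 sees the same value as in the original order. */
void loop_s2_s1(size_t n, double *a, const double *b, double *c) {
    for (size_t i = 1; i <= n; i++) {
        c[i] = a[i - 1];   /* S2 */
        a[i] = b[i];       /* S1 */
    }
}
```

Run on identical inputs, both versions should leave the same contents in `a` and `c`. This only checks the serial case discussed on the slide.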

## Can we fuse the loop?

Unfused loops:

    do i = 1, n
      a(i) = b(i)      ! S1
    end do
    do i = 1, n
      c(i) = a(i-1)    ! S2
    end do

Fused loop:

    do i = 1, n
      a(i) = b(i)      ! S1
      c(i) = a(i-1)    ! S2
    end do

In the original execution of the unfused loops:
1. The read of a(i-1) in S2 gets the value assigned by S1 (the assignment to a(i) in iteration i-1 of the first loop).
2. The values assigned to a(i) and c(i) must not be overwritten.
3. The value of b(i) comes from outside the loop.

After fusing:
1. This is OK, because a(i-1) still gets the value assigned in the previous iteration.
2. There is no "output" dependence on a(i) or c(i); they are not overwritten.
3. There is no flow (true) dependence on b(i), so its value comes from outside of the loop nest.
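
The same kind of check works for fusion. The sketch below, with hypothetical names `run_unfused` and `run_fused` and arrays of n + 1 elements (index 0 stands for a(0)), runs the two separate loops and the fused loop so their results can be compared.

```c
#include <stddef.h>

/* Two separate loops: all of S1, then all of S2. */
void run_unfused(size_t n, double *a, const double *b, double *c) {
    for (size_t i = 1; i <= n; i++) {
        a[i] = b[i];       /* S1 */
    }
    for (size_t i = 1; i <= n; i++) {
        c[i] = a[i - 1];   /* S2 */
    }
}

/* Fused loop: S1 and S2 in the same iteration. When S2 reads a(i-1),
 * S1 has already written it in iteration i-1. */
void run_fused(size_t n, double *a, const double *b, double *c) {
    for (size_t i = 1; i <= n; i++) {
        a[i] = b[i];       /* S1 */
        c[i] = a[i - 1];   /* S2 */
    }
}
```

On identical inputs the two functions should produce identical `a` and `c`, consistent with the three points listed above.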

## Types of dependence

Flow or true dependence: data for a read comes from a previous write (write/read hazard in hardware terms).

    a(2) = …
    … = a(2)

Anti-dependence: write to a location cannot occur before a previous read is finished.

    … = a(2)
    a(2) = …

Output dependence: write to a location must wait for …

    a(2) = …
    a(2) = …
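
The three statement pairs can be written out as small C functions. This is a sketch with hypothetical function names. The source text is cut off in the output-dependence case, so the comments there rely on the usual definition (two writes to the same location, where the later write must determine the final value), which is an assumption beyond what the slide shows.

```c
/* Flow (true) dependence: S2 reads the value S1 wrote. */
int flow_example(void) {
    int a[3] = {0, 0, 0};
    a[2] = 7;            /* S1: write a(2) */
    int x = a[2];        /* S2: read a(2), must come after S1 */
    return x;
}

/* Anti-dependence: S1 reads the old value before S2 overwrites it. */
int anti_example(void) {
    int a[3] = {0, 0, 5};
    int x = a[2];        /* S1: read a(2), sees the old value 5 */
    a[2] = 9;            /* S2: write a(2), must not move above S1 */
    return x;
}

/* Output dependence: two writes to the same location. */
int output_example(void) {
    int a[3] = {0, 0, 0};
    a[2] = 1;            /* S1: first write */
    a[2] = 2;            /* S2: second write, must stay last */
    return a[2];
}
```

Swapping the two statements in any of these functions changes the returned value, which is what makes each pair a dependence.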