l8-handout


Carnegie Mellon

Lecture 8
Software Pipelining

I. Introduction
II. Problem Formulation
III. Algorithm

Reading: Chapter 10.5 – 10.6

M. Lam    CS243: Software Pipelining

I. Example of DoAll Loops

Machine:
    Per clock: 1 read, 1 write, 1 (2-stage) arithmetic op,
    with hardware loop op and auto-incrementing addressing mode.

Source code:
    For i = 1 to n
        D[i] = A[i] * B[i] + c

Code for one iteration:
    1. LD R5,0(R1++)
    2. LD R6,0(R2+
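
The machine description above already limits how fast iterations of this loop can complete in steady state. The sketch below is an illustration only: it assumes one iteration of D[i] = A[i] * B[i] + c needs two reads, two arithmetic ops (one multiply, one add) and one write, and the names CAPACITY, USES and resource_bound are hypothetical, not taken from the lecture.

```python
from math import ceil

# Units available per clock, as stated for the example machine.
CAPACITY = {"read": 1, "write": 1, "arith": 1}

# Assumed operation counts for one iteration of D[i] = A[i] * B[i] + c:
# two loads, one multiply plus one add, one store.
USES = {"read": 2, "write": 1, "arith": 2}


def resource_bound(uses, capacity):
    """Fewest clocks per iteration that the resources alone permit."""
    return max(ceil(uses[r] / capacity[r]) for r in uses)


print(resource_bound(USES, CAPACITY))  # prints 2
```

Under these assumptions no schedule can sustain better than one iteration every 2 clocks, because both the read port and the arithmetic unit are used twice per iteration.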

Unrolling

     1. L: LD
     2.    LD
     3.    LD
     4.    MUL LD
     5.    MUL LD
     6.    ADD LD
     7.    ADD LD
     8.    ST MUL LD
     9.    MUL
    10.    ST ADD
    11.    ADD
    12.    ST
    13.    ST BL (L)

Let u be the degree of unrolling:
    Length of u iterations = 7 + 2(u - 1)
    Execution time per source iteration = (7 + 2(u - 1)) / u = 2 + 5/u

Software Pipelined Code

     1. LD
     2. LD
     3. MUL LD
     4. LD
     5. MUL LD
     6. ADD LD
     7. MUL LD
     8. ST ADD LD
     9. MUL LD
    10. ST ADD LD
    11. MUL
    12. ST ADD
    13.
    14. ST ADD
    15.
    16. ST

Unlike unrolling, software pipelining can give optimal result.
Locally compacted code may not be globally optimal
DOALL: Can fill arbitrarily long pipelines with infinitely many iterations
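
The unrolling formula can be tabulated to see how slowly it approaches 2 clocks per iteration. This is a small sketch of the arithmetic only; the function names unrolled_length and time_per_iteration are hypothetical.

```python
from fractions import Fraction


def unrolled_length(u):
    """Clocks needed for u unrolled iterations: 7 + 2(u - 1)."""
    return 7 + 2 * (u - 1)


def time_per_iteration(u):
    """Clocks per source iteration, which simplifies to 2 + 5/u."""
    return Fraction(unrolled_length(u), u)


for u in (1, 2, 4, 8, 16):
    # Columns: degree of unrolling, total clocks, clocks per source iteration.
    print(u, unrolled_length(u), float(time_per_iteration(u)))
```

For u = 4 this gives 13 clocks, matching the 13-row unrolled listing, and 3.25 clocks per source iteration. The per-iteration time stays above 2 for every finite u, and the code size grows with u.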
Example of DoAcross Loop

Loop:
    Sum = Sum + A[i];
    B[i] = A[i] * c;

Code for one iteration:
    1. LD
    2. MUL
    3. ADD
    4. ST

Software Pipelined Code
    1. LD
    2. MUL
    3. ADD LD
    4. ST MUL
    5. ADD
    6. ST

Doacross loops
    Recurrences can be parallelized
    Harder to fully utilize hardware with large degrees of parallelism

II. Problem Formulation

Goals:
    maximize throughput
    small code size
Find:
    an identical relative schedule S(n) for every iteration
    a constant initiation interval (T)
such that
    the initiation interval is minimized
Complexity:
    NP-complete in general
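
One way to read the formulation: if every iteration uses the same relative schedule and a new iteration starts every T clocks, then two ops conflict exactly when they use the same resource at offsets that are equal modulo T. The sketch below checks only that resource condition for a fixed schedule. It is not the lecture's algorithm, it ignores data dependences, and it treats the 2-stage arithmetic unit as accepting one op per clock. The offsets in COMPACT, DELAYED and DOACROSS are assumptions read off the example listings (clock 1 is offset 0), and all names are hypothetical.

```python
from collections import Counter

# Units available per clock on the example machine.
CAPACITY = {"read": 1, "write": 1, "arith": 1}


def fits(schedule, T, capacity=CAPACITY):
    """True if overlapping copies started every T clocks never oversubscribe
    a resource. schedule is a list of (offset, resource) pairs."""
    usage = Counter((offset % T, res) for offset, res in schedule)
    return all(count <= capacity[res] for (_, res), count in usage.items())


def min_interval(schedule, capacity=CAPACITY, limit=64):
    """Smallest T up to limit for which this fixed schedule fits, else None."""
    for T in range(1, limit + 1):
        if fits(schedule, T, capacity):
            return T
    return None


# Assumed DoAll schedule packed as tightly as possible: LD LD MUL . ADD . ST
COMPACT = [(0, "read"), (1, "read"), (2, "arith"), (4, "arith"), (6, "write")]
# Assumed DoAll schedule with the ADD one clock later: LD LD MUL . . ADD . ST
DELAYED = [(0, "read"), (1, "read"), (2, "arith"), (5, "arith"), (7, "write")]
# Assumed DoAcross schedule: LD MUL ADD ST
DOACROSS = [(0, "read"), (1, "arith"), (2, "arith"), (3, "write")]

print(min_interval(COMPACT))   # prints 3
print(min_interval(DELAYED))   # prints 2
print(min_interval(DOACROSS))  # prints 2
```

Under these assumptions the tightly packed schedule only admits T = 3, because its MUL and ADD land in the same slot modulo 2, while the schedule that is one clock longer admits T = 2. Searching over both the schedule and T at once is the hard part that the complexity remark refers to; this sketch fixes the schedule and searches T only.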