# 583L15 - EECS 583 Class 15 Register Allocation University...


EECS 583 – Class 15 Register Allocation University of Michigan November 2, 2011

- 1 - Announcements + Reading Material

Midterm exam: Monday, Nov 14?
» Could also do Wednesday, Nov 9 (next week!) or Wednesday, Nov 16 (2 weeks from now)
» Class vote

Today’s class reading
» “Register Allocation and Spilling Via Graph Coloring,” G. Chaitin, Proc. 1982 SIGPLAN Symposium on Compiler Construction, 1982.

Next class reading (more at the end of class)
» “Revisiting the Sequential Programming Model for Multi-Core,” M. J. Bridges, N. Vachharajani, Y. Zhang, T. Jablin, and D. I. August, Proc. 40th IEEE/ACM International Symposium on Microarchitecture, December 2007.
- 2 - Homework Problem – Answers in Red

latencies: add=1, mpy=3, ld=2, st=1, br=1

for (j=0; j<100; j++) b[j] = a[j] * 26

    LC = 99
    Loop:
    1: r3 = load(r1)
    2: r4 = r3 * 26
    3: store (r2, r4)
    4: r1 = r1 + 4
    5: r2 = r2 + 4
    7: brlc Loop

How many resources of each type are required to achieve an II=1 schedule?
» For II=1, each operation needs a dedicated resource, so: 3 ALU, 2 MEM, 1 BR

If the resources are non-pipelined, how many resources of each type are required to achieve II=1?
» Instead of 1 ALU to do the multiplies, 3 are needed, and instead of 1 MEM to do the loads, 2 are needed. Hence: 5 ALU, 3 MEM, 1 BR

Assuming pipelined resources, generate the II=1 modulo schedule.
» See next few slides
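The resource counts in the two answers above can be checked with a short calculation. The following is a minimal sketch, not the course's own tooling: the function name `resources_needed` and the table `OPS` are hypothetical, and it assumes that a pipelined unit is busy for one cycle per operation while a non-pipelined unit is busy for the operation's full latency.

```python
from collections import defaultdict
from math import ceil

# Hypothetical table: op number -> (resource type, latency), taken from the
# loop body and latencies on the slide. Multiplies run on an ALU.
OPS = {
    1: ("MEM", 2),  # r3 = load(r1)
    2: ("ALU", 3),  # r4 = r3 * 26
    3: ("MEM", 1),  # store(r2, r4)
    4: ("ALU", 1),  # r1 = r1 + 4
    5: ("ALU", 1),  # r2 = r2 + 4
    7: ("BR", 1),   # brlc Loop
}

def resources_needed(ops, ii, pipelined):
    """Return the number of units of each type needed to sustain the given II."""
    busy = defaultdict(int)
    for kind, latency in ops.values():
        # Assumption: a non-pipelined unit is occupied for the whole latency.
        busy[kind] += 1 if pipelined else latency
    # Each unit offers `ii` busy cycles per iteration window.
    return {kind: ceil(cycles / ii) for kind, cycles in busy.items()}

print(resources_needed(OPS, ii=1, pipelined=True))
print(resources_needed(OPS, ii=1, pipelined=False))
```

Under these assumptions the pipelined case gives 3 ALU, 2 MEM, 1 BR and the non-pipelined case gives 5 ALU, 3 MEM, 1 BR, matching the answers on the slide.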

- 3 - HW continued

[Figure: dependence graph over operations 1, 2, 3, 4, 5, 7 (same as example in class); edge labels as extracted: 1,1 3,0 2,0 1,1 1,1 1,1 1,1 0,0 0,0]

RecMII = 1
ResMII = 1
MII = MAX(1,1) = 1

DSA converted code below (same as example in class)

    LC = 99
    Loop:
    1: r3[-1] = load(r1[0])
    2: r4[-1] = r3[-1] * 26
    3: store (r2[0], r4[-1])
    4: r1[-1] = r1[0] + 4
    5: r2[-1] = r2[0] + 4
    remap r1, r2, r3, r4
    7: brlc Loop

Assume II=1 so resources are: 3 ALU, 2 MEM, 1 BR

Priorities
» 1: H = 5
» 2: H = 3
» 3: H = 0
» 4: H = 4
» 5: H = 0
» 7: H = 0
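The MII bound on this slide can be reproduced with a small calculation. This is a sketch under stated assumptions: the helper names `res_mii` and `rec_mii` are hypothetical, resources are taken to be pipelined, and the only recurrences fed in are the two self-dependences visible in the code (`r1 = r1 + 4` and `r2 = r2 + 4`), each with latency 1 and distance 1. The full dependence graph is not used.

```python
from math import ceil

def res_mii(uses, units):
    """Resource-constrained bound: busiest resource type, pipelined units assumed."""
    return max(ceil(uses[kind] / units[kind]) for kind in uses)

def rec_mii(cycles):
    """Recurrence-constrained bound over (total latency, total distance) cycles."""
    return max((ceil(lat / dist) for lat, dist in cycles), default=0)

uses = {"ALU": 3, "MEM": 2, "BR": 1}   # operations per iteration, by type
units = {"ALU": 3, "MEM": 2, "BR": 1}  # machine resources assumed on the slide
cycles = [(1, 1), (1, 1)]              # assumed: the r1 and r2 increment recurrences

mii = max(rec_mii(cycles), res_mii(uses, units))
print(mii)
```

With these inputs both bounds evaluate to 1, so MII = 1, in agreement with the slide.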
- 4 - HW continued

resources: 3 alu, 2 mem, 1 br
latencies: add=1, mpy=3, ld=2, st=1, br=1

    LC = 99
    Loop:
    1: r3[-1] = load(r1[0])
    2: r4[-1] = r3[-1] * 26
    3: store (r2[0], r4[-1])
    4: r1[-1] = r1[0] + 4
    5: r2[-1] = r2[0] + 4
    remap r1, r2, r3, r4
    7: brlc Loop

[Figure: MRT with columns alu0, alu1, alu2, m1, m2, br and a single row for time 0; rolled schedule; unrolled schedule with time axis 0 to 6, stage 1 through stage 6]

Scheduling steps:
» Schedule brlc at time II-1
» Schedule op1 at time 0
» Schedule op4 at time 0
» Schedule op2 at time 2
» Schedule op3 at time 5
» Schedule op5 at time 5
» Schedule op7 at time 5
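The role of the MRT in the scheduling steps above can be sketched as follows: an operation scheduled at time t reserves its resource in row t mod II, and a placement is rejected if that slot is already taken. The function `build_mrt` and the table `SCHEDULE` are hypothetical names. The times come from the slide, but the assignment of each operation to a specific unit (alu0, m1, and so on) is an assumption, since the figure did not survive extraction.

```python
def build_mrt(schedule, ii):
    """Map each (row, resource) slot to the op that reserves it.

    schedule: op number -> (scheduled time, resource unit).
    Raises ValueError if two ops need the same unit in the same row.
    """
    mrt = {}
    for op, (time, resource) in schedule.items():
        slot = (time % ii, resource)  # modulo wrap-around into the II-row table
        if slot in mrt:
            raise ValueError(f"op {op} conflicts with op {mrt[slot]} at {slot}")
        mrt[slot] = op
    return mrt

# Times from the slide; the specific unit per op is assumed for illustration.
SCHEDULE = {
    1: (0, "m1"),
    4: (0, "alu0"),
    2: (2, "alu1"),
    3: (5, "m2"),
    5: (5, "alu2"),
    7: (5, "br"),
}

mrt = build_mrt(SCHEDULE, ii=1)
print(sorted(mrt.items()))
```

With II=1 the table has a single row, so every operation lands in row 0 and all six units are occupied. That is why each operation needs a dedicated resource.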

- 5 - HW continued

    LC = 99
    Loop:
    r3[-1] = load(r1[0]) if p1[0];
    r4[-1] = r3[-1] * 26 if p1[2];
    store (r2[0], r4[-1]) if p1[5];
    r1[-1] = r1[0] + 4 if p1[0];
    r2[-1] = r2[0] + 4 if p1[5];
    brf Loop

The final loop consists of a single MultiOp containing 6 operations, each predicated on the appropriate staging predicate. Note register allocation still needs to be performed.
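The staging predicates in the code above line up with the schedule times from the previous slide. The sketch below shows one way to derive them, assuming the predicate index is the scheduled time divided by II (integer division), which for II=1 is just the time. The names `predicate_ops`, `BODY`, and `TIMES` are hypothetical, and the loop-back branch is left out because the slide shows it without a predicate.

```python
def predicate_ops(body, times, ii):
    """Attach staging predicate p1[time // ii] to each operation's text."""
    return [f"{text} if p1[{times[op] // ii}];" for op, text in body.items()]

# DSA-converted operations from the slide, keyed by op number.
BODY = {
    1: "r3[-1] = load(r1[0])",
    2: "r4[-1] = r3[-1] * 26",
    3: "store (r2[0], r4[-1])",
    4: "r1[-1] = r1[0] + 4",
    5: "r2[-1] = r2[0] + 4",
}
# Scheduled times from the scheduling steps on the previous slide.
TIMES = {1: 0, 4: 0, 2: 2, 3: 5, 5: 5}

for line in predicate_ops(BODY, TIMES, ii=1):
    print(line)
```

Under that assumption the printed lines reproduce the five predicated operations shown on the slide.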
- 6 - Register Allocation: Problem Definition

Through optimization, assume an infinite number of virtual registers
» Now, must allocate these infinite virtual registers to a limited
