chapter5-m1-ziavras


• …ual registers from 152 to 240
• Increased the size of several issue queues
The Power5 core is about 24% larger than the Power4 core

Initial Performance of SMT
• Pentium 4 Extreme SMT yields a 1.01 speedup for the SPECint_rate benchmark and 1.07 for SPECfp_rate
  – Pentium 4 is dual-threaded SMT
  – SPECRate requires that each SPEC benchmark be run against a vendor-selected number of copies of the same benchmark
• Running on Pentium 4, each of 26 SPEC benchmarks paired with every other
  – Speed-up: 0.90 to 1.58; average: 1.20
• Power 5, 8-processor server: 1.23 faster for SPECint_rate with SMT, 1.16 faster for SPECfp_rate
• Power 5 running 2 copies of each app: speed-up 0.89 to 1.41
  – Most gained some
  – FP apps had most cache conflicts and least gains

Head to Head ILP competition

Processor | Microarchitecture | Fetch / Issue / Execute | Functional units (FU) | Clock rate (GHz) | Transistors | Die size | Power
Intel Pentium 4 Extreme | Speculative dynamically scheduled; deeply pipelined; SMT | 3/3/4 | 7 int., 1 FP | 3.8 | 125 M | 122 mm2 | 115 W
AMD Athlon 64 FX-57 | Speculative dynamically scheduled | 3/3/4 | 6 int., 3 FP | 2.8 | 114 M | 115 mm2 | 104 W
IBM Power5 (1 CPU only) | Speculative dynamically scheduled; SMT; 2 CPU cores/chip | 8/4/8 | 6 int., 2 FP | 1.9 | 200 M | 300 mm2 (est.) | 80 W (est.)
Intel Itanium 2 | Statically scheduled VLIW-style (EPIC) | 6/5/11 | 9 int., 2 FP | 1.6 | 592 M | 423 mm2 | 130 W

Performance on SPECint2000
[Figure: SPEC Ratio (axis 0 to 3500) for Itanium 2, Pentium 4, AMD Athlon 64, and Power 5 on gzip, vpr, gcc, mcf, crafty, parser, eon, perlbmk, gap, vortex, bzip2, twolf]

Performance on SPECfp2000
[Figure: SPEC Ratio (axis 0 to 14000) for Itanium 2, Pentium 4, AMD Athlon 64, and Power 5 on wupwise, swim, mgrid, applu, mesa, galgel, art, equake, facerec, ammp, lucas, fma3d, sixtrack, apsi]

Normalized Performance: Efficiency
• Power5 best in FP/Watt
[Figure: normalized efficiency (axis 0 to 35) for Itanium 2, Pentium 4, AMD Athlon 64, and Power 5 on SPECInt / M Transistors, SPECFP / M Transistors, SPECInt / mm^2, SPECFP / mm^2, SPECInt / Watt, SPECFP / Watt]

Rank | Itanium 2 | Pentium 4 | Athlon | Power5
Int/Trans | 4 | 2 | 1 | 3
FP/Trans | 4 | 2 | 1 | 3
Int/area | 4 | 2 | 1 | 3
FP/area | 4 | 2 | 1 | 3
Int/Watt | 4 | 3 | 1 | 2
FP/Watt | 2 | 4 | 3 | 1

No Silver Bullet for ILP
• No obvious overall leader in performance
• The AMD Athlon leads on SPECInt performance, followed by the Pentium 4, Itanium 2, and Power5
• Itanium 2 and Power5, which perform similarly on SPECFP, clearly dominate the Athlon and Pentium 4 on SPECFP
• Itanium 2 is the most inefficient processor for both FP and integer code on all but one efficiency measure (SPECFP/Watt)
• Athlon and Pentium 4 both make good use of transistors and area i...
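
Each efficiency measure in the normalized comparison above is a benchmark score divided by a cost (millions of transistors, die area, or power), with the processors then ranked on that ratio. The following is a minimal Python sketch of that calculation, not the slides' own method. The cost figures are read from the head-to-head table; the function name `efficiency_ranks` and the uniform `PLACEHOLDER_SCORES` are assumptions introduced purely for illustration, since the preview gives no numeric SPEC scores.

```python
# Cost figures as read from the head-to-head table
# (the Power5 die size and power are marked as estimates there).
CHIPS = {
    "Pentium 4 Extreme": {"mtrans": 125, "mm2": 122, "watts": 115},
    "Athlon 64 FX-57": {"mtrans": 114, "mm2": 115, "watts": 104},
    "Power5 (1 CPU)": {"mtrans": 200, "mm2": 300, "watts": 80},
    "Itanium 2": {"mtrans": 592, "mm2": 423, "watts": 130},
}

# ASSUMPTION: these are NOT measured SPEC results. Every chip gets the
# same placeholder score, so the ranking below reflects cost alone.
PLACEHOLDER_SCORES = {chip: 1.0 for chip in CHIPS}


def efficiency_ranks(scores, costs):
    """Rank chips by score / cost; rank 1 is the most efficient."""
    ratio = {chip: scores[chip] / costs[chip] for chip in scores}
    ordered = sorted(ratio, key=ratio.get, reverse=True)
    return {chip: rank for rank, chip in enumerate(ordered, start=1)}


# Usage: rank per Watt, per million transistors, and per mm2.
for cost_key in ("watts", "mtrans", "mm2"):
    costs = {chip: spec[cost_key] for chip, spec in CHIPS.items()}
    print(cost_key, efficiency_ranks(PLACEHOLDER_SCORES, costs))
```

Substituting real SPECint or SPECfp scores for the placeholders would yield the six measures (Int and FP per transistor, per area, and per Watt). A chip with a high raw score can still rank low once the score is divided by a large transistor count or die area.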