cs140-machines

Parallel Computers Today
Oak Ridge / Cray Jaguar: > 1.75 PFLOPS
Two Nvidia 8800 GPUs: > 1 TFLOPS
Intel 80-core chip: > 1 TFLOPS
TFLOPS = 10^12 floating point ops/sec
PFLOPS = 10^15 floating point ops/sec (1,000,000,000,000,000 / sec)

Supercomputers 1976: Cray-1, 133 MFLOPS (10^6)
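To put the unit prefixes side by side, here is a small sketch that compares the Cray-1 figure with the Jaguar figure quoted on the first slide. The variable and function names are made up for illustration, and since the slide gives Jaguar only as "> 1.75 PFLOPS", the result is a lower bound rather than an exact ratio.

```python
# Unit prefixes as defined on the slides.
MFLOPS = 10**6    # mega: 10^6 floating point ops/sec
TFLOPS = 10**12   # tera: 10^12 floating point ops/sec
PFLOPS = 10**15   # peta: 10^15 floating point ops/sec

# Figures taken from the slides (Jaguar is quoted as a lower bound).
cray1_flops = 133 * MFLOPS
jaguar_flops = 1.75 * PFLOPS


def speed_ratio(new_flops, old_flops):
    """Return how many times faster new_flops is than old_flops."""
    return new_flops / old_flops


ratio = speed_ratio(jaguar_flops, cray1_flops)
print(f"Jaguar is at least {ratio:.3g} times faster than the Cray-1")
```

With these numbers the ratio comes out at roughly 13 million.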
Trends in processor clock speed

AMD Opteron 12-core chip
AMD Opteron 6-core layout detail

The nVidia G80 GPU
128 streaming floating point processors @ 1.5 GHz
1.5 GB shared RAM with 86 GB/s bandwidth
500 GFLOPS on one chip (single precision)
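As a rough consistency check on the G80 numbers, the sketch below divides the quoted peak rate by the total number of processor cycles per second. The variable names are hypothetical. The slide does not say how many floating point operations each processor completes per cycle, so the result is only an inference from the three quoted figures, not a hardware specification.

```python
# Figures quoted on the G80 slide.
processors = 128              # streaming floating point processors
clock_hz = 1.5e9              # 1.5 GHz
claimed_peak_flops = 500e9    # 500 GFLOPS, single precision

# Total processor cycles available per second across the chip.
cycles_per_second = processors * clock_hz

# Implied floating point operations per processor per cycle.
# This is derived from the slide's numbers, not stated on the slide.
flops_per_processor_per_cycle = claimed_peak_flops / cycles_per_second

print(f"Chip-wide cycles per second: {cycles_per_second:.3g}")
print(f"Implied flops per processor per cycle: {flops_per_processor_per_cycle:.2f}")
```

A value above 1 suggests that the peak figure counts more than one floating point operation per processor per cycle, for example a combined multiply and add, though the slide itself does not say so.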
More Detail on GPU Architecture

Cray XMT (highly multithreaded shared memory)
[Figure labels: i = n, i = 3, i = 2]

[Figure: multithreading diagram. Labels: programs running in parallel (serial code, subproblem A, subproblem B, loop iterations i = 0, i = 1, ..., i = n); concurrent threads of computation (1, 2, 3, 4, ...); hardware streams (128), including unused streams; instruction ready pool; pipeline of executing instructions]

• Top 500 List http://www.top500.org/list/2010/11/100
• Graph 500 List http://www.graph500.org/Results.html

Generic Parallel Machine Architecture
• Key architecture question: Where is the interconnect, and how fast?
• Key algorithm question: Where is the data?
[Figure: storage hierarchy. Three processor nodes, each drawn as Proc, Cache, L2 Cache, L3 Cache, Memory; remaining labels truncated in source]
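One way to see why "Where is the data?" matters is to estimate the average cost of a memory access through a storage hierarchy like the one sketched above. The following is a minimal model, not something taken from the slides: the function name, the hit fractions, and the latencies are all hypothetical values chosen for illustration.

```python
def average_access_time(levels):
    """Expected cycles per access for a hierarchy searched level by level.

    levels: list of (hit_fraction, latency_cycles), fastest level first.
    The last level should have hit_fraction 1.0 so every access resolves.
    """
    expected = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for hit_fraction, latency_cycles in levels:
        expected += reach * latency_cycles  # each access reaching here pays this latency
        reach *= 1.0 - hit_fraction         # the misses continue to the next level
    return expected


# Hypothetical numbers, for illustration only (not from the slides).
hierarchy = [
    (0.90, 1),     # Cache
    (0.50, 10),    # L2 Cache
    (0.50, 40),    # L3 Cache
    (1.00, 200),   # Memory
]

print(f"Average access time: {average_access_time(hierarchy):.2f} cycles")
```

With these made-up numbers the average is about 9 cycles, and more than half of that comes from the 2.5% of accesses that go all the way to memory, which is why data placement tends to dominate algorithm design on such machines.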