Ch07-AdvCompArch-ManycoresAndGPUs-PaulKelly-V03

Completely different address being accessed by each


...load, store and texture)
– Unified path to global for loads and stores
[Diagram: Shared Memory, L1 Cache, L2 Cache, Global Memory]
Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben

Intel’s Larrabee – Xeon Phi
• Project to build a high-end GPU that bridges the gap to conventional multicore
• Each core is a simple in-order 4-way SMT x86
• Extended with a SIMD instruction set (16 floats wide)
• Special-purpose hardware for texture cache, not much else
• Both L1 (32KB per core) and L2 (256KB per core) data caches are coherent
• Larrabee GPU project cancelled but re-emerged as a compute accelerator: Intel's Many Integrated Core (MIC), codenamed “Knights Corner”, launched as “Xeon Phi”
Diagram: Intel

SIMT vs SIMD – GPUs without the hype
• GPUs combine many architectural tec...
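
To make the "16 floats wide" SIMD extension listed for Larrabee more concrete, here is a minimal Python sketch of how a loop over floats maps onto 16-lane vector operations, with a per-lane mask covering the final partial vector. It is an illustrative model only, not Intel's instruction set: the names VECTOR_WIDTH, masked_vector_add and simd_add_arrays are hypothetical, and real hardware does this in a single instruction per vector rather than in a Python loop.

```python
VECTOR_WIDTH = 16  # lanes per vector register ("16 floats wide" on the slide)


def masked_vector_add(a, b, mask):
    """Model one 16-lane vector add; lanes whose mask bit is off produce 0.0."""
    assert len(a) == len(b) == len(mask) == VECTOR_WIDTH
    return [x + y if active else 0.0 for x, y, active in zip(a, b, mask)]


def simd_add_arrays(xs, ys):
    """Add two equal-length float lists, VECTOR_WIDTH elements per vector op.

    Returns (result, number_of_vector_ops) so the saving over a scalar
    loop (one op per element) is visible.
    """
    assert len(xs) == len(ys)
    out = []
    vector_ops = 0
    for start in range(0, len(xs), VECTOR_WIDTH):
        active = min(VECTOR_WIDTH, len(xs) - start)
        # Only the first `active` lanes hold real data in the last chunk.
        mask = [lane < active for lane in range(VECTOR_WIDTH)]
        pad = [0.0] * (VECTOR_WIDTH - active)
        a = xs[start:start + active] + pad
        b = ys[start:start + active] + pad
        result = masked_vector_add(a, b, mask)
        out.extend(result[:active])  # discard the masked-off lanes
        vector_ops += 1
    return out, vector_ops


if __name__ == "__main__":
    total, ops = simd_add_arrays([1.0] * 20, [2.0] * 20)
    print(len(total), ops)
```

With 20 elements the sketch issues two vector operations, the second with only four active lanes, where a scalar loop would issue 20 additions.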