set-nv-org

nv-org-1: NVIDIA GPU Microarchitecture

These Notes: NVIDIA GPU Microarchitecture
Current state of notes: Under construction.

EE 7700-2 Lecture Transparency. Formatted 11:32, 29 April 2011 from set-nv-org.

nv-org-2: References

“NVIDIA GeForce 8800 GPU Architecture Overview,” NVIDIA Technical Brief TB-02787-001 v01, November 2006.
“NVIDIA CUDA Compute Unified Device Architecture Programming Guide Version 3.2,” 22 October 2010.

nv-org-3: Organization Overview

Software Organization Overview

CPU code runs on the host, GPU code runs on the device.
A kernel consists of multiple threads.
Threads execute in 32-thread groups called warps.
Threads are grouped into blocks.
A collection of blocks is called a grid.

nv-org-4: Hardware Organization Overview

GPU chip consists of one or more multiprocessors.
A multiprocessor consists of 1 (CC 1.x) or 2 (CC 2.x) schedulers.
A multiprocessor consists of 8 to 48 CUDA cores.
A multiprocessor consists of functional units of several types.
GPU chip consists of one or more L2 Cache Units for mem access.
Multiprocessors connect to L2 Cache Units via a crossbar.
Each L2 Cache Unit has its own interface to device memory.

nv-org-5: Execution Overview

Up to eight blocks are active in a multiprocessor.
The scheduler chooses a warp for execution from active blocks.
Over 2 to 32 cycles instructions in a warp are dispatched.
Instructions in a warp are dispatched to functional units.
The number of cycles to dispatch all instructions depends on:
  The number of functional units of the needed type.
  Possible resource contention.
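The software and execution overviews above can be made concrete with a little arithmetic. What follows is a minimal sketch, not a description of how the hardware is documented to work. It assumes a one-dimensional launch and a simplified dispatch model in which each functional unit of the needed type accepts one thread per cycle, with no contention. The function names (`warps_per_block`, `global_thread_index`, `dispatch_cycles`) are hypothetical and chosen only for illustration.

```python
import math

WARP_SIZE = 32  # from the notes: threads execute in 32-thread groups called warps


def warps_per_block(threads_per_block: int) -> int:
    """Warps needed to hold one block; the last warp may be only partly filled."""
    return math.ceil(threads_per_block / WARP_SIZE)


def global_thread_index(block_index: int, threads_per_block: int, thread_index: int) -> int:
    """Position of a thread within the grid (assumes a 1-D grid of 1-D blocks)."""
    return block_index * threads_per_block + thread_index


def dispatch_cycles(units_of_needed_type: int) -> int:
    """Assumed model: cycles to dispatch one instruction for all 32 threads of a
    warp when each unit takes one thread per cycle. Resource contention, which
    the notes list as a second factor, is ignored here."""
    return math.ceil(WARP_SIZE / units_of_needed_type)


if __name__ == "__main__":
    print(warps_per_block(100))            # a 100-thread block
    print(global_thread_index(2, 256, 5))  # thread 5 of block 2, 256 threads per block
    for units in (16, 8, 1):
        print(units, dispatch_cycles(units))
```

Under this assumed model, 16 units of the needed type give 2 cycles and a single unit gives 32 cycles, which is consistent with the 2-to-32-cycle range quoted in the execution overview.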
nv-org-6: Storage Overview

Device memory hosts a 32- or 64-bit global address space.
Each MP has a set of temporary registers split amongst threads.
Instructions can access a cache-backed constant space.
Instructions can access high-speed shared memory.
Instructions can access low-speed local memory.
Instructions can access global space through a low-speed [sic] texture cache using texture or surface spaces.

nv-org-7: Thread Organization

Warp
Block
Grid

nv-org-8: Multiprocessor

Functional Units
Scheduler
Memory

nv-org-9: Register Types

Thread Register Types
General Purpose Registers: R0-......
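Because each multiprocessor's registers are split amongst its threads, the register file bounds how many blocks can be active at once. The sketch below illustrates that trade-off under loud assumptions: registers are divided evenly with no allocation granularity, the register-file size is a caller-supplied parameter (the notes above do not state one, so the 16384 used in the demo is purely illustrative), and both helper names are hypothetical. The cap of eight active blocks is taken from the execution overview.

```python
MAX_ACTIVE_BLOCKS = 8  # from the notes: up to eight blocks are active in a multiprocessor


def registers_per_thread(registers_per_mp: int, active_blocks: int, threads_per_block: int) -> int:
    """Even split of the register file across all active threads (assumed model)."""
    if active_blocks < 1 or threads_per_block < 1:
        raise ValueError("need at least one block and one thread per block")
    return registers_per_mp // (active_blocks * threads_per_block)


def active_blocks_limit(registers_per_mp: int, regs_per_thread: int, threads_per_block: int) -> int:
    """Blocks that fit in the register file, capped at the eight-block limit.
    Other limits (shared memory, thread count) are deliberately ignored."""
    if regs_per_thread < 1 or threads_per_block < 1:
        raise ValueError("need at least one register per thread and one thread per block")
    fit = registers_per_mp // (regs_per_thread * threads_per_block)
    return min(MAX_ACTIVE_BLOCKS, fit)


if __name__ == "__main__":
    REGS = 16384  # illustrative register-file size, not taken from the notes
    print(registers_per_thread(REGS, 8, 256))
    print(active_blocks_limit(REGS, 16, 256))
    print(active_blocks_limit(REGS, 4, 64))
```

Real hardware allocates registers in chunks and applies further limits, so these numbers should be read as upper bounds under the stated model rather than as predictions.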

This note was uploaded on 01/03/2012 for the course EE 7700 taught by Professor Staff during the Spring '08 term at LSU.
