06b CUDA Basics (NVIDIA)

CUDA Basics
© NVIDIA Corporation 2009

CUDA: A Parallel Computing Architecture for NVIDIA GPUs

Supports standard languages and APIs:
• C
• OpenCL
• Fortran (PGI)
• DX Compute

Supported on common operating systems:
• Windows
• Mac OS
• Linux

Arrays of Parallel Threads

• A CUDA kernel is executed by an array of threads
• All threads run the same code
• Each thread has an ID that it uses to compute memory addresses and make control decisions

[Figure: a row of threads labeled by threadID, each running the code below]

    …
    float x = input[threadID];
    float y = func(x);
    output[threadID] = y;
    …

Example: Increment Array Elements

CPU program:

    void increment_cpu(float *a, float b, int N)
    {
        // One loop iteration per element
        for (int idx = 0; idx < N; idx++)
            a[idx] = a[idx] + b;
    }

    int main()
    {
        .....
        increment_cpu(a, b, N);
    }

CUDA program:

    __global__ void increment_gpu(float *a, float b, int N)
    {
        // One thread per element: global index from block and thread IDs
        int idx = blockIdx.x * blockDim.x + threadIdx.x;
        // Guard: the last block may contain threads past the end of the array
        if (idx < N)
            a[idx] = a[idx] + b;
    }

    int main()
    {
        …..
        dim3 dimBlock(blocksize);
        // Round up so that every element is covered by some thread
        dim3 dimGrid(ceil(N / (float)blocksize));
        increment_gpu<<<dimGrid, dimBlock>>>(ad, bd, N);
    }

Outline of CUDA Basics

• Basic Memory Management
• Basic Kernels and Execution on GPU
• Coordinating CPU and GPU Execution
• Development Resources
• See the Programming Guide for the full API

Basic Memory Management

Memory Spaces

• CPU and GPU have separate memory spaces
  • Data is moved across the PCIe bus
  • Use functions to allocate/set/copy memory on the GPU
  • Very similar to the corresponding C functions
• Pointers are just addresses
  • Can't tell from the pointer value whether the address is on the CPU or the GPU
  • Must exercise care when dereferencing: dereferencing a CPU pointer on the GPU will likely crash, and vice versa

GPU Memory Allocation / Release

Host (CPU) manages device (GPU) memory:

    cudaMalloc(void **pointer, size_t nbytes)
    cudaMemset(void *pointer, int value, size_t count)
    cudaFree(void *pointer)

    int n = 1024;
    int nbytes = n * sizeof(int);
    int *d_a = 0;
    cudaMalloc((void **)&d_a, nbytes);
    cudaMemset(d_a, 0, nbytes);
    cudaFree(d_a);

Data Copies

    cudaMemcpy(void *dst, const void *src, size_t nbytes, enum cudaMemcpyKind direction);

• returns after the copy is complete
• blocks the CPU thread until all bytes have been copied
• doesn't start copying until previous CUDA calls complete

enum cudaMemcpyKind:
• cudaMemcpyHostToDevice
• cudaMemcpyDeviceToHost
• cudaMemcpyDeviceToDevice

Non-blocking memcopies are provided

Code Walkthrough 1

• Allocate CPU memory for n integers
• Allocate GPU memory for n integers
• Initialize GPU memory to 0s
• Copy from GPU to CPU
• Print the values

...
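The five steps of Code Walkthrough 1 can be sketched as a single host function. This is a minimal sketch rather than the code from the slides, which the preview cuts off: the function name `walkthrough1`, its `print` flag, and its return convention are hypothetical, and the error handling is an addition. Only `cudaMalloc`, `cudaMemset`, `cudaMemcpy`, and `cudaFree` come from the text above.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not from the slides): runs the five walkthrough steps.
// Returns the number of nonzero values copied back, or -1 on any failure.
int walkthrough1(int n, bool print)
{
    size_t nbytes = (size_t)n * sizeof(int);

    // Step 1: allocate CPU memory for n integers
    int *h_a = (int *)malloc(nbytes);
    if (h_a == NULL) return -1;

    // Step 2: allocate GPU memory for n integers
    int *d_a = NULL;
    if (cudaMalloc((void **)&d_a, nbytes) != cudaSuccess) {
        free(h_a);
        return -1;
    }

    // Step 3: initialize GPU memory to 0s
    cudaError_t err = cudaMemset(d_a, 0, nbytes);

    // Step 4: copy from GPU to CPU (blocks until the copy is complete)
    if (err == cudaSuccess)
        err = cudaMemcpy(h_a, d_a, nbytes, cudaMemcpyDeviceToHost);

    // Step 5: print the values, counting any that are not zero
    int nonzero = -1;
    if (err == cudaSuccess) {
        nonzero = 0;
        for (int i = 0; i < n; i++) {
            if (print) printf("%d ", h_a[i]);
            if (h_a[i] != 0) nonzero++;
        }
        if (print) printf("\n");
    }

    cudaFree(d_a);
    free(h_a);
    return nonzero;
}
```

On a machine with a CUDA-capable GPU, calling `walkthrough1(8, true)` from `main` should print eight zeros and return 0. Note that `h_a` is never initialized on the CPU side, so the zeros can only have come from the device buffer.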
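The increment example above elides the host-side setup (the `…..` in `main`). One way to fill it in, using only the memory functions listed under Basic Memory Management, is sketched below. The wrapper name `increment_on_gpu` and its return convention are hypothetical, and the grid size uses integer round-up division instead of the slide's `ceil` on a float, an assumption chosen to avoid floating-point rounding for very large `N`.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Kernel as on the slide: one thread per element, guarded against overrun.
__global__ void increment_gpu(float *a, float b, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N)
        a[idx] = a[idx] + b;
}

// Hypothetical host wrapper (not on the slide): adds b to each of the N
// floats in h_a using the GPU. Expects N > 0 and blocksize > 0.
// Returns 0 on success, -1 on any CUDA error.
int increment_on_gpu(float *h_a, float b, int N, int blocksize)
{
    size_t nbytes = (size_t)N * sizeof(float);

    float *ad = NULL;  // device copy of the array ("ad" as on the slide)
    if (cudaMalloc((void **)&ad, nbytes) != cudaSuccess) return -1;

    // Move the input across the PCIe bus to the GPU
    cudaError_t err = cudaMemcpy(ad, h_a, nbytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        dim3 dimBlock(blocksize);
        // Integer round-up: enough blocks to cover all N elements
        dim3 dimGrid((N + blocksize - 1) / blocksize);
        increment_gpu<<<dimGrid, dimBlock>>>(ad, b, N);
        err = cudaGetLastError();  // reports an invalid launch configuration
    }

    // Blocking copy: does not start until the kernel above has completed
    if (err == cudaSuccess)
        err = cudaMemcpy(h_a, ad, nbytes, cudaMemcpyDeviceToHost);

    cudaFree(ad);
    return err == cudaSuccess ? 0 : -1;
}
```

With `N = 5` and `blocksize = 4`, the grid has two blocks and therefore eight threads; the `if (idx < N)` guard in the kernel is what keeps the three surplus threads from writing past the end of the array.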
This note was uploaded on 04/29/2010 for the course CSE 4190.410 taught by Professor Shinyeonggil during the Spring '09 term at Seoul National.
