CudaPA1 - cs475 CUDA 1 code wim bohm cs csu vector add...

cs475: CUDA 1 code
Wim Bohm, CS, CSU

Vector add
[Figure: vectors A, B, and C in global memory; each thread block has its own shared memory]
Thread blocks access contiguous partitions of A, B, and C.
Threads access contiguous chunks in a partition.
Grid: 1D. Thread block: 1D.
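The partitioning described on this slide can be sketched on the CPU. The following plain-C block is only an illustration, not the course's kernel (which this preview does not show): the two outer loops stand in for the 1D grid and the 1D thread block, and the names `chunk_start` and `vector_add_emulated` are hypothetical. It assumes each block owns `block_width * values_per_thread` consecutive elements and each thread owns `values_per_thread` consecutive elements inside its block's partition.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first element owned by one thread (hypothetical helper).
   A block's partition starts at block * block_width * values_per_thread;
   a thread's chunk starts thread * values_per_thread elements into it. */
size_t chunk_start(size_t block, size_t thread,
                   size_t block_width, size_t values_per_thread)
{
    size_t block_start = block * block_width * values_per_thread;
    return block_start + thread * values_per_thread;
}

/* CPU emulation of the 1D grid / 1D block layout: every (block, thread)
   pair adds its own contiguous chunk of a and b into c. The vectors must
   hold grid_width * block_width * values_per_thread elements. */
void vector_add_emulated(const float *a, const float *b, float *c,
                         size_t grid_width, size_t block_width,
                         size_t values_per_thread)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t block = 0; block < grid_width; ++block) {
        for (size_t thread = 0; thread < block_width; ++thread) {
            size_t start = chunk_start(block, thread,
                                       block_width, values_per_thread);
            for (size_t i = start; i < start + values_per_thread; ++i)
                c[i] = a[i] + b[i];
        }
    }
}
```

In a real kernel the two outer loops disappear: each thread would compute its own start index from its block and thread indices and run only the inner loop.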
Host
1. Create host and device vectors.
2. memcpy input vectors to device.
3. Invoke kernel<<<gridDim, blockDim>>>(params).
4. Do timing.
5. memcpy result vectors back to host.
6. Check results.
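The final step, checking results, can be sketched in plain C as follows. The preview does not show how the course code verifies its output, so this is an assumed approach: recompute each sum on the host and compare it with the copied-back result within a tolerance. The function name `first_mismatch` and the `tol` parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Compare c[i] against a[i] + b[i] for all i in [0, n).
   Returns the index of the first element that differs by more than tol,
   or -1 if every element matches. A NaN in c counts as a mismatch. */
long first_mismatch(const float *a, const float *b, const float *c,
                    size_t n, float tol)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; ++i) {
        float diff = c[i] - (a[i] + b[i]);
        if (diff < 0.0f)
            diff = -diff;
        if (!(diff <= tol))   /* written this way so NaN fails the check */
            return (long)i;
    }
    return -1;
}
```

Returning the index rather than a plain pass/fail flag makes it easier to see which thread's chunk produced the bad value.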

Host: command-line interface

// Defines
#define GridWidth 60
#define BlockWidth 128

// Host code performs setup and calls the kernel.
int main(int argc, char** argv)
{
    int ValuesPerThread; // number of values per thread
    int N;               // total Vector size

    sscanf(argv[1], "%d", &ValuesPerThread);
    N = ValuesPerThread * GridWidth * BlockWidth;
    size_t size = N * sizeof(float); // number of bytes for a vector.

    dim3 dimGrid(GridWidth);   // grid dimensions
    dim3 dimBlock(BlockWidth); // thread block dimensions
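The slide reads `argv[1]` without checking that it exists or holds a valid number. A more defensive version of the same arithmetic might look like the sketch below. It reuses the slide's `GridWidth` and `BlockWidth` defines; the helpers `parse_values_per_thread` and `vector_bytes` are hypothetical names, not part of the course code.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define GridWidth 60
#define BlockWidth 128

/* Parse a positive values-per-thread count from a command-line string.
   Returns 0 and stores the value in *out on success, -1 otherwise.
   Values large enough to overflow N as an int are rejected. */
int parse_values_per_thread(const char *text, int *out)
{
    assert(out != NULL);
    if (text == NULL)
        return -1;
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;
    if (value <= 0 || value > INT_MAX / (GridWidth * BlockWidth))
        return -1;
    *out = (int)value;
    return 0;
}

/* Number of bytes in one vector: N = ValuesPerThread * GridWidth * BlockWidth
   floats, as on the slide. */
size_t vector_bytes(int values_per_thread)
{
    size_t n = (size_t)values_per_thread * GridWidth * BlockWidth;
    return n * sizeof(float);
}
```

With the slide's defines, one value per thread gives N = 60 * 128 = 7680 elements per vector. A caller would pass `argc > 1 ? argv[1] : NULL` and exit with a usage message on failure.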
Host: allocate

    // Allocate input vectors h_A and h_B in host memory
    h_A = (float*)malloc(size);
    if (h_A == 0) Cleanup(false);
    h_B = (float*)malloc(size);
    if (h_B == 0) Cleanup(false);
    h_C = (float*)malloc(size);
    if (h_C == 0) Cleanup(false);

    // Allocate vectors in device memory.
    cudaError_t error;
    error = cudaMalloc((void**)&d_A, size);
    if (error != cudaSuccess) Cleanup(false);
    error = cudaMalloc((void**)&d_B, size);
    if (error != cudaSuccess) Cleanup(false);
    error = cudaMalloc((void**)&d_C, size);
    if (error != cudaSuccess) Cleanup(false);

    // initialize host vectors h_A and h_B
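The slide ends at the initialization comment, and the preview does not show how `h_A` and `h_B` are filled. As an illustration only, the host-side allocation and one possible initialization could be written as below. The pattern `a[i] = i`, `b[i] = n - i` is an assumption, chosen because every sum then equals `n` and is easy to check. The `HostVectors` struct and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* The three host vectors, grouped so they can be released together. */
typedef struct {
    float *a, *b, *c;
    size_t n;
} HostVectors;

/* Release all three vectors; safe to call after a partial allocation. */
void free_host_vectors(HostVectors *v)
{
    assert(v != NULL);
    free(v->a);
    free(v->b);
    free(v->c);
    v->a = v->b = v->c = NULL;
    v->n = 0;
}

/* Allocate three vectors of n floats and fill the two inputs.
   Returns 0 on success; on failure frees whatever was allocated
   and returns -1. */
int alloc_host_vectors(HostVectors *v, size_t n)
{
    assert(v != NULL);
    v->a = v->b = v->c = NULL;
    v->n = 0;
    if (n == 0 || n > SIZE_MAX / sizeof(float))
        return -1;
    v->a = malloc(n * sizeof(float));
    v->b = malloc(n * sizeof(float));
    v->c = malloc(n * sizeof(float));
    if (v->a == NULL || v->b == NULL || v->c == NULL) {
        free_host_vectors(v);
        return -1;
    }
    v->n = n;
    for (size_t i = 0; i < n; ++i) {
        v->a[i] = (float)i;        /* assumed test pattern */
        v->b[i] = (float)(n - i);  /* so that a[i] + b[i] == n */
    }
    return 0;
}
```

Routing every failure through one release function plays the same role as the slide's `Cleanup(false)` calls: no early exit leaks a partially allocated vector.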

Host: copy host data to device, invoke kernel on device, copy device data to host

    // Copy host vectors h_A and h_B to device vectors d_A and d_B
    error = cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice);
    if (error != cudaSuccess) Cleanup(false);
    error = cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice);
    if (error != cudaSuccess) Cleanup(false);

    // Invoke kernel