Goal - Goal: Design a program that computes square matrix...

Goal: Design a program that computes square matrix multiplication on the GPU using CUDA. In particular, your implementation should obey the following requirements:

1. The program must be general enough to handle matrix sizes beyond the GPU capacity.
2. The GPU capacity should not be hardcoded, but should be queried during execution.
3. The kernel implementation should be such that the execution configuration (number of blocks and threads/block) affects the performance but not the results of the kernel invocation.

NOTE: In this lab, you will not use SHARED MEMORY!

The program will be tested on the workstations and the cuda1 server using the following matrix sizes:

Hints:

1. Have a look at the CUDA library reference website at: http://developer.download.nvidia.com/compute/cuda/3_0/toolkit/docs/online/
   You will use the CUDA Runtime API: http://developer.download.nvidia.com/compute/cuda/3_0/toolkit/docs/online/group__CUDART.html
   In particular, the “Device Management” module contains functions that allow you to get
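As an illustration only, the three requirements could be approached along the lines of the sketch below. It is not the lab's reference solution. It assumes row-major float matrices, queries memory with cudaGetDeviceProperties and cudaMemGetInfo, splits the product into tiles so that only a row band of A, a column panel of B, and one tile of C are on the device at a time, and uses a grid-stride loop so that any number of blocks and threads gives the same result. The names CHECK, matmulTile, chooseTile, and matmul are hypothetical, and the 16 KiB budget in main is an artificial clamp chosen purely to force tiling on a small test matrix. No shared memory is used.

```cuda
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CHECK(call) do { cudaError_t e_ = (call); if (e_ != cudaSuccess) { \
    fprintf(stderr, "CUDA error at line %d: %s\n", __LINE__, cudaGetErrorString(e_)); \
    exit(1); } } while (0)

// C tile (rows x cols) = A band (rows x n) * B panel (n x cols), all row-major.
// Grid-stride loop: every output element is computed for any launch configuration.
__global__ void matmulTile(const float* a, const float* b, float* c,
                           int rows, int cols, int n) {
    size_t total = (size_t)rows * cols;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t idx = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
         idx < total; idx += stride) {
        int r = (int)(idx / cols), col = (int)(idx % cols);
        float sum = 0.0f;
        for (int k = 0; k < n; ++k)
            sum += a[(size_t)r * n + k] * b[(size_t)k * cols + col];
        c[idx] = sum;
    }
}

// Largest tile edge t (found by halving) such that the A band (t x n),
// the B panel (n x t), and the C tile (t x t) fit within budgetBytes.
int chooseTile(int n, size_t budgetBytes) {
    int t = n;
    while (t > 1 &&
           ((size_t)2 * t * n + (size_t)t * t) * sizeof(float) > budgetBytes)
        t = (t + 1) / 2;
    return t;
}

// Host driver: streams tiles through the device so n*n may exceed GPU memory.
void matmul(const float* hA, const float* hB, float* hC, int n,
            size_t budgetBytes, int blocks, int threads) {
    int t = chooseTile(n, budgetBytes);
    float *dA, *dB, *dC;
    CHECK(cudaMalloc((void**)&dA, (size_t)t * n * sizeof(float)));
    CHECK(cudaMalloc((void**)&dB, (size_t)n * t * sizeof(float)));
    CHECK(cudaMalloc((void**)&dC, (size_t)t * t * sizeof(float)));
    for (int j0 = 0; j0 < n; j0 += t) {
        int cols = std::min(t, n - j0);
        // Column panel of B: n rows of `cols` floats, packed densely on the device.
        CHECK(cudaMemcpy2D(dB, cols * sizeof(float), hB + j0, n * sizeof(float),
                           cols * sizeof(float), n, cudaMemcpyHostToDevice));
        for (int i0 = 0; i0 < n; i0 += t) {
            int rows = std::min(t, n - i0);
            CHECK(cudaMemcpy(dA, hA + (size_t)i0 * n,
                             (size_t)rows * n * sizeof(float), cudaMemcpyHostToDevice));
            matmulTile<<<blocks, threads>>>(dA, dB, dC, rows, cols, n);
            CHECK(cudaGetLastError());
            // Scatter the finished tile back into its place in the host matrix.
            CHECK(cudaMemcpy2D(hC + (size_t)i0 * n + j0, n * sizeof(float),
                               dC, cols * sizeof(float),
                               cols * sizeof(float), rows, cudaMemcpyDeviceToHost));
        }
    }
    CHECK(cudaFree(dA)); CHECK(cudaFree(dB)); CHECK(cudaFree(dC));
}

int main() {
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    size_t freeB = 0, totalB = 0;
    CHECK(cudaMemGetInfo(&freeB, &totalB));
    printf("%s: %zu bytes total, %zu bytes free\n",
           prop.name, (size_t)prop.totalGlobalMem, freeB);
    // A real run would budget a fraction of freeB; the 16 KiB clamp is an
    // assumption of this demo so that a 70 x 70 matrix already needs tiling.
    size_t budget = std::min(freeB / 2, (size_t)16384);

    const int n = 70;
    std::vector<float> A(n * n), B(n * n), C1(n * n), C2(n * n);
    for (int i = 0; i < n * n; ++i) { A[i] = (i % 7) * 0.5f; B[i] = (i % 5) - 2.0f; }
    matmul(A.data(), B.data(), C1.data(), n, budget, 4, 64);  // one configuration
    matmul(A.data(), B.data(), C2.data(), n, budget, 3, 33);  // a different one

    float maxErr = 0.0f;
    bool same = true;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float ref = 0.0f;  // CPU reference for this element
            for (int k = 0; k < n; ++k) ref += A[i * n + k] * B[k * n + j];
            maxErr = std::max(maxErr, std::fabs(ref - C1[i * n + j]));
            same = same && (C1[i * n + j] == C2[i * n + j]);
        }
    printf("max error vs CPU: %g, configurations agree: %s\n",
           maxErr, same ? "yes" : "no");
    return (same && maxErr < 1e-3f) ? 0 : 1;
}
```

Assuming the file is saved as matmul.cu (a hypothetical name), it can be built with `nvcc matmul.cu -o matmul`. Because each output element is accumulated by a single thread in a fixed order of k, changing the launch configuration changes only how the work is distributed, which is the property requirement 3 asks for. The sketch also assumes that at least one row of A and one column of B fit in the budget; if not, cudaMalloc fails and CHECK aborts.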