Goal

Course: THERMO n/a , Spring 2011
School: Abraham Baldwin...
Goal: Design a program that computes square matrix multiplication on the GPU using CUDA. In particular, your implementation should obey the following requirements:

1. The program must be general enough to handle matrix sizes beyond the GPU capacity.
2. The GPU capacity should not be hardcoded, but should be queried during execution.
3. The kernel implementation should be such that the execution configuration (number of blocks and threads/block) affects the performance but not the results of the kernel invocation.

NOTE: In this lab, you will not use SHARED MEMORY!

The program will be tested on the workstations and the cuda1 server using the following matrix sizes:

Hints:

1. Have a look at the CUDA library reference website at:
   http://developer.download.nvidia.com/compute/cuda/3_0/toolkit/docs/online/
   You will use the CUDA Runtime API:
   http://developer.download.nvidia.com/compute/cuda/3_0/toolkit/docs/online/group__CUDART.html
   In particular, the Device Management module contains functions that allow you to get information about the hardware resources of the GPU in the system.
2. Requirement #3 implies that the sizes of the matrices must be passed as parameters to your matrix multiplication kernel. In other words, C_{m,k} = A_{m,n} * B_{n,k} (A is m x n, B is n x k, C is m x k) can be implemented by a kernel with the following signature:
   void matmul(float *a, float *b, float *c, unsigned m, unsigned n, unsigned k);
3. Use a C++ compiler (g++) to compile your code, and use the new operator instead of malloc() to dynamically allocate memory (malloc may fail on very large memory allocations).
4. Try to design the data structures so as to minimize the number of memory transactions between host and device (CPU and GPU).
5. DO NOT START TO CODE IMMEDIATELY. Spend some time designing your solution.
   i. How do you handle matrix sizes exceeding the GPU capacity?
   ii. How do you represent the matrices?
   iii. Which memory transfers are involved for matrices within and beyond the GPU capacity?
   iv. When designing the kernel, which work is performed by each thread? How do you correlate each thread with the data it processes?
6. Kernels can use both shared memory and registers. Your kernel should not use shared memory. To see how many registers are used by each thread, you can have a look at the GPU assembly file. The assembly file (called a PTX file) can be generated by calling:
   nvcc -ptx myfile.cu
   This will generate myfile.ptx. The PTX file will show you the assembly representation of your kernel. In particular, it will show you the code executed by each thread (as you know, all threads execute the same code!). The PTX file will include an area where the registers are declared. For example:
   .reg .u16 %rh<4>;   // 16-bit registers
   .reg .u32 %r<9>;    // 32-bit registers
   .reg .u64 %rd<10>;  // 64-bit registers
   .reg .pred %p<3>;   // registers used for predication
   If you know how many registers are used by each thread and how many registers are available on the GPU, you can easily determine the maximum number of threads that you can run (for your particular kernel).
7. Use the occupancy calculator to calculate the optimal point for configuring the kernel. The occupancy calculator can be downloaded from:
   http://developer.download.nvidia.com/compute/cuda/CUDA_Occupancy_calculator.xls

Questions:

1. Run your kernel with different numbers of blocks and of threads/block and see how this affects the performance. Report GPU occupancy and execution time, and discuss the results. Consider execution configurations that are trivially bad, and compare them with good execution configurations. How do you know in advance that some configurations are bad? And what is a good execution configuration?
2. What is the largest execution configuration that you can use without exceeding the resources available on the GPU in use?
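
Requirement 3 and design question 5.iv above both come down to how each thread is mapped to elements of C. The sketch below is a CPU-only C++ emulation of one possible mapping (often called a grid-stride loop). It is not a CUDA kernel and not necessarily the intended solution. The name emulate_matmul and its blocks/threads parameters are made up for illustration, and row-major storage is an assumption. The point is only that every element of C is computed exactly once for any positive blocks and threads, so the configuration can change the speed but not the result.

```cpp
#include <cassert>
#include <cstddef>

// CPU emulation of a grid-stride mapping from threads to elements of C.
// Assumptions (not from the handout): row-major storage, a one-dimensional
// grid, and the illustrative names emulate_matmul, blocks and threads.
void emulate_matmul(const float *a, const float *b, float *c,
                    unsigned m, unsigned n, unsigned k,
                    unsigned blocks, unsigned threads)
{
    // A zero-sized configuration would give stride 0 and never terminate.
    assert(blocks > 0 && threads > 0);

    const std::size_t total  = static_cast<std::size_t>(m) * k;            // elements of C
    const std::size_t stride = static_cast<std::size_t>(blocks) * threads; // threads in the grid

    // These two loops stand in for the hardware launching blocks * threads threads.
    for (unsigned block = 0; block < blocks; ++block) {
        for (unsigned thread = 0; thread < threads; ++thread) {
            // First element owned by this (block, thread) pair.
            const std::size_t first =
                static_cast<std::size_t>(block) * threads + thread;

            // Each thread walks C in steps of the grid size, so all elements
            // are covered even when the grid is smaller than the matrix.
            for (std::size_t idx = first; idx < total; idx += stride) {
                const std::size_t row = idx / k;
                const std::size_t col = idx % k;
                float sum = 0.0f;
                for (unsigned i = 0; i < n; ++i)
                    sum += a[row * n + i] * b[static_cast<std::size_t>(i) * k + col];
                c[idx] = sum;
            }
        }
    }
}
```

In an actual kernel the two outer loops would disappear: each thread would compute its own starting index as blockIdx.x * blockDim.x + threadIdx.x and its stride as gridDim.x * blockDim.x. A grid with one block of one thread is then a trivially bad but still correct configuration, which is the kind of comparison question 1 asks for.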