CSE 721 Programming Assignment 1
Due 2/28/2011, 3:00PM

For this assignment, you are to create versions of the following two codes using SSE intrinsics. Submit via Carmen; be sure to include source code listings and a report on performance. A single file must be uploaded to Carmen for the assignment, containing the written report as well as the code listings and the output from execution of the programs.

1. (35 points) The following code implements a hardwired matrix-matrix product for 4x4 single-precision floating-point matrices. In some computational domains, such as QCD (Quantum Chromodynamics), a large number of such products of small fixed-size matrices is required. Math library matrix multiplication routines are generally not optimized for such small matrix sizes. Hence, specialized SSE-based codes are used.

    void mul4x4(float *A, float *B, float *C)
    {
      int i, j, k;
      for (i = 0; i < 4; i++)
      {
        for (j = 0; j < 4; j++)
          C[4*i+j] = 0.0;
        for (k = 0; k < 4; k++)
          for (j = 0; j < 4; j++)
            C[4*i+j] += A[4*i+k]*B[4*k+j];
    ...
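As an illustration of how the inner j loop of problem 1 maps onto SSE, the following is a minimal sketch, not a reference solution. It assumes row-major storage (as implied by the 4*i+j indexing) and makes no alignment assumption, so it uses unaligned loads and stores. The function name mul4x4_sse is hypothetical and does not appear in the assignment. Each row of C is formed as a sum of the four rows of B, each scaled by one broadcast element of A.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, ... */

/* Hypothetical SSE counterpart of mul4x4; the name is an assumption. */
void mul4x4_sse(const float *A, const float *B, float *C)
{
    int i;

    /* Load the four rows of B once. Unaligned loads are used because
       the assignment text does not state that the arrays are aligned. */
    __m128 b0 = _mm_loadu_ps(B + 0);
    __m128 b1 = _mm_loadu_ps(B + 4);
    __m128 b2 = _mm_loadu_ps(B + 8);
    __m128 b3 = _mm_loadu_ps(B + 12);

    for (i = 0; i < 4; i++) {
        /* Broadcast A[i][k] to all four lanes and scale row k of B.
           The four lanes correspond to j = 0..3 of the scalar loop. */
        __m128 row = _mm_mul_ps(_mm_set1_ps(A[4*i + 0]), b0);
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(A[4*i + 1]), b1));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(A[4*i + 2]), b2));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(A[4*i + 3]), b3));

        /* Store row i of C. */
        _mm_storeu_ps(C + 4*i, row);
    }
}
```

The additions are performed in the same k order as the scalar loop, so results should match the scalar version for the same inputs when the compiler does not contract or reorder the scalar arithmetic. If the arrays are known to be 16-byte aligned, _mm_load_ps and _mm_store_ps could replace the unaligned variants.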
 Winter '11
 Saday
