Lecture09a



0306-381 Applied Programming
• Fixed-Point (Q Number) Arithmetic
• Numerical Derivative

Q-Format Characteristics
• Data type to meet needs of application
• Range (QQ in N-bit word)
  – Signed: [-2^(N-1-Q), 2^(N-1-Q) - 2^(-Q)]
  – Unsigned: [0, 2^(N-Q) - 2^(-Q)]
• Resolution
  – Smallest possible value
  – 2^(-Q)
  – Uniform across entire range (unlike FP)

Q-Format Conversion: from Float
Convert from float to QQ
– Need to multiply floating-point value by 2^Q and store in an N-bit integer
– C code
    float NumFloat;
    QQ NumQQ;
    NumQQ = (QQ) (NumFloat * (float) (1 << Q));
– C macro
    #define FLOAT_TO_QQ(X) ((QQ) ((X) * (float) (1 << Q)))

Q-Format Conversion: to Float
Convert from QQ to float
– Need to divide N-bit integer value by 2^Q and store in a floating-point number, making sure Q-bit fractional result is kept
– C code
    float NumFloat;
    QQ NumQQ;
    NumFloat = (float) NumQQ / (float) (1 << Q);
– C macro
    #define QQ_TO_FLOAT(X) ((float) (X) / (float) (1 << Q))

Q-Format Addition
Add two QQ numbers: C = A + B
– Need to add two N-bit integers and store result in an N-bit integer
– C code
    QQ A, B, C;
    C = A + B;
– C macro
    #define QQ_ADD(A,B) ((A) + (B))

Q-Format Subtraction
Subtract two QQ numbers: C = A - B
– Need to subtract two N-bit integers and store result in an N-bit integer
– C code
    QQ A, B, C;
    C = A - B;
– C macro
    #define QQ_SUBTRACT(A,B) ((A) - (B))

Q-Format Multiplication
Multiply two QQ numbers: C = A × B
– Need to multiply two N-bit integers and store result in N-bit integer with correct Q bits
– C code
    QQ A, B, C;
    C = (A * B) >> Q;
– C macro
    #define QQ_MULTIPLY(A,B) (((A) * (B)) >> Q)
Problem: If QQ word size (N) is largest machine word
– Multiplication result overflows
Solution: Perform half-Q shift of each factor first
– C macro
    #define QQ_MULTIPLY(A,B) (((A) >> (Q/2)) * ((B) >> (Q - (Q/2))))
– Less accurate than first version (if usable)

Q-Format Division
Divide two QQ numbers: C = A ÷ B
– Need to divide two N-bit integers and store result in N-bit integer with correct Q bits
– C code
    QQ A, B, C;
    C = (A << Q) / B;
– C macro
    #define QQ_DIVIDE(A,B) (((A) << Q) / (B))
Problem: Dividend must be shifted left Q before division (the shifted value may overflow the N-bit word)
Solution: Typecast intermediate result to larger word
– C macro
    #define QQ_DIVIDE(A,B) ((QQ) (((long) (A) << Q) / (B)))

Fixed-Point Addition Example
N=16; Q=6
Add 64.125 (base 10) and -.75 (base 10)

Binary
      1000000.001
    +      -.11
    -------------
       111111.011

Decimal
      64.125
    +  -.75
    --------
      63.375

Q6 (N16)
      0001000000001000
    +(-0000000000110000)

      0001000000001000
    + 1111111111010000
    ------------------
      0000111111011000

Fixed-Point Multiplication Example
N=16; Q=6
Multiply 64.125 (base 10) by -.75 (base 10)

Binary
      1000000.001
    x      -.11
    -------------
     -10000.00001
    -100000.0001
    -------------
    -110000.00011

Decimal
       64.125
    x   -.75
    ---------
     -3.20625
    -44.8875
    ---------
    -48.09375

Q6 (N16)
      0001000000001000
    x(-0000000000110000)

      0001000000001000
    x 1111111111010000

[Rearranged to simplify hand calculation; not part of the algorithm.]
      1111111111010000
    x 0001000000001000

Sign-extend to 2N bits for operation.
      11111111111111111111111010000000
      11111111111111010000
      --------------------------------
      11111111111111001111111010000000

Drop N-Q bits on left (whole). Drop Q bits on right (fractional).
      1111001111111010 = -(0000110000000110)

...
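The Q-format operations and the N=16, Q=6 worked examples above can be checked with a short C sketch. It is an illustration under stated assumptions, not the course's reference code: the function names, the `int16_t`/`int32_t` typedefs, and the use of a wider intermediate word are choices made here. It also scales with `*` and `/` by 2^Q instead of `<<` and `>>`, because shifting negative signed values is not fully defined by the C standard.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: N = 16, Q = 6, matching the worked examples. */
#define Q 6
typedef int16_t QQ;       /* N-bit storage word                 */
typedef int32_t QQ_wide;  /* 2N-bit word for intermediate values */

/* float -> QQ: scale by 2^Q, then truncate toward zero. */
static inline QQ float_to_qq(float x)
{
    return (QQ)(x * (float)(1 << Q));
}

/* QQ -> float: divide by 2^Q in floating point so the fraction is kept. */
static inline float qq_to_float(QQ x)
{
    return (float)x / (float)(1 << Q);
}

/* Addition and subtraction need no rescaling (overflow is not checked). */
static inline QQ qq_add(QQ a, QQ b) { return (QQ)(a + b); }
static inline QQ qq_sub(QQ a, QQ b) { return (QQ)(a - b); }

/* Multiply in 2N bits, then divide by 2^Q to drop the extra Q fractional
 * bits. Division truncates toward zero, unlike an arithmetic right shift. */
static inline QQ qq_mul(QQ a, QQ b)
{
    QQ_wide product = (QQ_wide)a * (QQ_wide)b;
    return (QQ)(product / (1 << Q));
}

/* Pre-scale the dividend by 2^Q in 2N bits so the quotient keeps
 * Q fractional bits. */
static inline QQ qq_div(QQ a, QQ b)
{
    assert(b != 0);
    QQ_wide scaled = (QQ_wide)a * (1 << Q);
    return (QQ)(scaled / b);
}
```

With these definitions, `float_to_qq(64.125f)` gives 4104 (0001000000001000) and `float_to_qq(-0.75f)` gives -48 (1111111111010000). Adding them gives 4056, which converts back to 63.375, and multiplying them gives -3078 (1111001111111010), which converts back to -48.09375, in agreement with the hand calculations.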

This note was uploaded on 04/27/2010 for the course EECC 0306-381 taught by Professor Roymelton during the Spring '10 term at RIT.
