MidtermSpring2011 - EEL 4930/5934 Reconfigurable Computing...

EEL 4930/5934 Reconfigurable Computing
Midterm Exam – Spring Semester 2011
Name ___________________________________

1. Scalable Systolic Array paper

The figure on the right is reproduced from the Scalable Systolic Array paper. It shows a scoring matrix for the Needleman-Wunsch algorithm in which the query sequence of length “m” on top (ATA … A in this example) is matched against a sequence database of length “n” on the left (AGG…C in this example) based on a dynamic programming method. Recall that, in hardware, a systolic array architecture was used in which one processing element (PE) is used to process one column of the scoring matrix, using a wavefront approach.

(a) What is “dynamic programming”? (2 pts)

(b) How many PEs are used? ________________ (2 pts)

(c) At the first clock cycle, which cell(s) of the array is (are) processed? ________________
    At the second clock cycle, which cell(s) of the array is (are) processed? ________________
    At the third clock cycle, which cell(s) of the array is (are) processed? ________________
    At which clock cycle will all PEs be processing? _________________
    (4 pts for (c))

3. DIMEtalk questions:

(d) In terms of the C code, what is the difference between a DIMEtalk BRAM and a BRAM produced by hand-written VHDL (or CoreGen)? (3 pts)
    DIMEtalk BRAM:
    VHDL BRAM:

(e) Briefly compare (i.e., similarity and difference) a DIMEtalk BRAM and a memory map. (3 pts)

14 pts.
(e.g., AG-9)
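As an illustration of the wavefront idea in question 1, the following is a minimal sketch of one possible schedule, not the schedule from the paper. It assumes that PE j (0-based) owns column j, that each PE starts one clock cycle after its left neighbour, and that each PE moves down one row per cycle. The function names `wavefront_row` and `active_pes` are hypothetical.

```c
#include <assert.h>

/* Assumption: PE `pe` (0-based) owns column `pe`, starts one cycle after
   PE `pe - 1`, and advances one row per clock cycle. `cycle` is 1-based.
   Returns the 0-based row processed by that PE, or -1 if the PE is idle. */
int wavefront_row(int cycle, int pe, int n_rows) {
    assert(cycle >= 1 && pe >= 0 && n_rows > 0);
    int row = cycle - 1 - pe;
    return (row >= 0 && row < n_rows) ? row : -1;
}

/* Counts how many of the `n_pes` PEs are busy at the given clock cycle. */
int active_pes(int cycle, int n_pes, int n_rows) {
    int count = 0;
    for (int pe = 0; pe < n_pes; pe++) {
        if (wavefront_row(cycle, pe, n_rows) >= 0) {
            count++;
        }
    }
    return count;
}
```

Under these assumptions, the cells processed in a given cycle lie on one anti-diagonal of the scoring matrix, and the number of busy PEs grows by one per cycle until every column has started.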
2. Smart Buffer

Design a smart buffer to implement the following algorithm (pseudo code). The datapath will unroll 4 loops at a time, thus requiring a(i), a(i-1), …, a(i-7) for every clock cycle to generate y(i), y(i+1), …, y(i+3).

    for (i=4; i < MAX; i++) {
        y[i] = (a[i] + a[i-1]) + (a[i-2] + a[i-3] + a[i-4]);
    }

Assumptions:
Input memory bandwidth is 32 bits.
Data items are 8 bits. However, bandwidth into the datapath needs to be 64 48 bits (enough for a(i) … a(i-7)).
You don’t have to worry about the output memory bandwidth.

[Figure: Input BRAM (addr 0 to 6, etc.) feeds the Smart Buffer over a 32-bit path; the Smart Buffer feeds the Datapath over a path labeled 48.]

(a) Fill in the BRAM appropriately with a(0), a(1), … , a(19).

(b) Specify the contents of the smart buffer after each of the following clock cycles.
    after clock cycle 1
    after clock cycle 2
    after clock cycle 3
    after clock cycle 4

16 pts.
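To make the data reuse in question 2 concrete, here is a minimal software sketch, not a hardware design and not the expected exam answer. It assumes an 8-item window holding a(i-4) … a(i+3) (a different labeling from the a(i) … a(i-7) used in the question), refilled with one 32-bit word (four 8-bit items) per step. The names `reference`, `windowed`, `WINDOW` and `UNROLL` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WINDOW 8  /* assumed window: a[i-4] .. a[i+3] */
#define UNROLL 4  /* outputs produced per step */

/* Reference: the loop from the question, one output per iteration. */
void reference(const uint8_t *a, uint16_t *y, int max) {
    for (int i = 4; i < max; i++) {
        y[i] = (a[i] + a[i-1]) + (a[i-2] + a[i-3] + a[i-4]);
    }
}

/* Windowed: each step keeps the 4 newest items, loads 4 new ones
   (one 32-bit word), and produces 4 outputs from the 8-item window. */
void windowed(const uint8_t *a, uint16_t *y, int max) {
    assert(max >= 4 && (max - 4) % UNROLL == 0);
    uint8_t w[WINDOW] = {0};
    memcpy(&w[4], &a[0], 4);            /* first word: a[0] .. a[3] */
    for (int base = 4; base < max; base += UNROLL) {
        memcpy(&w[0], &w[4], 4);        /* reuse the 4 newest items */
        memcpy(&w[4], &a[base], 4);     /* fetch one new 32-bit word */
        for (int u = 0; u < UNROLL; u++) {
            /* w[u+4] is a[base+u]; w[u] is a[base+u-4] */
            y[base + u] = w[u] + w[u+1] + w[u+2] + w[u+3] + w[u+4];
        }
    }
}
```

Both functions should produce the same y values for i ≥ 4; the windowed version reads each input item from memory only once, which is the point of a smart buffer.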
3. Systolic Architecture

(a) Given the following algorithm in pseudo-code, draw one iteration of the datapath that is fully pipelined. (8 pts)

    for (i=0; i < 10000; i++) {
        if (a[i] < a[i+1])
            z[i] = avg(a[i], a[i+1], a[i+2], a[i+3]);
        else
            z[i] = (a[i] + a[i+1]) * (a[i+2] + a[i+3]);
    }

(b) Assume the input data items are 8 bits and input memory bandwidth is 64 bits; output data items are 16 bits and output memory bandwidth is 48 bits; all operators (+, -, /) have the same latency. What is the maximum amount of loop unrolling? (For credit, please show work.)
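The bandwidth reasoning in part (b) can be sketched as a small calculation. This is one possible reading, not the official solution: it assumes that k unrolled iterations need k + `overlap` distinct input items per cycle (here overlap = 3, because each iteration reads a[i] … a[i+3]) and produce k output items per cycle. The function name `max_unroll` and its parameters are hypothetical.

```c
#include <assert.h>

/* Largest unroll factor k such that, per clock cycle,
   (k + overlap) input items fit in the input bandwidth and
   k output items fit in the output bandwidth. All sizes in bits. */
int max_unroll(int in_bw, int in_item, int out_bw, int out_item, int overlap) {
    assert(in_item > 0 && out_item > 0 && overlap >= 0);
    int by_input = in_bw / in_item - overlap;   /* input-side limit */
    int by_output = out_bw / out_item;          /* output-side limit */
    int k = by_input < by_output ? by_input : by_output;
    return k > 0 ? k : 0;
}
```

With the numbers from the question, `max_unroll(64, 8, 48, 16, 3)` gives an input-side limit of 5 and an output-side limit of 3, so the result is 3. If a buffer reused earlier inputs (overlap = 0), the input-side limit would rise to 8, but the output side would still cap the result at 3.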

This note was uploaded on 03/27/2012 for the course EEL 4930 taught by Professor Staff during the Spring '08 term at University of Florida.
