Exercise 1

Consider a hypothetical 32-bit microprocessor having 32-bit instructions composed of two fields: the first byte contains the op code and the remainder the immediate operand or an operand address.

1) What is the maximum directly addressable memory capacity (in bytes)?
2) Discuss the impact on the system speed if the microprocessor bus has
   a) a 32-bit local address bus and a 16-bit local data bus, or
   b) a 16-bit local address bus and a 16-bit local data bus.
3) How many bits are needed for the program counter and the instruction register?

Solution

1) The op code takes 8 of the 32 instruction bits, leaving a 24-bit address field. The maximum directly addressable capacity is therefore $2^{24}$ bytes = 16 MBytes.

2) a) If the local address bus is 32 bits, the whole address can be transferred at once and decoded in memory. However, because the data bus is only 16 bits, it will require 2 cycles to fetch a 32-bit instruction or operand.

   b) The 16 bits of the address placed on the address bus cannot access the whole memory. Thus a more complex memory interface control is needed to latch the first part of the address and then the second part (because the microprocessor will send the address in two steps). For a 32-bit address, one may assume the first half will decode to access a "row" in memory, while the second half is sent later to access a "column" in memory. In addition to the two-step address operation, the microprocessor will need 2 cycles to fetch the 32-bit instruction/operand.

3) The program counter must be at least 24 bits. Typically, a 32-bit microprocessor will have a 32-bit external address bus and a 32-bit program counter, unless on-chip segment registers are used that may work with a smaller program counter. If the instruction register is to contain the whole instruction, it will have to be 32 bits long; if it will contain only the op code (in which case it is called the op code register), then it will have to be 8 bits long.
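The arithmetic behind parts 1) and 2) of the Exercise 1 solution can be checked with a small C sketch. The helper names `addressable_bytes` and `bus_transfers` are hypothetical, introduced only for illustration, and the sketch assumes a byte-addressable memory and one bus transfer per cycle.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes reachable with an address field of the given width.
 * Assumption: memory is byte-addressable. */
uint64_t addressable_bytes(unsigned address_bits) {
    assert(address_bits < 64);  /* keep the shift well-defined */
    return (uint64_t)1 << address_bits;
}

/* Number of bus transfers needed to move item_bits over a bus that is
 * bus_bits wide (ceiling division).
 * Assumption: each transfer takes one cycle. */
unsigned bus_transfers(unsigned item_bits, unsigned bus_bits) {
    assert(bus_bits > 0);
    return (item_bits + bus_bits - 1) / bus_bits;
}
```

Under these assumptions, `addressable_bytes(24)` gives 16,777,216 bytes (16 MBytes), and `bus_transfers(32, 16)` gives 2, matching the two cycles needed to fetch a 32-bit instruction over a 16-bit data bus.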
Exercise 2

You are going to enhance a computer, and there are two possible improvements: either make multiply instructions run four times faster than before, or make memory access instructions run two times faster than before. You repeatedly run a program that takes 100 seconds to execute. Of this time, 20% is used for multiplication, 50% for memory access instructions, and 30% for other tasks. What will the speedup be if you improve only multiplication? What will the speedup be if you improve only memory access? What will the speedup be if both improvements are made?

Exercise 3

How would you build a 4K × 8 memory using the available 1K × 1 memory chips? Each such memory chip has one line for data input, one line for data output, a chip selection (CS) line and a set of 10 address lines. Each chip can be represented by a box with these input/output lines. You need to show how to form the 4K × 8 memory by connecting as many such chips as necessary with minimum additional logic (multiplexers, decoders, etc.).

Exercise 4

Consider a 256K × 1 static RAM chip.
a) How many rows and columns are in the cell array?
b) How many address bits are input to the row decoder?
c) How many address bits are sent to the multiplexer?

Solution
a) 256K = $2^{18}$ and 1 = $2^{0}$, so the total number of bits is $2^{18} \times 2^{0} = 2^{18}$. Since $\sqrt{2^{18}} = 2^{9} = 512$, there are 512 rows and 512 columns, which makes the cell array square.
b) 9
c) 9

Exercise 5
Consider the following code:

    for (i=0; i<20; i++)
        for (j=1; j<=10; j++)
            a[i] = a[i]*j;

a) Give one example of spatial locality in the code.
b) Give one example of temporal locality in the code.

Solution
a) An example of spatial locality is the access of a[i+1] after accessing a[i].
b) An example of temporal locality is the write access of a[i] after the read access to a[i].

Exercise 6
Average memory access time (AMAT) is the average time to access memory considering both hits and misses and the frequency of different accesses; it is equal to the following:

$$\text{AMAT} = \text{Time\_for\_a\_hit} + \text{Miss\_rate} \times \text{Miss\_penalty}$$
AMAT is useful as a figure of merit for different cache systems. Find the AMAT for a processor with a 2 ns clock, a miss penalty of 20 clock cycles, a miss rate of 0.05 misses per instruction, and a cache access time (including hit detection) of 1 clock cycle.

Suppose we can improve the miss rate to 0.03 misses per reference by doubling the cache size. This causes the cache access time to increase to 1.2 clock cycles. Using the AMAT as a metric, determine if this is a good trade-off. ...
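One way to work through the numbers is the C sketch below, which applies the AMAT definition directly. The function names `amat_cycles` and `amat_ns` are hypothetical. The sketch assumes that the clock period stays at 2 ns and the miss penalty stays at 20 cycles after the cache is enlarged, and that the quoted miss rates can be used as misses per memory access; the exercise does not state any of these explicitly.

```c
#include <assert.h>

/* AMAT in clock cycles: hit time plus miss rate times miss penalty. */
double amat_cycles(double hit_cycles, double miss_rate,
                   double miss_penalty_cycles) {
    assert(miss_rate >= 0.0);
    return hit_cycles + miss_rate * miss_penalty_cycles;
}

/* AMAT in nanoseconds.
 * Assumption: the clock period (clock_ns) is unchanged by the cache
 * design, so cycle counts convert to time by a simple multiplication. */
double amat_ns(double hit_cycles, double miss_rate,
               double miss_penalty_cycles, double clock_ns) {
    assert(clock_ns > 0.0);
    return amat_cycles(hit_cycles, miss_rate, miss_penalty_cycles) * clock_ns;
}
```

With these assumptions, `amat_ns(1.0, 0.05, 20.0, 2.0)` evaluates to 4 ns (2 cycles) for the original cache, and `amat_ns(1.2, 0.03, 20.0, 2.0)` to about 3.6 ns (1.8 cycles) for the larger one, so by the AMAT metric alone the larger cache looks like a good trade-off. If the slower cache access forced a longer clock cycle for the whole processor, that conclusion would need to be revisited.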
This note was uploaded on 05/04/2010 for the course CS 333 taught by Professor Alarabi during the Spring '10 term at DeVry Cleveland D..