A pipeline register between the CAM and the RAM


the host computer at the other end of the RS232 serial line runs the ARM development tools. This system demonstrates that using an asynchronous processor need be no more difficult than using a clocked processor, provided that the memory interface has been carefully thought out.

Figure 14.9 AMULET2e test card organization.

14.5 AMULET3

AMULET3 is being developed to establish the commercial viability of asynchronous design. Like its predecessors, AMULET3 is a full-functionality ARM-compatible microprocessor with support for interrupts and memory faults. AMULET1 and AMULET2 implemented the ARM6 architecture (ARM architecture version 3G). AMULET3 supports ARM architecture version 4T, including the 16-bit Thumb instruction set.

Performance Objective

The objective of the AMULET3 project was to produce an asynchronous implementation of ARM architecture v4T which is competitive in terms of power-efficiency and performance with the ARM9TDMI. This implies a performance target of over 100 MIPS (measured using Dhrystone 2.1) on a 0.35 µm process, compared to the 40 MIPS delivered by AMULET2e on a 0.5 µm process.

Increasing the performance by more than a factor of two requires a radical change to the core organization. As with a clocked processor, the basic approach is based on a combination of increasing the cycle rate of the processor pipeline and decreasing the average number of cycles per instruction. Here, however, the 'cycles' are not defined by an external clock and are not of fixed duration.

The organization of AMULET3 is illustrated in Figure 14.10 on page 388. The six principal pipeline stages are as follows: the instruction prefetch unit, which includes a branch target buffer; the instruction decode, register read and forwarding stage; the execute stage, which includes the shifter, multiplier and ALU; the data memory interface; the reorder buffer; the register result write-back stage.
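The performance argument above combines two factors: a higher pipeline cycle rate and fewer cycles per instruction. A minimal Python sketch of that arithmetic follows. Only the 40 MIPS and 100 MIPS figures come from the text; the function names and the example split between the two factors are hypothetical, chosen purely for illustration. Because an asynchronous pipeline has no fixed cycle time, the rate used here should be read as an average.

```python
def mips(avg_cycles_per_second: float, avg_cycles_per_instruction: float) -> float:
    """Throughput in millions of instructions per second.

    In an asynchronous pipeline the cycle time varies, so both
    arguments are averages rather than fixed clock-derived values.
    """
    return avg_cycles_per_second / avg_cycles_per_instruction / 1e6


def required_speedup(target_mips: float, baseline_mips: float) -> float:
    """Overall throughput ratio needed to move from baseline to target."""
    return target_mips / baseline_mips


# Figures quoted in the text (the target is stated as "over 100 MIPS").
AMULET2E_MIPS = 40.0
AMULET3_TARGET_MIPS = 100.0

speedup = required_speedup(AMULET3_TARGET_MIPS, AMULET2E_MIPS)

# Hypothetical split, NOT from the source: if the average cycle rate
# doubled, the remaining gain would have to come from a lower CPI.
assumed_rate_gain = 2.0
needed_cpi_gain = speedup / assumed_rate_gain

print(f"required overall speedup: {speedup:.2f}x")
print(f"with a {assumed_rate_gain:.1f}x rate gain, CPI must improve by {needed_cpi_gain:.2f}x")
```

The two gains multiply, which is why the text treats them as complementary: neither a faster cycle nor a lower cycle count alone has to deliver the whole improvement.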
The core employs a Harvard architecture (separate instructi...

This document was uploaded on 10/30/2011 for the course CSE 378 380 at SUNY Buffalo.
