Compensation Code
• But if A < B it will still be done, and will be wrong.
• The compiler could insert the following code at 10
• On the TRACE machine, each functional unit is split into an integer ALU and a
floating-point ALU.
• Each F
History of SIMD systems
SIMD Instruction Execution
• Each processing element performs the same operation, but each may be active or inactive.
• If a processing element is inactive, it ignores the instruction. In this way the instruction may be
applied to certain da
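The active/inactive behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the instruction set of any particular SIMD machine: the `PE` class, the `broadcast` helper, and the sample data are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class PE:
    """One processing element: a single data value and an activity flag."""
    value: int
    active: bool = True


def broadcast(pes, op):
    """Issue the same operation to every PE; inactive PEs ignore it."""
    for pe in pes:
        if pe.active:
            pe.value = op(pe.value)


pes = [PE(v) for v in [3, -1, 4, -5]]

# Leave only the PEs holding negative values active.
for pe in pes:
    pe.active = pe.value < 0

# One instruction (negate) is issued to all PEs, but only active ones obey it.
broadcast(pes, lambda x: -x)
result = [pe.value for pe in pes]
```

After the broadcast, the two PEs that held negative values have been negated and the other two are unchanged, which is how a single instruction stream can implement a data-dependent operation.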
• In a fully configured TRACE machine, 4 memory references may be started in each
beat, to 4 independently generated addresses.
• The following rules must be followed:
  - At most one reference may be initiated on any one controller
  - No two refe
• Each PEM contains 2048 64-bit words of data. PEM_i can only be addressed from
PE_i. Thus a PE can only change data in its own PEM. Data can be passed from PE
to PE via the routing network.
• The control unit bus allows instructions to be fet
• The TRACE compiler must not only generate code for the VLIW machine but also
schedule the hardware resources statically at compile time.
This is not always possible!
• The compiler must schedule instructions so that as many ALUs are used as possib
• In a highly parallel program, each instruction will be packed with useful operations.
• In a program that does not have sufficient concurrency, there will be many no-ops in
the fields of the instructions.
• Also, even highly parall
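The effect of available concurrency on instruction packing can be illustrated with a small sketch. It is not the TRACE compiler's scheduling algorithm: it assumes a simple greedy packer, unit latency for every operation, and interchangeable slots, and the names `pack`, `ops`, `deps`, and `width` are hypothetical.

```python
def pack(ops, deps, width):
    """Greedily pack operations into instruction words of `width` slots.

    ops:  operation names in program order
    deps: maps an operation to the set of operations it must follow
    An operation is issued only after all of its predecessors have been
    issued in earlier words (unit latency is assumed).
    """
    done, words = set(), []
    remaining = list(ops)
    while remaining:
        ready = [op for op in remaining if deps.get(op, set()) <= done][:width]
        if not ready:
            raise ValueError("cyclic dependencies")
        # Unused slots in the word are filled with no-ops.
        words.append(ready + ["nop"] * (width - len(ready)))
        done.update(ready)
        remaining = [op for op in remaining if op not in ready]
    return words


# Four independent operations fill one 4-wide instruction word.
parallel = pack(["a", "b", "c", "d"], {}, 4)

# A chain a -> b -> c -> d needs four words, each mostly no-ops.
serial = pack(["a", "b", "c", "d"], {"b": {"a"}, "c": {"b"}, "d": {"c"}}, 4)
serial_nops = sum(word.count("nop") for word in serial)
```

With the same four operations, the independent case uses one full word, while the dependent chain spreads over four words and wastes twelve of the sixteen slots.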
• Single stream of instructions
  (one program counter and one control unit),
• Very long instruction format,
  enough control bits to directly and independently control the action of every
  functional unit in every cycle
• Large numbers of data
SIMD Recursive Doubling
1. Enable all PEs (turn on all PEs)
2. All PEs load RGA from location B
3. i = 0
4. All PEs load RGR from their RGA
5. All PEs ROUTE their RGR contents 2^i to the right.
6. j = 2^i - 1
7. Disable all PEs number
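The steps above can be simulated with a short Python sketch. The list is cut off at step 7, so everything from that point on is an assumption based on the usual form of recursive doubling: PEs 0 to j are taken to be the ones disabled, the remaining PEs add the routed value into RGA, and the loop repeats with i incremented until 2^i reaches the number of PEs. The function name and the end-around routing are likewise choices made for this example.

```python
def recursive_doubling_sum(values):
    """Running sums over a simulated PE array using recursive doubling."""
    n = len(values)
    rga = list(values)        # steps 1-2: all PEs enabled, RGA loaded
    i = 0                     # step 3
    while (1 << i) < n:
        shift = 1 << i
        rgr = list(rga)       # step 4: every PE copies RGA into RGR
        # Step 5: route RGR 2^i places to the right, so PE k receives
        # the value from PE k - 2^i (end-around routing assumed).
        routed = [rgr[(k - shift) % n] for k in range(n)]
        j = shift - 1         # step 6
        # Step 7 onward (assumed): PEs 0..j are disabled and keep their
        # RGA; every other PE adds the routed value into its RGA.
        for k in range(n):
            if k > j:
                rga[k] += routed[k]
        i += 1
    return rga


sums = recursive_doubling_sum([1, 2, 3, 4, 5, 6, 7, 8])
```

Under these assumptions, each PE k ends up holding the sum of elements 0 through k after about log2(N) routing steps, rather than the N - 1 steps a sequential sum would need.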
• The TRACE machine uses compare-predict operations rather than test operators.
• The results can be written to general registers, which can avoid some of the
branches in complex IF chains.
CEQ R1,R2, BB(R2)
Write BB with 1