cpe631vector

# cpe631vector - CPE 631 Vector Processing (Appendix F in COA4...

CPE 631: Vector Processing (Appendix F in COA4)

Electrical and Computer Engineering, University of Alabama in Huntsville

Aleksandar Milenković, [email protected], http://www.ece.uah.edu/~milenka

## Outline

- Properties of Vector Processing
- Components of a Vector Processor
- Vector Execution Time
- Real-world Problems: Vector Length and Stride
- Vector Optimizations: Chaining, Conditional Execution, Sparse Matrices

## Why Vector Processors?

- Instruction-level parallelism (Ch 3&4)
  - Deeper pipelines and wider superscalar machines extract more parallelism, but they need more register file ports, more registers, and more hazard interlock logic
  - In dynamically scheduled machines, the instruction window, reorder buffer, and rename register files must grow to have enough capacity to keep relevant information about in-flight instructions
  - It is difficult to build machines that support a large number of in-flight instructions => this limits the issue width and pipeline depth => this limits the amount of parallelism you can extract
- Commercial versions of vector processors existed long before ILP machines

## Vector Processing Definitions

- Vector: a set of scalar data items, all of the same type, stored in memory
- Vector processor: an ensemble of hardware resources, including vector registers, functional pipelines, processing elements, and register counters, for performing vector operations
- Vector processing occurs when arithmetic or logical operations are applied to vectors

Figure: scalar `add r3, r1, r2` (1 operation: r3 = r1 + r2) versus vector `add.vv v3, v1, v2` (N operations: v3 = v1 + v2 element by element, where N is the vector length).

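
The contrast between the scalar and the vector instruction can be sketched in Python. This is only an illustrative model, not a description of any real instruction set: the function names `add_scalar` and `add_vv` are made up here, and Python lists stand in for vector registers.

```python
def add_scalar(r1, r2):
    """Model of `add r3, r1, r2`: one instruction, one operation."""
    return r1 + r2


def add_vv(v1, v2):
    """Model of `add.vv v3, v1, v2`: one instruction, N element-wise operations.

    N is the vector length; both source vectors must have the same length.
    """
    if len(v1) != len(v2):
        raise ValueError("source vectors must have the same length")
    # Each result element depends only on the matching source elements,
    # so the N additions are independent of one another.
    return [a + b for a, b in zip(v1, v2)]


r3 = add_scalar(2, 3)
v3 = add_vv([1, 2, 3, 4], [10, 20, 30, 40])
print(r3)  # 5
print(v3)  # [11, 22, 33, 44]
```

In this model a single call to `add_vv` does the work that would otherwise take a loop of `add_scalar` calls, one per element.
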
## Properties of Vector Processors

1. A single vector instruction specifies a lot of work
   - equivalent to executing an entire loop
   - fewer instructions to fetch and decode
2. Computation of each result in the vector is independent of the computation of other results in the same vector
   - deep pipeline without data hazards; high clock rate
3. Hardware checks for data hazards only between vector instructions (once per vector, not per vector element)
4. Memory is accessed with a known pattern
   - elements are all adjacent in memory => highly interleaved memory banks provide high bandwidth
   - access is initiated for the entire vector => high memory latency is amortised (no data caches are needed)
5. Control hazards from loop branches are reduced
   - nonexistent for one vector instruction
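
A rough back-of-the-envelope model of properties 1 and 4 is sketched below. All names and numbers are assumptions chosen for illustration (a hypothetical 4-instruction scalar loop body, a 50-cycle memory latency, and one element delivered per cycle once a vector access has started); none of them come from the slides.

```python
def scalar_loop_instructions(n, instrs_per_iteration=4):
    """Instructions fetched by a scalar loop over n elements.

    instrs_per_iteration is an assumed loop body size (for example
    load, add, store, branch), not a figure from the slides.
    """
    return n * instrs_per_iteration


def vector_access_cycles(n, latency=50):
    """Cycles to load an n-element vector when the access is started once.

    Assumed model: pay the memory latency once, then interleaved banks
    deliver one element per cycle.
    """
    return latency + n


def per_element_access_cycles(n, latency=50):
    """Cycles if every element paid the full latency separately (no cache)."""
    return n * latency


n = 64
print(scalar_loop_instructions(n))   # 256 scalar instructions for the loop
print(vector_access_cycles(n))       # 114
print(per_element_access_cycles(n))  # 3200
```

Under these assumptions the latency is paid once per vector rather than once per element, which is the sense in which it is amortised.
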

## Properties of Vector Processors (cont’d)

- Vector operations: arithmetic (add, sub, mul, div), memory accesses, effective address calculations
- Multiple vector instructions can be in progress at the same time => more parallelism
- Applications that benefit
  - Large scientific and engineering applications (car crash simulations, weather forecasting, …)
  - Multimedia applications

## Basic Vector Architectures

- Vector processor: ordinary pipelined scalar unit + vector unit
- Types of vector processors
  - Memory-memory processors: all vector operations are memory-to-memory (CDC)

## This note was uploaded on 12/13/2011 for the course CPE 631 taught by Professor Staff during the Spring '10 term at University of Alabama - Huntsville.
