College of Information Technology
Master Program in Scientific Computing
Scientific Computing II (SCOM6301)

Introduction to Parallel Computing
Lecture 2: Parallel Computer Architectures - Concepts and Terminology (CH03)

von Neumann Architecture

For over 40 years, virtually all computers have followed a common machine model known as the von Neumann computer, named after the Hungarian-American mathematician John von Neumann. A von Neumann computer uses the stored-program concept: the CPU executes a stored program that specifies a sequence of read and write operations on memory.
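
As a concrete illustration, here is a minimal sketch in C of the fetch-decode-execute cycle at the heart of the stored-program concept. The four-instruction machine and its opcodes are hypothetical, invented only for this example; the essential point is that the program and its data share one memory.

    /* A minimal sketch of the stored-program concept: program and data
     * share one memory, and the CPU repeatedly fetches, decodes, and
     * executes instructions. The 4-instruction ISA is hypothetical. */
    #include <stdio.h>

    enum { LOAD, ADD, STORE, HALT };   /* hypothetical opcodes */

    int main(void) {
        /* Memory holds both the program (cells 0..6) and data (20..22). */
        int mem[32] = {
            LOAD,  20,    /* acc = mem[20]  */
            ADD,   21,    /* acc += mem[21] */
            STORE, 22,    /* mem[22] = acc  */
            HALT
        };
        mem[20] = 2;
        mem[21] = 3;

        int pc = 0, acc = 0;           /* program counter, accumulator */
        for (;;) {                     /* the fetch-decode-execute loop */
            int op = mem[pc++];        /* fetch the opcode              */
            if (op == HALT) break;
            int addr = mem[pc++];      /* fetch the operand address     */
            switch (op) {              /* execute: reads/writes memory  */
                case LOAD:  acc = mem[addr];  break;
                case ADD:   acc += mem[addr]; break;
                case STORE: mem[addr] = acc;  break;
            }
        }
        printf("mem[22] = %d\n", mem[22]);   /* prints 5 */
        return 0;
    }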

Uniprocessor Model

- The processor names bytes, words, etc. in its address space.
  - These represent integers, floats, pointers, arrays, etc.
  - They exist in the program stack, the static region, or the heap (see the sketch after this list).
- Operations include:
  - reads and writes (given an address/pointer);
  - arithmetic and other logical operations.
- Order is specified by the program:
  - a read returns the most recently written data;
  - the compiler and architecture translate high-level expressions into "obvious" lower-level instructions;
  - the hardware executes instructions in the order specified by the compiler.
- Cost: each operation has roughly the same cost.
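
A small sketch of these points in C: one variable in each storage region, a write through a pointer (an address), and a read that returns the most recently written value. The variable names are illustrative only.

    #include <stdio.h>
    #include <stdlib.h>

    static int counter = 0;        /* static region: lives for the whole run */

    int main(void) {
        int local = 41;                    /* program stack: lives until return */
        int *cell = malloc(sizeof *cell);  /* heap: lives until free()          */
        if (cell == NULL) return 1;

        *cell = local + 1;         /* write through a pointer               */
        counter = *cell;           /* read returns the most recently
                                      written value: 42                     */
        printf("counter = %d\n", counter);

        free(cell);
        return 0;
    }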

Uniprocessors in the Real World

Real processors have:
- registers and caches:
  - small amounts of fast memory that store the values of recently used or nearby data;
  - consequently, different memory operations can have very different costs (illustrated in the sketch after this list);
- parallelism:
  - multiple "functional units" that can run in parallel;
  - different instruction orders and mixes therefore have different costs;
- pipelining:
  - a form of parallelism, like an assembly line in a factory.

In theory, compilers understand all of this and can optimize your program; in practice, they don't.
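
The cache effect can be made visible with a rough micro-benchmark like the sketch below. The array size and stride are arbitrary choices, and the exact timings are machine-dependent; the point is only that the same number of reads can cost very different amounts depending on the access pattern. On most machines the strided pass runs noticeably slower, even though both passes perform the same number of reads.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (1 << 24)        /* 16M ints: larger than typical caches */
    #define STRIDE 4096

    int main(void) {
        int *a = malloc((size_t)N * sizeof *a);
        if (a == NULL) return 1;
        for (long i = 0; i < N; i++) a[i] = 1;

        long sum = 0;
        clock_t t0 = clock();
        /* Sequential: one fetched cache line serves several reads. */
        for (long i = 0; i < N; i++)
            sum += a[i];
        clock_t t1 = clock();
        /* Strided: most reads miss the cache. */
        for (long s = 0; s < STRIDE; s++)
            for (long i = s; i < N; i += STRIDE)
                sum += a[i];
        clock_t t2 = clock();

        printf("sum=%ld sequential=%.2fs strided=%.2fs\n", sum,
               (double)(t1 - t0) / CLOCKS_PER_SEC,
               (double)(t2 - t1) / CLOCKS_PER_SEC);
        free(a);
        return 0;
    }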

Scope of Parallelism

The processor, the memory system, and the datapath each present significant performance bottlenecks, and parallelism addresses each of these components in significant ways. Different applications utilize different aspects of parallelism: server applications, for example, utilize high aggregate network bandwidth, while scientific applications typically utilize high processing and memory-system performance. It is important to understand each of these performance bottlenecks.

Architectures: Instruction-Level Parallelism (ILP)

Microprocessor clock speeds have posted impressive gains over the past decades, and higher levels of device integration have made a large number of transistors available. How best to utilize these resources is an important question. Current processors use them to build multiple functional units and execute multiple instructions in the same cycle; the precise manner in which these instructions are selected and executed provides impressive diversity in architectures.
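
A source-level sketch of where ILP comes from: in the second loop below, the two partial sums are independent, so a processor with multiple functional units can overlap their additions, whereas the single-accumulator loop forms one long dependence chain. Whether this pays off in practice depends on the compiler and the processor.

    #include <stdio.h>

    #define N 1000000   /* even, so the unrolled loop covers the array */

    int main(void) {
        static double a[N];
        for (int i = 0; i < N; i++) a[i] = 0.5;

        /* One accumulator: each add depends on the previous one. */
        double s = 0.0;
        for (int i = 0; i < N; i++)
            s += a[i];

        /* Two independent accumulators: the adds can overlap in the
         * pipeline and on separate functional units. */
        double s0 = 0.0, s1 = 0.0;
        for (int i = 0; i < N; i += 2) {
            s0 += a[i];
            s1 += a[i + 1];
        }
        printf("%f %f\n", s, s0 + s1);
        return 0;
    }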