lect16-multiple-threads

Executing Multiple Threads
ECE 752, Prof. Mikko H. Lipasti, University of Wisconsin-Madison

Readings
- Read on your own: Shen & Lipasti, Chapter 11.
- G. S. Sohi, S. E. Breach, and T. N. Vijaykumar. "Multiscalar Processors." Proc. 22nd Annual International Symposium on Computer Architecture, June 1995.
- Dean M. Tullsen, Susan J. Eggers, Joel S. Emer, Henry M. Levy, Jack L. Lo, and Rebecca L. Stamm. "Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor." Proc. 23rd Annual International Symposium on Computer Architecture, May 1996. (B5)
- To be discussed in class: Poonacha Kongetira, Kathirgamar Aingaran, and Kunle Olukotun. "Niagara: A 32-Way Multithreaded Sparc Processor." IEEE Micro, March-April 2005, pp. 21-29.

Executing Multiple Threads (outline)
- Thread-level parallelism
- Synchronization
- Multiprocessors
- Explicit multithreading
- Implicit multithreading: Multiscalar
- Niagara case study

Thread-level Parallelism
- Instruction-level parallelism reaps performance by finding independent work within a single thread.
- Thread-level parallelism reaps performance by finding independent work across multiple threads.
- Historically, this required explicitly parallel workloads, which originated with mainframe time-sharing workloads.
- Even then, CPU speed >> I/O speed, so I/O latency had to be overlapped with "something else" for the CPU to do.
- Hence, the operating system would schedule other tasks/processes/threads that were "time sharing" the CPU (see the sketch after this section).

[Figure: single-user vs. time-shared execution timelines, interleaving CPU bursts, disk accesses, and think time across CPU1-CPU3. Annotations: time sharing reduces the effectiveness of temporal and spatial locality; increasing the number of active threads enlarges the working set, hurting spatial locality, and time dilation of each thread hurts temporal locality.]

Thread-level Parallelism (continued)
- Initially motivated by time sharing of a single CPU; the OS and applications were written to be multithreaded.
- This quickly led to the adoption of multiple CPUs in a single system, enabling a scalable product line from entry-level single-CPU systems to high-end multiple-CPU systems.
- The same applications and OS run seamlessly; adding CPUs increases throughput (performance).
- More recently, multiple threads per processor core: coarse-grained multithreading (aka "switch on event"), fine-grained multithreading, and simultaneous multithreading (a toy thread-selection sketch follows below).
- Multiple processor cores per die: chip multiprocessors (CMP), chip multithreading (CMT).
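To make the "overlap I/O latency with something else" point concrete in software terms, here is a minimal sketch, assuming a POSIX system with pthreads; the 200 ms "disk access" and the busy loop are invented stand-ins, not anything from the lecture. While one thread blocks, the OS schedules the other, so total wall-clock time approaches the longer of the two activities rather than their sum.

/* Sketch: hiding I/O latency behind independent work in another thread.
 * Assumes POSIX (pthreads, usleep); build with: cc overlap.c -lpthread */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static void *io_thread(void *arg)
{
    (void)arg;
    usleep(200 * 1000);                 /* stand-in for a 200 ms disk access */
    puts("I/O thread: disk access complete");
    return NULL;
}

static void *compute_thread(void *arg)
{
    volatile double sum = 0.0;          /* independent, CPU-bound work */
    (void)arg;
    for (long i = 0; i < 50L * 1000 * 1000; i++)
        sum += (double)i;
    puts("compute thread: independent work done");
    return NULL;
}

int main(void)
{
    pthread_t io, compute;

    /* A single thread would leave the CPU idle for the whole disk access;
     * with two threads the OS time-shares the CPU during that latency. */
    pthread_create(&io, NULL, io_thread, NULL);
    pthread_create(&compute, NULL, compute_thread, NULL);
    pthread_join(io, NULL);
    pthread_join(compute, NULL);
    return 0;
}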
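The explicit-multithreading flavors listed above differ mainly in when the core switches which thread it fetches from. Below is a toy thread-selection sketch of the two simpler policies; the stall pattern, thread count, and cycle count are invented for illustration, and simultaneous multithreading (which mixes instructions from several threads in the same cycle) is not modeled.

/* Toy thread-selection policies for a multithreaded fetch stage.
 * Fine-grained: rotate among ready threads every cycle.
 * Coarse-grained ("switch on event"): stay on one thread until it stalls. */
#include <stdio.h>
#include <stdbool.h>

#define NTHREADS 4

/* Pretend thread t has a long-latency stall (e.g., cache miss) this cycle. */
static bool stalled(int t, int cycle) { return (cycle + t) % 7 == 0; }

static int fine_grained(int cycle)
{
    for (int i = 0; i < NTHREADS; i++) {       /* round-robin, skip stalled */
        int t = (cycle + i) % NTHREADS;
        if (!stalled(t, cycle))
            return t;
    }
    return -1;                                 /* all threads stalled */
}

static int coarse_grained(int cycle)
{
    static int current = 0;
    if (stalled(current, cycle))               /* switch only on an event */
        current = (current + 1) % NTHREADS;
    return current;
}

int main(void)
{
    for (int cycle = 0; cycle < 12; cycle++)
        printf("cycle %2d: fine-grained fetches T%d, coarse-grained fetches T%d\n",
               cycle, fine_grained(cycle), coarse_grained(cycle));
    return 0;
}

Real hardware additionally replicates per-thread state (program counters, rename maps, return stacks) to make any of these policies work; the sketch ignores all of that.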

Thread-level Parallelism: limits
- Parallelism is limited by sharing.
- Amdahl's law: access to shared state must be serialized, and the serial portion limits parallel speedup (a short numeric sketch follows).
- Many important applications share (lots of) state.
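Amdahl's law makes the cost of that serialization concrete: with serial fraction s, speedup on N processors is 1 / (s + (1 - s)/N), so speedup can never exceed 1/s. A minimal numeric sketch, using illustrative serial fractions and processor counts (not values from the lecture):

/* Amdahl's law: speedup is limited by the serial (serialized-sharing) fraction. */
#include <stdio.h>

static double amdahl_speedup(double serial_fraction, int n_processors)
{
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_processors);
}

int main(void)
{
    const double serial[] = { 0.01, 0.05, 0.20 };   /* example serial fractions */
    const int    procs[]  = { 2, 8, 32, 1024 };     /* example processor counts */

    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 4; j++)
            printf("s=%.2f, N=%4d -> speedup %.1fx\n",
                   serial[i], procs[j], amdahl_speedup(serial[i], procs[j]));

    /* As N grows, speedup approaches 1/s: the serial portion dominates. */
    return 0;
}

Even a 5% serial portion caps speedup at 20x regardless of processor count, which is why applications that share lots of state scale poorly.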