Yelick_lecture10_upc_kay11

12/27/11 CS267 Lecture: UPC

CS 267
Unified Parallel C (UPC)
Kathy Yelick
http://upc.lbl.gov
http://upc.gwu.edu

What’s Wrong with MPI Everywhere
• We can run 1 MPI process per core (“flat MPI”)
  - This works now on dual and quad-core machines
  - It will work on 12-24 core machines like Hopper as well
• What are the problems?
  - Latency: some copying required by semantics
  - Memory utilization: partitioning data for separate address space requires some replication
    • How big is your per core subgrid? At 10x10x10, over 1/2 of the points are surface points, probably replicated
    • Weak scaling: success model for the “cluster era;” will not be for the many core era -- not enough memory per core
  - Heterogeneity: MPI per CUDA thread-block?
• Approaches
  - MPI + X, where X is OpenMP, Pthreads, OpenCL, TBB,…
  - A PGAS language like UPC, Co-Array Fortran, Chapel or Titanium

PGAS Languages
• Global address space: thread may directly read/write remote data
  • Hides the distinction between shared/distributed memory
• Partitioned: data is designated as local or global
  • Does not hide this: critical for locality and scaling
[Figure: global address space spanning threads p0, p1, ..., pn; each thread holds entries labeled x, y, l, and g]
• UPC, CAF, Titanium: Static parallelism (1 thread per proc)
  • Does not virtualize processors
• X10, Chapel and Fortress: PGAS, but not static (dynamic threads)

UPC Outline
1. Background
2. UPC Execution Model
3. Basic Memory Model: Shared vs. Private Scalars
4. Synchronization
5. Collectives
6. Data and Pointers
7. Dynamic Memory Management
8. Performance
9. Beyond UPC

Context
• Most parallel programs are written using either:
  - Message passing with a SPMD model (MPI)
    • Scales easily on clusters
  - Shared memory with threads in OpenMP, Threads
    • In practice, requires shared memory hardware
• Partitioned Global Address Space (PGAS) Languages take the best of both:
  - Global address space like threads (programmability)
  - SPMD parallelism like most MPI programs (performance)
  - Local/global distinction, i.e., layout matters (performance)

History of UPC
• Initial Tech. Report from IDA in collaboration with LLNL and UCB in May 1999 (led by IDA).
  - Based on Split-C (UCB), AC (IDA) and PCP (LLNL)
• UPC consortium participants (past and present) are:
  - ARSC, Compaq, CSC, Cray Inc., Etnus, GMU, HP, IDA CCS, Intrepid Technologies, LBNL, LLNL, MTU, NSA, SGI, Sun Microsystems, UCB, U. Florida, US DOD
  - UPC is a community effort, well beyond UCB/LBNL
• Design goals: high performance, expressive, consistent with C goals, …, portable
• UPC Today
  - Multiple vendor and open compilers (Cray, HP, IBM, SGI, gcc-upc from Intrepid, Berkeley UPC)
  - “Pseudo standard” by moving into gcc trunk
  - Most widely used on irregular / graph problems today

This note was uploaded on 12/27/2011 for the course CMPSC 240A taught by Professor Gilbert during the Fall '09 term at UCSB.
