CMSC714
Defect Patterns in High Performance Computing
Taiga Nakamura (edits by A. Sussman)
University of Maryland

Notes
• MPI project posted, due Wed., Sept. 28, 6PM, via email
  – bug cluster account questions?
• Send questions for readings, starting Tuesday
  – additional readings posted soon

Background
• Debugging and testing parallel code is hard
  – How can bugs be prevented or found/fixed effectively?
• "Knowing" common defects (bugs) will reduce the time spent debugging
  – Novice developers can learn how to detect/prevent them
  – Someone may develop tools and/or improve languages
• The HPCS project built "defect patterns" for high performance computing (HPC)
  – Based on the empirical data collected in various studies
  – Examples in this presentation are shown in C + MPI (Message Passing Interface)

Differentiating Factors of HPC
• Platform: The computational power of today's HPC systems is achieved by massively parallel systems. Writing a scalable program on these systems is difficult.
• Performance: Slow execution speed can be a defect even if the output is correct. Achieving good performance on multiple processors is often difficult.
• Language: Developers usually use special HPC languages and libraries (MPI, OpenMP, UPC, CAF, Titanium, ...), each with its own way of handling issues such as communication and synchronization. The SPMD (Single Program, Multiple Data) approach is dominant.
• Developers: Software is often developed by scientists and grad students without formal training in software engineering. Traditional software engineering processes and practices are not necessarily used in HPC projects.
• Tools: The use of modern tools (IDEs, graphical debuggers, defect detection tools, profiling tools, etc.) is not as common as in other domains.
• Portability: Portability is very important for HPC applications, since they must run on various platforms depending on the computational resources available.
• Validation: Given the nature of HPC applications, the correct outputs are not always known, so debugging is particularly challenging and costly.

Example Problem
• Consider the following problem:
  1. N cells, each of which holds an integer [0..9]
     • E.g., cell[0]=2, cell[1]=1, …, cell[N-1]=3
  2. In each step, cells are updated using the values of neighboring cells
     • cell_next[x] = (cell[x-1] + cell[x+1]) mod 10
     • cell_next[0] = (3+1), cell_next[1] = (2+6), …
     • (Assume the last cell is adjacent to the first cell)
  3. Repeat step 2 for a given number of steps (steps)

A sequence of N cells: 2 1 6 8 7 1 0 2 4 5 1 … 3

What defects can appear when implementing a parallel solution in MPI?

First, Sequential Solution
• Approach to implementation
  – Use an integer array buffer to represent the cell values
  – Use a second array nextbuffer to store the values in the next step, and swap the buffers
  – Straightforward implementation!...
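
The sequential approach described on the last slide (one integer array for the current cell values, a second array for the next step, and a swap after each step) might be sketched in C roughly as follows. The function name simulate_cells, its signature, and the copy-back at the end are assumptions made for illustration; they are not taken from the slides, whose own code is cut off in this preview.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the slides): advance a ring of n cells
 * by `steps` updates, leaving the result in `cell`.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated.
 */
int simulate_cells(int *cell, int n, int steps)
{
    assert(cell != NULL && n > 0 && steps >= 0);

    int *buffer = cell;                                /* current values */
    int *nextbuffer = malloc((size_t)n * sizeof *nextbuffer);
    if (nextbuffer == NULL)
        return -1;

    for (int s = 0; s < steps; s++) {
        for (int x = 0; x < n; x++) {
            /* The last cell is adjacent to the first, so indices wrap. */
            int left  = buffer[(x + n - 1) % n];
            int right = buffer[(x + 1) % n];
            nextbuffer[x] = (left + right) % 10;
        }
        /* Swap the roles of the two buffers instead of copying. */
        int *tmp = buffer;
        buffer = nextbuffer;
        nextbuffer = tmp;
    }

    if (buffer != cell) {
        /* Odd number of steps: the result sits in the scratch array. */
        memcpy(cell, buffer, (size_t)n * sizeof *cell);
        free(buffer);
    } else {
        free(nextbuffer);
    }
    return 0;
}
```

For instance, calling simulate_cells on the five-cell ring {2, 1, 6, 8, 3} with steps = 1 gives {4, 8, 9, 9, 0}: the first cell becomes (3+1) mod 10 and the second (2+6) mod 10, matching the update rule above. Swapping pointers rather than copying the array each step keeps the per-step cost to a single pass over the cells.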
This note was uploaded on 01/12/2012 for the course CMSC 714 taught by Professor Staff during the Fall '07 term at Maryland.