Open MPI: A Flexible High Performance MPI

Richard L. Graham (1), Timothy S. Woodall (1), and Jeffrey M. Squyres (2)

(1) Advanced Computing Laboratory, Los Alamos National Lab
    {rlgraham,twoodall}@lanl.gov
(2) Open System Laboratory, Indiana University
    [email protected]

Abstract. A large number of MPI implementations are currently available, each of which emphasizes different aspects of high-performance computing or is intended to solve a specific research problem. The result is a myriad of incompatible MPI implementations, all of which require separate installation, and the combination of which presents significant logistical challenges for end users. Building upon prior research, and influenced by experience gained from the code bases of the LAM/MPI, LA-MPI, FT-MPI, and PACX-MPI projects, Open MPI is an all-new, production-quality MPI-2 implementation that is fundamentally centered around component concepts. Open MPI provides a unique combination of novel features previously unavailable in an open-source, production-quality implementation of MPI. Its component architecture provides both a stable platform for third-party research and enables the run-time composition of independent software add-ons. This paper presents a high-level overview of the goals, design, and implementation of Open MPI, as well as performance results for its point-to-point implementation.

1 Introduction

The landscape of high-performance computer systems is changing rapidly, with systems comprised of thousands to hundreds of thousands of processors in use today. These systems vary from tightly integrated high-end systems to clusters of PCs and workstations. Grid and meta-computing add twists such as a changing computing environment, computing across authentication domains, and non-uniform computing facilities, such as variations in processor type and in the bandwidths and latencies between processors.
This wide variety of platforms and environments poses many challenges for a production-grade, high-performance, general-purpose MPI implementation, requiring it to provide a high degree of flexibility along many problem axes. It needs to provide tunable support for traditional high-performance, scalable communications algorithms, as well as to address a variety of failure scenarios. In addition, items such as process control, resource exhaustion, latency awareness and management, fault tolerance, and optimized collective operations for common communication patterns need to be dealt with.

These types of issues have been addressed in one way or another by different projects, but little attention has been given to dealing with various fault scenarios. In particular, network-layer transmission errors, which have been considered highly improbable for moderate-sized clusters, cannot be ignored when dealing with large-scale computations [4]. This is particularly true when O/S bypass protocols are used for high-performance messaging on systems that do not have end-to-end hardware data integrity. In addition, the probability that a parallel