In time 1 n memory readwrite collisions must be



…mance
•  more portable; leave low-level details to the compiler

NAS MG: Operational View
•  Data structures:
   –  3D arrays & 3D hierarchical arrays (2D in my pictures)
   –  3D sparse arrays can also be useful
•  4 primary kernels:
   –  each computed using 27-point stencils
      •  resid: compute residual
      •  psinv: compute approximate inverse
      •  rprj3: projection from fine grid to coarse
      •  interp: interpolation from coarse grid to fine
   –  periodic boundary conditions
•  computation of approximate norms
   –  norm2u3: approximate L2 & uniform norms
•  initialization, output

NAS MG: Parallel Implementation
•  Arrays typically use block distributions
   –  good load balance (computation is homogenous)
   –  ghost cells allocated for caching neighbors’ values
•  Communication Idioms:
   –  4 kernels require point-to-point communication
   –  toroidal communication required for boundaries
   –  global reductions required to compute norms
   –  reductions useful during initialization as well

Q: In a Shared-Memory setting, which would you use from the perspective of memory?
Reduces opportunity for false sharing

CSEP 524: Parallel Computation Winter 2013: Chamberlain 53

Q: In the setting of MG, which would you use?
Best surface to volume ratio (good for stencil computations)
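The 27-point stencils and periodic boundary conditions named under the NAS MG kernels can be illustrated with a small serial sketch. This is not the benchmark's code: the function names `apply_stencil27` and `resid`, the nested-list grid, and the weight table `w` are assumptions made for illustration. Weights are indexed by how many of the three offset components are nonzero (center, face, edge, corner), and wraparound indexing stands in for the toroidal boundary exchange a distributed version would need.

```python
from itertools import product


def apply_stencil27(u, w):
    """Apply a 27-point stencil to a cubic n x n x n grid with periodic wrap.

    u is a nested list indexed u[i][j][k]. w[c] is the (hypothetical) weight
    for neighbors whose offset has c nonzero components:
    0 = center, 1 = face, 2 = edge, 3 = corner.
    """
    n = len(u)
    offsets = list(product((-1, 0, 1), repeat=3))  # all 27 neighbor offsets
    out = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i, j, k in product(range(n), repeat=3):
        total = 0
        for di, dj, dk in offsets:
            weight = w[abs(di) + abs(dj) + abs(dk)]
            # The modulo implements the periodic (toroidal) boundary.
            total += weight * u[(i + di) % n][(j + dj) % n][(k + dk) % n]
        out[i][j][k] = total
    return out


def resid(v, u, w):
    """Residual r = v - A(u), with A taken to be the stencil above (an assumption)."""
    n = len(u)
    au = apply_stencil27(u, w)
    return [[[v[i][j][k] - au[i][j][k] for k in range(n)]
             for j in range(n)]
            for i in range(n)]
```

In a block-distributed version, each locale would run the inner loop over its own block and read neighbor values from ghost cells rather than through the modulo; only the blocks on the domain edge would need the wraparound communication.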

This document was uploaded on 04/04/2014.
