Lecture 17 Irregular problems (II)

11/17/06 Scott B. Baden / CSE 260 / Fall 2006

Partitioning
• How do we divide up the computation and assign it to processors?
• The process is called decomposition or partitioning
• Related issue: processor mapping

Unstructured meshes
• “Unstructured” meshes are another type of non-uniformly spaced mesh
• Useful when the boundary or surface of the object is complicated
Figure credit: Randy Bank, UCSD

A typical irregular mesh sweep loop
    float x[n_node], y[n_node]
    int E0[n_edge], E1[n_edge]
    for i = 1 : n_edge              // Loop over all edges
        int n0 = E0[i]
        int n1 = E1[i]
        y[n0] += f(x[n0], x[n1])
        y[n1] += g(x[n0], x[n1])
    end for
(Figure: edge i joins nodes n0 and n1)

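
The sweep loop above can be tried on a tiny mesh. The following is a minimal Python sketch, not the course's code: the 4-node mesh, the node values, and the bodies of `f` and `g` are hypothetical placeholders (the slides leave `f` and `g` unspecified), and indices are 0-based.

```python
def sweep(x, E0, E1, f, g):
    """Edge sweep: each edge adds a contribution to both of its endpoint nodes."""
    y = [0.0] * len(x)
    for n0, n1 in zip(E0, E1):      # loop over all edges
        y[n0] += f(x[n0], x[n1])    # indirect access through the edge lists
        y[n1] += g(x[n0], x[n1])
    return y

# Hypothetical stand-ins for the unspecified f and g.
def f(a, b):
    return b - a

def g(a, b):
    return a - b

# Hypothetical mesh: 4 nodes, 4 edges given as endpoint lists E0[i], E1[i].
x = [1.0, 2.0, 4.0, 8.0]
E0 = [0, 1, 2, 0]
E1 = [1, 2, 3, 2]

y = sweep(x, E0, E1, f, g)
print(y)
```

Which entries of `x` and `y` are touched depends entirely on the contents of `E0` and `E1`, so the access pattern is known only once the mesh has been read in.
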
Run time support for unstructured meshes
• Data dependences across processor boundaries
• Irregular communication patterns
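
Both bullets can be made concrete by listing the edges that cross processor boundaries. The sketch below is illustrative only: the 6-node mesh, the `owner` array (node to processor), and the helper names `cut_edges` and `nodes_needed` are assumptions introduced here, not part of the lecture.

```python
def cut_edges(E0, E1, owner):
    """Edges whose two endpoints are owned by different processors."""
    return [(a, b) for a, b in zip(E0, E1) if owner[a] != owner[b]]

def nodes_needed(E0, E1, owner, p):
    """Off-processor nodes that processor p reads, grouped by owning processor."""
    needed = {}
    for a, b in zip(E0, E1):
        for mine, other in ((a, b), (b, a)):
            if owner[mine] == p and owner[other] != p:
                needed.setdefault(owner[other], set()).add(other)
    return {q: sorted(nodes) for q, nodes in needed.items()}

# Hypothetical mesh and assignment of 6 nodes to 3 processors.
E0 = [0, 1, 2, 0, 3, 4]
E1 = [1, 2, 3, 2, 4, 5]
owner = [0, 0, 1, 1, 2, 0]

cuts = cut_edges(E0, E1, owner)
need0 = nodes_needed(E0, E1, owner, 0)
need1 = nodes_needed(E0, E1, owner, 1)
print(cuts, need0, need1)
```

In this example processor 0 needs one node from processor 1 and one from processor 2, while processor 1 needs two nodes from processor 0 and one from processor 2: the amount of data exchanged differs from one pair of processors to the next, which is what makes the communication pattern irregular.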

Implementation issues
• Because the mesh does not have a regular structure, we must keep processor mapping information for each point in the mesh
• Compare with a uniform mesh, in which the mapping information is much coarser grained: mapping is handled at the level of a subdomain comprising many points

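
The contrast between the two bullets can be sketched as follows. This is a hedged illustration: the block decomposition of a 1D uniform mesh and the particular `irregular_owner` array are assumptions chosen for the example.

```python
def block_owner(i, n, nprocs):
    """Uniform mesh: the owner of point i follows from a formula, no table needed."""
    block = -(-n // nprocs)          # ceil(n / nprocs) points per subdomain
    return i // block

n, nprocs = 8, 3

# Uniform case: the whole mapping is described by the two numbers (n, nprocs).
uniform_owner = [block_owner(i, n, nprocs) for i in range(n)]

# Irregular case (hypothetical partitioner output): no formula reproduces this,
# so one entry must be stored for every mesh point.
irregular_owner = [2, 0, 0, 1, 2, 1, 0, 1]

print(uniform_owner)
print(len(irregular_owner), "entries stored for", n, "points")
```
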
Requirements
• We don’t know the assignment of nodes to processors until run time
• Assume that the mesh doesn’t change once initialized
• We need to include ghost cells, and want to keep the same loop structure
• We execute loops using local indices
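
The requirements above can be sketched as a one-time localization step followed by the unchanged sweep loop. This is a minimal sketch under stated assumptions, not the lecture's run time system: the helper `localize`, the layout (owned nodes first, ghost cells appended after them), the choice to let each processor process every edge touching one of its nodes, and the sample data are all hypothetical. Filling the ghost cells is done here by a direct copy that stands in for interprocessor communication.

```python
def localize(E0, E1, owner, p):
    """Build processor p's local view: owned nodes, then ghosts, edges renumbered."""
    owned = [g for g, q in enumerate(owner) if q == p]
    # Keep every edge that touches at least one node owned by p.
    edges = [(a, b) for a, b in zip(E0, E1) if owner[a] == p or owner[b] == p]
    ghosts = sorted({n for e in edges for n in e if owner[n] != p})
    local_to_global = owned + ghosts
    to_local = {g: l for l, g in enumerate(local_to_global)}
    LE0 = [to_local[a] for a, _ in edges]
    LE1 = [to_local[b] for _, b in edges]
    return local_to_global, len(owned), LE0, LE1

# Hypothetical mesh, assignment, and node values.
E0 = [0, 1, 2, 0, 3, 4]
E1 = [1, 2, 3, 2, 4, 5]
owner = [0, 0, 1, 1, 2, 0]
x_global = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]

local_to_global, n_owned, LE0, LE1 = localize(E0, E1, owner, 0)

# Stand-in for communication: copy owned and ghost values into local storage.
x_local = [x_global[g] for g in local_to_global]

# Same loop structure as the global sweep, now over local indices only.
y_local = [0.0] * len(x_local)
for n0, n1 in zip(LE0, LE1):
    y_local[n0] += x_local[n1] - x_local[n0]   # placeholder for f
    y_local[n1] += x_local[n0] - x_local[n1]   # placeholder for g

print(local_to_global, y_local[:n_owned])
```

Only the first `n_owned` entries of `y_local` are complete results; the ghost entries hold partial sums that belong to other processors. Because the mesh is assumed not to change after initialization, the localization cost is paid once and reused across sweeps.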

Processor Mapping
• Invoke a partitioner to map global indices to (processor, local index) pairs

Global translation
• Each global index is mapped to a processor and a local index
• A global translation table expresses this mapping
(Figure: translation table with Global and Local index columns)
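
A global translation table of this kind can be sketched in a few lines. The helper name `build_translation_table` and the sample `owner` array (a hypothetical partitioner result) are assumptions for illustration; local indices are simply handed out in increasing order on each processor.

```python
def build_translation_table(owner):
    """Map each global index to a (processor, local index) pair."""
    next_local = {}                  # next free local index on each processor
    table = []
    for p in owner:
        l = next_local.get(p, 0)
        table.append((p, l))
        next_local[p] = l + 1
    return table

owner = [0, 0, 1, 1, 2, 0]           # hypothetical partitioner output
table = build_translation_table(owner)

for g, (p, l) in enumerate(table):
    print(f"global {g} -> processor {p}, local {l}")
```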

Global translation
• Because we require mapping information for each individual mesh node, the translation table has one entry per node; for a large mesh it cannot fit in the memory of a single processor
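
One possible way around this limit, which this excerpt does not itself describe, is to split the table into contiguous blocks and store one block per processor, so that the home of any entry follows from a formula. The sketch below simulates that idea in a single process; `distribute_table`, `lookup`, and the sample table are hypothetical, and a real lookup of a remote entry would require a message to the home processor.

```python
def distribute_table(table, nprocs):
    """Split the translation table into contiguous blocks, one per processor."""
    block = -(-len(table) // nprocs)             # ceil(len / nprocs)
    pages = [table[p * block:(p + 1) * block] for p in range(nprocs)]
    return pages, block

def lookup(pages, block, g):
    """Find the entry for global index g via its home processor."""
    home = g // block                            # processor storing this entry
    return pages[home][g % block]                # remote read in a real system

# Hypothetical table of (processor, local index) pairs for 6 nodes.
table = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (0, 2)]
pages, block = distribute_table(table, 3)

print(lookup(pages, block, 5))
```

Each processor now stores only about `len(table) / nprocs` entries, at the price of an extra communication step whenever an entry held elsewhere is needed.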
