Ultrascalable Implicit Finite Element Analyses in Solid Mechanics with over a Half a Billion Degrees of Freedom (excerpts)
Mark F. Adams
H.H. Bayraktar, T.M. Keaveny, P. Papadopoulos and Atul Gupta
Trabecular Bone
[Figure labels: Cortical bone, Trabecular bone]
5-mm Cu

Computation on meshes, sparse matrices, and graphs
Some slides are from David Culler,
Jim Demmel, Bob Lucas, Horst Simon,
Kathy Yelick, et al., UCB CS267
Parallelizing Stencil Computations
Parallelism is simple
Grid is a regular

Breadth First Search
[Figure: example graph; vertex labels s, 2, 3, 4, 5, 6, 7, 8, 9]

Breadth First Search
Shortest path from s
[Figure: same graph, s labeled with distance 0; legend: Undiscovered, Discovered, Top of queue, Finished]
Queue: s

Breadth First Search
[Figure: same graph, distance labels 0 and 1; legend: Undiscovered, Discovered, Top of queue]
Queue: s 2
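
The queue-driven traversal traced in the frames above (Queue: s, then Queue: s 2) can be sketched in a few lines. This is a minimal sequential sketch, not the slides' own code: the function name `bfs_distances` and the small adjacency list are hypothetical, and the example graph is not the one drawn in the slides.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return the shortest-path distance (in edges) from source to every reachable vertex."""
    dist = {source: 0}            # discovered vertices and their distances
    queue = deque([source])       # discovered vertices whose neighbors are not yet scanned
    while queue:
        u = queue.popleft()       # vertex at the top of the queue
        for v in adj[u]:
            if v not in dist:     # v is still undiscovered
                dist[v] = dist[u] + 1
                queue.append(v)
        # all neighbors of u scanned: u is finished
    return dist

# Hypothetical undirected graph, given as adjacency lists.
adj = {
    "s": ["2", "3"],
    "2": ["s", "4"],
    "3": ["s", "4"],
    "4": ["2", "3", "5"],
    "5": ["4"],
}
distances = bfs_distances(adj, "s")
```

Each vertex enters the queue at most once, so the sketch does work proportional to the number of vertices plus edges.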

CS 140 Assignment 3:
Simulating the N-body Problem
Assigned January 19, 2010
Due by 11:59 pm Monday, January 31
This assignment is to write an MPI program to simulate a large number of astronomical bodies
(stars and planets) moving under the influence of gr

CS 140 Assignment 5:
NFA Based String Matching
Assigned February 10, 2010
Due by 11:59 pm Wednesday, February 24
The purpose of this assignment is for you to gain experience in a common real-world
scenario: You are given an existing sequential program, an

CS 140 Assignment 5:
Life in the Fast Lane
Assigned February 7, 2010
Due by 11:59 pm Friday, February 18
The object of this problem is to implement a cellular automaton called the Game of Life in
Cilk+, and to tune it to get maximum performance. The progr

CS 140 Midterm 1 - 5 February 2009
Problem 1 [20 points total] Each of p processors starts out with the coordinates (x, y) of a single point in the plane. Our goal is to compute the center of gravity (cx, cy) of the p points, and the average distance avgd

CS 140 Midterm 2 - 3 March 2010
Name
Perm#
Problem 1 [20 points total] This problem is about maximal
independent sets in the graph shown at right, which has 12 vertices
connected in a cycle with 12 edges. For each part, you are to first
identify a particu

TACC/NPACI IBM Regatta-HPC (Power4) Overview
Chona Guiang, Kent Milfeld, Avi Purkayastha and Jay Boisseau August 21, 2002
Texas Advanced Computing Center
The University of Texas at
Background: TACC As An NPACI Resource Partner The Texas Advanced Computing

CS 140 Sample Midterm 2 Questions - March 2010 You may use your textbook and notes, but no other books or computers.
Problem 1 [30 points total]: Short answers. (1a) [10 points] Draw a graph that has two maximal independent sets of different sizes, and i

CS 140 : Non-numerical Examples with Cilk+
Divide and conquer paradigm for Cilk+
Quicksort
Mergesort
Thanks to Charles E. Leiserson for some of these slides
Work and Span (Recap)
$T_P$ = execution time on P processors
$T_1$ = work
$T_\infty$ = span*
Speedup on p proc

CS 140 Assignment 4:
Cilkified Inner Products
Assigned February 3, 2010
Due by 11:59 pm Wednesday, February 10
The purpose of this assignment is to gain familiarity with Cilk+ constructs and tools, as
well as to think about different ways of parallelizing

Complexity measures for parallel computation
Problem parameters:
n: index of problem size
p: number of processors
Algorithm parameters:
$t_p$: running time on p processors
$t_1$: time on 1 processor = sequential time = work
$t_\infty$: time on unlimited pro
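
To make the parameters above concrete, here is a small sketch of how they combine. It assumes the standard work/span relations, which this excerpt does not spell out: speedup is the sequential time divided by the time on p processors, parallelism is the sequential time divided by the time on unlimited processors, and the time on p processors can be no smaller than either (work / p) or the unlimited-processor time. All function names and the sample numbers are hypothetical.

```python
def speedup(t1, tp):
    """Speedup on p processors: sequential time divided by time on p processors."""
    return t1 / tp

def parallelism(t1, t_inf):
    """Work divided by the time on unlimited processors."""
    return t1 / t_inf

def lower_bound_tp(t1, t_inf, p):
    """Assumed standard bounds: time on p processors is at least work/p and at least t_inf."""
    return max(t1 / p, t_inf)

# Hypothetical measurements: 1000 time units of work, 10 units on unlimited processors.
work, t_unlimited = 1000.0, 10.0
bound_4 = lower_bound_tp(work, t_unlimited, 4)        # limited by work / p
bound_1000 = lower_bound_tp(work, t_unlimited, 1000)  # limited by t_unlimited
max_useful_speedup = parallelism(work, t_unlimited)
```

With these sample numbers, adding processors beyond about 100 cannot help, because the unlimited-processor time becomes the binding bound.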

Complexity Measures for Parallel Computation
Several possible models!
Execution time and parallelism: Work / Span Model
Total cost of moving data: Communication Volume Model
Detailed models that try to capture time for moving data: Latency /

A graph problem: Maximal Independent Set
Graph with vertices $V = \{1, 2, \ldots, n\}$
A set S of vertices is independent if no
two vertices in S are neighbors.
An independent set S is maximal if it is
impossible to add another vertex and
stay independent
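
The two definitions above can be illustrated with a simple sequential greedy pass. This is a sketch under stated assumptions, not the parallel algorithm the slides may go on to describe: the function names, the fixed vertex order, and the 6-vertex cycle used as input are all hypothetical.

```python
def greedy_maximal_independent_set(adj):
    """Scan vertices in increasing order; keep a vertex if none of its neighbors was kept."""
    selected = set()
    for v in sorted(adj):
        if all(u not in selected for u in adj[v]):
            selected.add(v)
    return selected

def is_independent(adj, s):
    """No two vertices in s are neighbors."""
    return all(u not in s for v in s for u in adj[v])

def is_maximal(adj, s):
    """Every vertex outside s has a neighbor in s, so nothing more can be added."""
    return all(any(u in s for u in adj[v]) for v in adj if v not in s)

# Hypothetical input: a cycle on vertices 1..6, each vertex adjacent to its two cycle neighbors.
n = 6
cycle = {v: [(v - 2) % n + 1, v % n + 1] for v in range(1, n + 1)}
mis = greedy_maximal_independent_set(cycle)
```

A different vertex order can yield a different maximal independent set, possibly of a different size; maximal does not mean largest.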

CS 140 : Matrix multiplication
Matrix multiplication I : parallel issues
Matrix multiplication II: cache issues
Thanks to Jim Demmel and Kathy Yelick (UCB) for some of these slides
Communication volume model
Network of p processors
Ea

CS 140 : Numerical Examples on Shared Memory with Cilk+
Matrix-matrix multiplication
Matrix-vector multiplication
Hyperobjects
Thanks to Charles E. Leiserson for some of these slides
Work and Span (Recap)
$T_P$ = execution time on P processors
$T_1$ = work
T