161202-ForkJoinParallelism.pdf

# If you have studied combinational hardware circuits


If you have studied combinational hardware circuits, this model is strikingly similar to the dags that arise in that setting. For circuits, work is typically called the size of the circuit (i.e., the amount of hardware) and span is typically called the depth of the circuit (i.e., the time, in units of “gate delay,” to produce an answer).

With basic fork-join divide-and-conquer parallelism, the execution dags are quite simple: The O(1) work to set up two smaller subproblems is one node in the dag. This node has two outgoing edges to two new nodes that start doing the two subproblems. (The fact that one subproblem might be done by the same thread is not relevant here. Nodes are not threads. They are O(1) pieces of work.) The two subproblems will lead to their own dags. When we join on the results of the subproblems, that creates a node with incoming edges from the last nodes for the subproblems. This same node can do an O(1) amount of work to combine the results. (If combining results is more expensive, then it needs to be represented by more nodes.)

Overall, then, the dag for a basic parallel reduction would look like this:

CPEN 221 – Fall 2016

Figure 2: An example dag and the path (see thicker blue arrows) that determines its span.

Figure 3: Simple parallel reduction and combination
The root node represents the computation that divides the array into two equal halves. The bottom node represents the computation that adds together the two sums from the halves to produce the final answer. The base cases represent reading from a one-element range, assuming no sequential cut-off. A sequential cut-off “just” trims out levels of the dag, which removes most of the nodes but affects the dag’s longest path by “only” a constant amount. Note that this dag is a conceptual description of how a program executes; the dag is not a data structure that gets built by the program.

From the picture, it is clear that a parallel reduction is basically described by two balanced binary trees whose size is proportional to the input data size. Therefore T_1 is O(n) (there are approximately 2n nodes) and T_∞ is O(log n) (the height of each tree is approximately log n).

For the particular reduction we have been studying (summing an array), the following figure visually depicts the work being done for an example with 8 elements. The work in the nodes in the top half is to create two subproblems. The work in the nodes in the bottom half is to combine two results.

The dag model of parallel computation is much more general than for simple fork-join algorithms. It describes all the work that is done and the earliest that any piece of that work could begin. To repeat, T_1 and T_∞ become simple graph properties: the number of nodes and the length of the longest path, respectively.
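To make the "simple graph properties" view concrete, here is a minimal Python sketch that builds the dag for a reduction explicitly and then measures it. This is purely illustrative: as noted above, a real program never builds the dag. The function names (`build_reduction_dag`, `work`, `span`) are hypothetical. The counting convention is also an assumption of the sketch: one node per divide step, one per one-element base case, one per combine step, with span counted in nodes along the longest path. The exact constants therefore depend on that convention; what matters is that work grows linearly in n and span grows logarithmically.

```python
def build_reduction_dag(n):
    """Build the dag for a fork-join reduction over n elements (no cut-off).

    Returns (edges, source), where edges maps each node id to the list of
    its successor nodes and source is the root (first divide) node.
    Assumed convention: one node per divide, per base case, per combine.
    """
    edges = {}

    def new_node():
        node = len(edges)      # next unused integer id
        edges[node] = []
        return node

    def build(lo, hi):
        """Return (first, last) node of the sub-dag for the range [lo, hi)."""
        if hi - lo == 1:
            leaf = new_node()              # base case: read one element
            return leaf, leaf
        divide = new_node()                # O(1) work: set up two subproblems
        mid = (lo + hi) // 2
        left_first, left_last = build(lo, mid)
        right_first, right_last = build(mid, hi)
        combine = new_node()               # O(1) work: add the two partial sums
        edges[divide] += [left_first, right_first]   # fork
        edges[left_last].append(combine)             # join
        edges[right_last].append(combine)
        return divide, combine

    source, _sink = build(0, n)
    return edges, source


def work(edges):
    """Work (T_1): the total number of nodes in the dag."""
    return len(edges)


def span(edges, source):
    """Span (T_inf): the number of nodes on the longest path from source."""
    memo = {}

    def longest(node):
        if node not in memo:
            memo[node] = 1 + max((longest(s) for s in edges[node]), default=0)
        return memo[node]

    return longest(source)
```

Under this convention, an 8-element input gives 22 nodes and a longest path of 7 nodes (three divide levels, one base case, three combine levels). Doubling the input roughly doubles the work but adds only two nodes to the span.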
