


Reading

Shared Memory Consistency Models: A Tutorial
–  Sarita V. Adve (“Queen of MCMs”), Kourosh Gharachorloo
–  http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-95-7.pdf

The Java Memory Model
–  Jeremy Manson, William Pugh, Sarita Adve
–  http://dl.dropbox.com/u/1011627/journal.pdf (paper)
–  http://cseweb.ucsd.edu/classes/fa05/cse231/Fish.pdf (slides)
–  See also: http://www.cs.umd.edu/~pugh/java/memoryModel/ (resources)

Foundations of the C++ Concurrency Memory Model
–  Hans Boehm, Sarita Adve
–  http://www.hpl.hp.com/techreports/2008/HPL-2008-56.html
–  See also: http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/

The C# Memory Model in Theory and Practice
–  http://msdn.microsoft.com/en-us/magazine/jj863136.aspx

CSEP 524: Parallel Computation Winter 2013: Chamberlain 50

Discuss Boehm Paper Here

Performance Impacts of Locks

Fixing RRWW bugs with locks

Pthreads
    pthread_mutex_t totTimeMutex;
    pthread_mutex_init(&totTimeMutex, NULL);
    create tasks
    …
    pthread_mutex_lock(&totTimeMutex);
    totTime += myTime;
    pthread_mutex_unlock(&totTimeMutex);
    …
    join tasks
    pthread_mutex_destroy(&totTimeMutex);

Chapel
    var totTime$: sync real = 0.0;
    coforall tid in 0..#numTasks {
      …
      totTime$ += myTime;
      …
    }

What’s the performance problem with these codes as we increase the number of tasks?
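For reference, the Pthreads fragment above can be fleshed out into a complete routine. This is only a minimal sketch: the helper names `add_my_time` and `sum_with_mutex`, and the idea of passing each task's time in through an array, are assumptions made for illustration and do not come from the slides.

```c
#include <pthread.h>
#include <stdlib.h>

/* Shared accumulator and the mutex that guards it (names follow the slide). */
static double totTime;
static pthread_mutex_t totTimeMutex;

/* Task body (hypothetical): add this task's time to the shared total. */
static void *add_my_time(void *arg) {
    double myTime = *(const double *)arg;
    pthread_mutex_lock(&totTimeMutex);
    totTime += myTime;              /* critical section: one task at a time */
    pthread_mutex_unlock(&totTimeMutex);
    return NULL;
}

/* Hypothetical helper: start one task per entry of times[], wait for all of
   them, and return the total. Returns -1.0 if allocation or thread creation
   fails. */
double sum_with_mutex(const double *times, int numTasks) {
    if (numTasks <= 0) return 0.0;

    pthread_t *tasks = malloc((size_t)numTasks * sizeof *tasks);
    if (tasks == NULL) return -1.0;

    totTime = 0.0;
    pthread_mutex_init(&totTimeMutex, NULL);

    /* create tasks */
    int created = 0;
    for (; created < numTasks; created++) {
        if (pthread_create(&tasks[created], NULL, add_my_time,
                           (void *)&times[created]) != 0) {
            break;
        }
    }

    /* join tasks (only the ones that were actually created) */
    for (int i = 0; i < created; i++) {
        pthread_join(tasks[i], NULL);
    }

    pthread_mutex_destroy(&totTimeMutex);
    free(tasks);
    return created == numTasks ? totTime : -1.0;
}
```

Every task has to pass through the same lock, so the single `totTime` update is serialized no matter how many tasks are running.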
The Problem: totTime becomes a bottleneck

[Figure: Task 0 through Task 7 hold the values 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8; every task adds its value directly into totTime, which ends up as 3.6]

Depth: O(1)
Contention: O(#tasks)

Whether or not this is a problem depends on the architecture and parameters
•  but in practice it’s not scalable

Fix: Use a Reduction

[Figure, built up over several slides: the eight task values are combined pairwise in a binary tree. First level: 0.3, 0.7, 1.1, 1.5. Second level: 1.0, 2.6. Root: totTime = 3.6]

Depth: O(log2 #tasks)
Contention: O(1)

What if we used a tree with degree d?

What to do with the result?
1)  Leave it with one task

[Figure: the same tree with 3.6 shown next to every node and task ...]
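The reduction tree in the figures above can be sketched in C as follows. This is a sequential illustration under stated assumptions, not the course's implementation: the function name `tree_reduce` and its `degree` parameter are hypothetical, and the middle loop stands in for the work that separate tasks would do in parallel at each tree level. With degree d, the depth shrinks to roughly log_d(#tasks), while each combining node takes in up to d values instead of 2.

```c
#include <stddef.h>

/* Hypothetical helper: reduce vals[0..n-1] in place using a tree of the given
   degree. The total ends up in vals[0]. Returns the number of tree levels
   (the depth), or -1 if degree < 2. */
int tree_reduce(double *vals, size_t n, size_t degree) {
    if (degree < 2) return -1;

    int depth = 0;
    for (size_t stride = 1; stride < n; stride *= degree) {
        /* Each iteration of this loop works on its own disjoint slice of
           vals, so the iterations could run as separate parallel tasks. */
        for (size_t i = 0; i < n; i += stride * degree) {
            /* Node i absorbs the partial sums of up to degree-1 siblings. */
            for (size_t k = 1; k < degree && i + k * stride < n; k++) {
                vals[i] += vals[i + k * stride];
            }
        }
        depth++;
    }
    return depth;
}
```

Called with the eight task values from the figure and `degree` set to 2, the first level leaves the pair sums in `vals[0]`, `vals[2]`, `vals[4]`, and `vals[6]`, the second level combines those into `vals[0]` and `vals[4]`, and the third level leaves the total in `vals[0]`, for a depth of 3.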
