SHMEM_hprcta_pdf - Bridging Parallel and Reconfigurable...

Bridging Parallel and Reconfigurable Computing with Multilevel PGAS and SHMEM+

Vikas Aggarwal, Alan D. George, Kishore Yalamanchalli, Changil Yoon, Herman Lam, Greg Stitt
HPRCTA 2009, 30 September 2009
NSF CHREC Center, ECE Department, University of Florida

Outline

- Introduction
  - Motivations
  - Background
- Approach
  - Multilevel PGAS model
  - SHMEM+
- Experimental Testbed and Results
  - Performance benchmarking
  - Case study: content-based image recognition (CBIR)
- Conclusions & Future Work

Motivations

- RC systems offer much superior performance
  - 10x to 1000x higher application speed, lower energy consumption
- Characteristic differences from traditional HPC systems
  - Multiple levels of memory hierarchy
  - Heterogeneous execution contexts
- Lack of integrated, system-wide, parallel-programming models
  - HLLs for RC do not address scalability/multi-node designs
  - Existing parallel models insufficient; fail to address needs of RC systems
  - Productivity for scalable, parallel, RC applications very low

Background

- Traditionally HPC applications developed using
  - Message-passing models, e.g. MPI, PVM, etc.
  - Shared-memory models, e.g. OpenMP, etc.
  - More recently, PGAS models, e.g. UPC, SHMEM, etc.
- Extend memory hierarchy to include high-level global memory layer, partitioned between multiple nodes
- PGAS has common goals & concepts
  - Requisite syntax and semantics to meet needs of coordination for reconfigurable HPC systems
  - SHMEM: Shared MEMory comm. library
- However, needs adaptation and extension for reconfigurable HPC systems
  - Introduce multilevel PGAS and SHMEM+

PGAS: Partitioned, Global Address Space

Background: SHMEM

[Figure: Shared and local variables in SHMEM]

Why program using SHMEM

- Based on SPMD; easier to program in than MPI (or PVM)
- Low latency, high bandwidth one-sided data transfers (puts and gets)
- Provides synchronization mechanisms
  - Barrier
  - Fence, quiet
- Provides efficient collective communication
  - Broadcast
  - Collection
  - Reduction

Background: SHMEM

Array copy example

    #include <stdio.h>
    #include <shmem.h>
    #include <intrinsics.h>

    int me, npes, i;
    int *source, *dest;
    main()
    {
        shmem_init();
        /* Get PE information */
        me = my_pe();
        source = shmalloc(4*8);
        dest = shmalloc(4*8);
        /* Initialize and send on PE 1 */
        if(me == 1) {
            for(i=0; i<8; i++) source[i] = i+1;
            /* put source data at PE1 to dest at PE0 */
            shmem_putmem(dest, source, 8*sizeof(dest[0]), 0);
        }
        /* Make sure the transfer is complete */
        shmem_barrier_all();
        ...
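The one-sided put in the array copy example can be illustrated without a SHMEM installation. The sketch below is only a single-process model, under stated assumptions: each PE's partition of the global address space is a row of an ordinary C array, and a symmetric object is identified by its offset, which is the same on every PE. All `sim_*` functions and `SIM_*` constants are hypothetical names invented for this illustration; they are not part of SHMEM or SHMEM+. A real put may complete asynchronously, which is why the slide code calls shmem_barrier_all() afterwards, whereas the memcpy here completes immediately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_NPES 2          /* hypothetical: number of simulated PEs */
#define SIM_HEAP_BYTES 256  /* hypothetical: symmetric heap bytes per PE */

/* One row per PE: each PE owns one partition of the "global" space. */
static unsigned char sim_heap[SIM_NPES][SIM_HEAP_BYTES];
static size_t sim_heap_top = 0;  /* shared by all PEs: allocation is symmetric */

/* Symmetric allocation: reserve the same offset on every PE. */
size_t sim_shmalloc(size_t nbytes) {
    size_t off = sim_heap_top;
    assert(off + nbytes <= SIM_HEAP_BYTES);
    sim_heap_top += nbytes;
    return off;
}

/* Address of a symmetric object inside one PE's partition. */
void *sim_addr(int pe, size_t off) {
    assert(pe >= 0 && pe < SIM_NPES && off < SIM_HEAP_BYTES);
    return &sim_heap[pe][off];
}

/* One-sided put: only the caller acts; the target PE does nothing. */
void sim_putmem(size_t dest_off, const void *src, size_t nbytes, int target_pe) {
    assert(dest_off + nbytes <= SIM_HEAP_BYTES);
    memcpy(sim_addr(target_pe, dest_off), src, nbytes);
}

/* Read element idx of an int array stored at offset off on a given PE. */
int sim_read_int(int pe, size_t off, size_t idx) {
    int v;
    memcpy(&v, sim_addr(pe, off + idx * sizeof(int)), sizeof v);
    return v;
}

/* Mirror of the slide example: PE 1 fills source with 1..8 and puts it
 * into dest on PE 0. Returns the symmetric offset of dest. */
size_t sim_array_copy(void) {
    int init[8];
    size_t source = sim_shmalloc(sizeof init);
    size_t dest = sim_shmalloc(sizeof init);

    for (int i = 0; i < 8; i++) init[i] = i + 1;
    memcpy(sim_addr(1, source), init, sizeof init);      /* PE 1 initializes source */
    sim_putmem(dest, sim_addr(1, source), sizeof init, 0); /* PE 1 -> PE 0 */
    return dest;
}
```

Calling sim_array_copy() from a main function and then reading the returned offset with sim_read_int(0, ...) shows the values on PE 0, while the same offset on PE 1 is left untouched. This is the property the slides attribute to puts: the data movement is driven entirely by the initiating PE.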