Carnegie Mellon Parallel Computing Notes on Lecture 6

▪ Thread blocks can be scheduled in any order by the system
  - The system assumes no dependencies between blocks
  - A lot like ISPC tasks, right?
▪ Threads in a block DO run concurrently
  - When a block begins execution, all of its threads are running concurrently (these semantics impose a scheduling constraint on the system)
  - A CUDA thread block is itself an SPMD program (like an ISPC gang of program instances)
  - Threads in a thread block are concurrent, cooperating "workers"
▪ CUDA implementation:
  - A Kepler GPU warp has performance characteristics akin to an ISPC gang of instances (but unlike an ISPC gang, the warp is not a CUDA programming model concept *)
  - All warps in a thread block are scheduled onto the same core, allowing for high-bandwidth, low-latency communication through shared memory variables
  - When all threads in a block complete, the block's resources (shared memory allocations, warp execution contexts) become available for the next block

* Exceptions to this statement include intra-warp bu...
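
The two properties above (independent blocks, cooperating threads within a block) can be illustrated with a minimal sketch of a per-block sum. This is not code from the lecture: the names `block_sum`, `sum_on_gpu`, and `THREADS_PER_BLOCK`, and the block size of 128, are assumptions chosen for illustration. Each block reduces its own slice in shared memory and writes one partial sum, so no block depends on any other and the blocks may run in any order.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <vector>

// Assumed block size for this sketch; the tree reduction needs a power of two.
constexpr int THREADS_PER_BLOCK = 128;

// Each block sums its own slice of `in` and writes a single partial sum.
// Blocks never read each other's data, so any scheduling order is valid.
__global__ void block_sum(const float* in, float* partial, int n) {
    __shared__ float scratch[THREADS_PER_BLOCK];  // visible to this block only

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    scratch[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    // Block-wide barrier: relies on all threads of the block running concurrently.
    __syncthreads();

    // Tree reduction: threads of the block cooperate through shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            scratch[threadIdx.x] += scratch[threadIdx.x + stride];
        __syncthreads();
    }

    if (threadIdx.x == 0)
        partial[blockIdx.x] = scratch[0];
}

// Hypothetical host helper: stores the sum of `data` in *out.
// Returns false if any CUDA call fails.
bool sum_on_gpu(const std::vector<float>& data, float* out) {
    assert(out != nullptr);
    int n = static_cast<int>(data.size());
    if (n == 0) { *out = 0.0f; return true; }
    int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;

    float* d_in = nullptr;
    float* d_partial = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMalloc(&d_partial, blocks * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }

    cudaError_t err = cudaMemcpy(d_in, data.data(), n * sizeof(float),
                                 cudaMemcpyHostToDevice);
    std::vector<float> partial(blocks);
    if (err == cudaSuccess) {
        block_sum<<<blocks, THREADS_PER_BLOCK>>>(d_in, d_partial, n);
        // This copy waits for the kernel and reports any launch or run error.
        err = cudaMemcpy(partial.data(), d_partial, blocks * sizeof(float),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_partial);
    if (err != cudaSuccess) return false;

    // Combine the independent per-block results on the host.
    float total = 0.0f;
    for (float p : partial) total += p;
    *out = total;
    return true;
}
```

With a CUDA-capable device, calling `sum_on_gpu` on a vector of 1000 ones launches 8 blocks and should yield 1000. The final combination happens on the host precisely because the programming model gives no ordering between blocks; combining on the device would need atomics or a second kernel launch.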
