ISCS 2015

Lectures with hands-on tutorials will be given by Prof. P Sadayappan, Ohio State University, and Devi Sudheer Kumar, IBM Research, India.

Objective of the tutorial: The one-day tutorial will focus on parallel programming and performance optimization techniques on GPUs.

Time	Sessions	Topic
8:30 AM - 10:00 AM	Session I	Review of shared-memory parallel programming with OpenMP; Introducing GPU architecture; Different types of memory on GPU; Multidimensional thread space; Introduction to CUDA programming
10:00 AM - 10:30 AM		Tea Break
10:30 AM - 12:00 PM	Session II	1 hour 10 min (lecture) + 20 min (hands-on exercises). Scheduling and synchronization on GPUs; Fundamental factors affecting performance: coalesced global memory access, and warp occupancy to tolerate global memory latency; Contrast in loop optimizations for CPU versus GPU (loop forms in CUDA are the reverse of those in OpenMP: stride 1 vs. stride N); Hands-on exercises
12:00 PM - 1:00 PM		Lunch Break
1:00 PM - 2:30 PM	Session III	1 hour (lecture) + 30 min (hands-on exercises). GPU performance optimization: Reduction of global memory accesses via shared memory or caching; Thread coarsening; Choice of effective grid and thread-block size/shape; Illustrative examples of performance optimization; Hands-on exercises
2:30 PM - 3:00 PM		Tea Break
3:00 PM - 4:30 PM	Session IV	GPU performance optimization: Minimizing thread divergence; Avoiding bank conflicts; Illustrative examples of performance optimization; Hands-on exercises

Lecture outline:

Instructors:

Prof. P Sadayappan

Professor, Computer Science & Engineering

Ohio State University

Devi Sudheer Kumar

IBM Research, Delhi, India

Email for communication: iscsgpu@googlegroups.com