CS 5234 Advanced Parallel Computing
Massive Parallel Programming on GPUs
Spring 2013
Monday and Wednesday 2:30-3:45pm at McBryde 308

Instructor: Yong Cao
Phone: (540) 231-0415
Email: yongcao@vt.edu
Office Hours: By appointment.

Course Description:

On-chip parallelization has become a new frontier in high-performance computing research, driven by advances in IC design and by the end of "Moore's Law", which was long considered the free lunch that delivered performance gains without any code rewriting. Graphics Processing Units (GPUs) have attracted the attention of the computing community for their massive computational power, ubiquitous availability, and ease of programming. For example, NVIDIA's current-generation Kepler K20X GPU has 2688 computing cores and a mind-blowing 3.95 TeraFLOPs of computational capacity, and it powers the top-ranked supercomputer in the world (Titan).

The architecture of these massively parallel computing devices is quite different from the traditional multi-core design and shared-memory system. Therefore, parallel algorithm design and implementation on these devices need additional consideration. A carefully designed algorithm can run a hundred times faster or more on these devices than a standard sequential implementation on CPUs, but a poor design can actually decrease performance.

In this course, we will discuss the GPU architecture and examine the advantages and limitations of the device. We will learn how to program NVIDIA GPUs using the CUDA programming framework. This will be achieved through a series of hands-on programming assignments and a final research project on performance enhancement and analysis of some non-embarrassingly parallel algorithms.


  • Understand the massively parallel architecture of Graphics Processing Units (GPUs)
    • Features and Constraints
  • Program on GPUs
    • Programming APIs, tools, and techniques
    • Achieve high performance and scalability
  • Analyze parallel computing problems
    • Principles and paradigms for parallel algorithm design
    • Ability to apply them to real-life applications and algorithms

Course Work

Below is an estimate of how much each part of the course contributes to your final grade. We reserve the right to adjust these weights as necessary.

  • Programming Assignments (5-7) 60%
  • Project Presentation & Report 40%
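
As a rough illustration of how these weights combine, the sketch below computes a final percentage from hypothetical scores. The function name and the sample scores are made up for illustration, and equal weighting of the individual programming assignments is an assumption that the weights above do not spell out.

```python
def final_percentage(assignment_scores, project_score,
                     assignment_weight=0.60, project_weight=0.40):
    """Combine assignment and project scores (each 0-100) into one percentage.

    Assumes every programming assignment counts equally, which is an
    illustrative assumption rather than a stated course policy.
    """
    if not assignment_scores:
        raise ValueError("at least one assignment score is required")
    assignment_avg = sum(assignment_scores) / len(assignment_scores)
    return assignment_weight * assignment_avg + project_weight * project_score


# Hypothetical student: five assignment scores and one project score.
print(final_percentage([90, 80, 100, 70, 85], 88))
```

With these made-up scores the assignment average is 85, so the final percentage is about 86.2.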

Materials and References

Textbook for this course:

  1. Programming Massively Parallel Processors: A Hands-on Approach, Second Edition, by David Kirk and Wen-mei Hwu.

You can use the following list as reference materials:

  1. NVIDIA CUDA Programming Guide.
  2. NVIDIA CUDA website, http://www.nvidia.com/object/cuda_home.html
  3. UIUC Parallel Programming Course Website, http://courses.engr.illinois.edu/ece408/
  4. High performance computing on graphics processing units, http://hgpu.org/
  5. General-Purpose Computation on Graphics Hardware, http://gpgpu.org/


Without a Curve (Fractional percentages will be rounded to the nearest decimal place.)
90-91 A-
88-89 B+
82-87 B
80-81 B-
78-79 C+
72-77 C
70-71 C-
68-69 D+
62-67 D
60-61 D-
< 60 F

In the event that a curve is applied to grades, it will be curved approximately as follows:

A: 95% of the average score of the top 10% of the class
B: 85% of the average score of the top 10% of the class
C: 75% of the average score of the top 10% of the class
… and so on.
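
To make the curve concrete, here is a minimal sketch of how the approximate cutoffs could be computed from a list of class scores. The function name, the sample scores, and the way the top 10% is rounded for small classes are all assumptions for illustration; the actual curve, if one is applied, is at the instructor's discretion.

```python
def curved_cutoffs(scores, top_fraction=0.10):
    """Return approximate letter-grade cutoffs under the curve described above.

    Each cutoff is a fixed fraction (95%, 85%, 75%) of the mean score of
    the top `top_fraction` of the class. Rounding the size of that group
    and always including at least one student are assumptions.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    ranked = sorted(scores, reverse=True)
    top_count = max(1, round(len(ranked) * top_fraction))
    top_avg = sum(ranked[:top_count]) / top_count
    return {"A": 0.95 * top_avg, "B": 0.85 * top_avg, "C": 0.75 * top_avg}


# Hypothetical class of 20 students; the top 10% is the best two scores.
scores = [96, 94, 91, 89, 88, 85, 84, 82, 80, 79,
          77, 75, 74, 72, 70, 68, 65, 63, 60, 55]
cutoffs = curved_cutoffs(scores)
print(cutoffs)
```

In this made-up class the top two scores average 95, so the A, B, and C cutoffs land at roughly 90.25, 80.75, and 71.25.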

Management Policies

Final Grade Policy
A grade of "I" will only be given for documented medical emergencies or extreme unforeseen emergencies (no exceptions).

Attendance Policy
Attendance at all scheduled meetings is expected. Illnesses (with written verification from the health center or a doctor) and religious holidays shall be considered excused absences. Personal matters may be excused at the instructor's discretion.

Late Assignment Policy
Assignments will be downgraded 25% for each day late. No exceptions permitted.
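
The penalty arithmetic can be sketched as follows. This is one possible reading, not a statement of policy: it assumes the 25% is taken from the raw score for each day late and is not compounded, so work four or more days late earns nothing. The function name and sample values are hypothetical.

```python
def late_score(raw_score, days_late, penalty_per_day=0.25):
    """Apply the late penalty to a raw score.

    Assumes a flat (non-compounding) deduction of 25% of the raw score
    per day late, floored at zero. Neither detail is spelled out in the
    policy, so treat this as an illustration only.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    remaining = max(0.0, 1.0 - penalty_per_day * days_late)
    return raw_score * remaining


# Hypothetical assignment that earned 80 points before any penalty.
for days in range(5):
    print(days, late_score(80, days))
```

Under this reading, an 80-point submission would be worth 60 points one day late, 40 points two days late, and 0 points from the fourth day on.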

Honor Code Policy
“The Honor Code will be strictly enforced in this course. All assignments submitted shall be considered graded work, unless otherwise noted. All aspects of your coursework are covered by the Honor System. Any suspected violations of the Honor Code will be promptly reported to the Honor System. Honesty in your academic work will develop into professional integrity. The faculty and students will not tolerate any form of academic dishonesty.”

Any work that is not the student’s original work, or another’s work that the student has altered, must be submitted with a copy of the original work. In addition, the source of the work must be clearly cited. Failure to include a copy and proper citation of the original work, with an assignment that is not completely the work of you or your team, will result in a referral to the Honor System. All other policies and regulations (e.g., regarding "academic honesty and plagiarism" including that of on-line sources) as stated in the Graduate Bulletin apply in this course.