CS 4984 & 5984 Accelerator-Based Parallel Computing, Spring 2009
Tuesday and Thursday 3:30-4:45pm at McBryde 110

CS 4984 Instructor:
Instructor: Wuchun Feng
Phone: (540) 231-1192
Email: feng@vt.edu
Office Hour: Tue Noon - 1:30pm MCB 122, and by appointment in KWII 2209

CS 5984 Instructor:
Instructor: Yong Cao
Phone: (540) 231-0415
Email: yongcao@vt.edu
Office Hour: TR 2:00-3:30pm McBryde 122, and by appointment in KWII 1124

Course Description:

Today, computing is moving rapidly toward parallel programming for chip multiprocessors (CMPs), from traditional multi-core architectures (e.g., Intel's Core Duo and AMD's dual-core Opteron) to hybrid multi-core (e.g., Cell Broadband Engine) to many-core (e.g., NVIDIA Tesla). Systems built on such parallel architectures include PCs, laptops, game consoles, mobile handsets, servers, and network routers, to name just a few. However, these systems have also made it very difficult for programmers to take full advantage of multiple processing cores, and the problem will only worsen as the number of cores per processor continues to increase. Most notably, today's many-core architectures from ATi and NVIDIA have 800 and 240 cores per graphics processing unit (GPU), respectively. Thus, the challenge is to develop application software that effectively uses the parallel processing cores to achieve efficiency and performance goals.

The course will consist of lectures early in the semester, homework assignments, programming projects, paper presentations, and a final project.


Upon completing this course, students should be able to

  1. Design application software for chip multiprocessors (CMPs), particularly traditional multi-core (e.g., Intel and AMD multi-core) and massively parallel many-core (e.g., NVIDIA and ATi graphics cards).
  2. Understand the fundamental differences in writing parallel code for traditional multi-core versus massively parallel many-core.
  3. Understand the importance of the memory and network subsystems in emerging CMP systems, e.g., reconfigurable multicore, Cell, and GPGPU.
  4. Understand how to transform serial code that is amenable to data parallelism into GPGPU-accelerated code.
  5. Learn about parallel programming principles, parallelism models, communication models, and resource limitations of CMPs.
  6. Explain the different layers of parallelism in a CMP.
  7. Understand the relationship between each of the above layers of abstraction, and more generally, the relationship between the CMP hardware and application software.

Course Work

Below is an estimate of how much each part of the course contributes to your final grade. We reserve the right to adjust these weights as necessary.

CS 4984

  • Participation 15%
  • Exams 15%
  • Programming Assignments 35%
  • Project 35%

CS 5984

  • Participation, Quizzes 15%
  • Programming Assignments 20%
  • Paper Presentation 15%
  • Project Proposal 15%
  • Project Presentation & Report 35%
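
As a rough illustration of how the weights above combine into a single percentage, here is a minimal Python sketch. The function name `weighted_total`, the dictionary keys, and the sample scores are hypothetical; the weights are the estimates listed above and may be adjusted.

```python
# Estimated weights copied from the lists above (subject to adjustment).
WEIGHTS_4984 = {
    "participation": 0.15,
    "exams": 0.15,
    "programming_assignments": 0.35,
    "project": 0.35,
}
WEIGHTS_5984 = {
    "participation_quizzes": 0.15,
    "programming_assignments": 0.20,
    "paper_presentation": 0.15,
    "project_proposal": 0.15,
    "project_presentation_report": 0.35,
}


def weighted_total(scores, weights):
    """Combine per-component scores (0-100) into one weighted percentage."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same components")
    return sum(scores[name] * weights[name] for name in weights)


# Hypothetical CS 4984 student; these scores are made up for illustration.
example_scores = {
    "participation": 100,
    "exams": 80,
    "programming_assignments": 90,
    "project": 85,
}
total = weighted_total(example_scores, WEIGHTS_4984)
print(f"Weighted total: {total:.2f}")
```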

Each CS 5984 student will be assigned to present two papers and lead the discussion at one class session. The paper list can be found on the Resource webpage. Each student must also write summaries of the papers presented by other students. Detailed requirements for paper presentations and written summaries can be found on the Requirement webpage.

Materials and References

There is no required textbook for this course. We use literature from conference and journal papers. You can use the following list as reference material:

  1. NVIDIA CUDA 2.0 Programming Guide, 2008.
  2. NVIDIA CUDA website, http://www.nvidia.com/object/cuda_home.html.
  3. UIUC Parallel Programming Course Website: http://courses.ece.uiuc.edu/ece498/al/.
  4. Astro GPU Workshop Videos: http://www.astrogpu.org/videos.php
  5. SC07 GPU Tutorials on GPGPU.org Website: http://www.gpgpu.org/sc2007/
  6. Other Courses on GPGPU at NVIDIA Website: http://www.nvidia.com/object/cuda_university_courses.html
  7. GPU Computing Course at SIGGRAPH Asia: http://sa08.idav.ucdavis.edu/


Without a Curve (Fractional percentages will be rounded to the nearest decimal place.)
90-91 A-
88-89 B+
82-87 B
80-81 B-
78-79 C+
72-77 C
70-71 C-
68-69 D+
62-67 D
60-61 D-
< 60 F

If a curve is applied, grades will be curved approximately as follows:

A 95% of the average score of the Top 10% of the class
B 85% of the average score of the Top 10% of the class
C 75% of the average score of the Top 10% of the class
… and so on.
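
To make the curve arithmetic concrete, here is a small Python sketch of one possible reading of the rule above. It assumes that "Top 10%" means the top tenth of the class rounded up to a whole number of students, and that each listed percentage gives the approximate minimum score for that letter. The function name `curve_cutoffs` and the sample scores are hypothetical, and only the three letters listed above are computed.

```python
# Fractions stated above; lower grades ("and so on") are not spelled out there.
CURVE_FRACTIONS = {"A": 0.95, "B": 0.85, "C": 0.75}


def curve_cutoffs(scores):
    """Return approximate letter cutoffs based on the top scorers' average."""
    if not scores:
        raise ValueError("scores must not be empty")
    # Assumption: top 10% = class size / 10, rounded up, at least one student.
    top_n = max(1, (len(scores) + 9) // 10)
    top_scores = sorted(scores, reverse=True)[:top_n]
    top_average = sum(top_scores) / top_n
    return {grade: frac * top_average for grade, frac in CURVE_FRACTIONS.items()}


# Hypothetical class of 20 students; the scores are made up for illustration.
class_scores = [96, 92, 88, 85, 83, 80, 79, 77, 75, 74,
                72, 70, 68, 66, 65, 63, 60, 58, 55, 50]
cutoffs = curve_cutoffs(class_scores)
for grade, cutoff in cutoffs.items():
    print(f"{grade}: about {cutoff:.1f}")
```

In this made-up class, the top 10% is two students, so each cutoff is a fraction of the average of the two highest scores.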


The Honor Code will be strictly enforced. It is a violation to represent joint work as your own or to let others use your work; always acknowledge any assistance you received in preparing work that bears your name. You are expected to work independently unless explicitly permitted to collaborate on a particular project. It is not a violation to discuss approaches to programs with others; however, it IS a violation to use code fragments in your program that have been written by others without acknowledging the source.