CS 5504 Home Page


Class Syllabus [.pdf file]


Office hours:

                  Time                       Location               Contact
Cameron           Wed. 10:00am-12:00pm       KnowledgeWork II 2212  cameron at cs dot vt dot edu
Zhenwei Cao (TA)  Mon. & Thu. 4:00pm-6:00pm  McBryde 106            zhenwei at vt dot edu

Schedule

Week  Date  Topic                                      Text               Readings  Slides      Homework                                          Solutions
1     1/20  Introduction                                                            Lecture 1
      1/22  Cost & Performance                         Chp1                         Lecture 2   Hw1: 1.3, 1.4, 1.6, 1.8, 1.9, 1.12
2     1/27  Memory Hierarchy & Cache Performance       Chp5 & Appendix C            Lecture 3
      1/29  Cache Performance Optimization             Chp5 & Appendix C            Lecture 4   Hw2: 5.4, 5.5, 5.6, and questions for Appendix C
3     2/3   Cache Performance Optimization & DRAM      Chp5                         Lecture 5
      2/5   OS View of Memory                          Chp5                         Lecture 6   Hw3: 5.10, 5.18
4     2/10  ISA DataPath Review                        Appendix B                   Lecture 7
      2/12  Introduction to ILP & Pipeline Control                                  Lecture 8   Hw4: 2.2, 2.3, 2.11
5     2/17  Hazard Detection & Exception               Chp2 & Appendix A            Lecture 9
      2/19  Dynamic scheduling (Scoreboard)            Chp2 & Appendix A            Lecture 10  Hw5: 2.6, 2.7, 2.12
6     2/24  Tomasulo and branch prediction             Chp2                         Lecture 11
      2/26  Multithreading                             Chp4                         Lecture 12  Hw6
      2/27  5:30-7:30 Midterm Exam @ MCB 113

Homework solutions are available on the course Blackboard.

7     3/2   Midterm grades are available on the course Blackboard.
      3/3   Storage                                    Chp6                         Lecture 13
      3/5   Network                                    Appendix E                   Lecture 14  Project
8           Spring Break
9     3/17  Project
      3/19  Project
10    3/24  Topics: CMP                                                                         See the reading list below
      3/26  Topics: CMP and its interconnects
11    3/31  Topics: CMP and its interconnects
      4/2   Topics: accelerator based multiprocessing
12    4/7   Topics: accelerator based multiprocessing
      4/9   Topics: memory
13    4/14  Topics: memory
      4/16  Topics: memory
14    4/21  Topics: power
      4/23  Topics: power
15    4/28  Topics: thermal-aware computing
      4/30  Topics: thermal-aware computing
16    5/5                                                                                       Project Due

 


CS 5504 reading list

   Here are the forms: paper review form, presentation review form, paper review sample

  How to give a bad talk: a version of David Patterson's slides, modified by Rolf Riedi

 Chip multiprocessing

  1. P. Kongetira, K. Aingaran and K. Olukotun, Niagara: A 32-Way Multithreaded SPARC Processor, IEEE Micro, Vol. 25, No. 2, pages 21-29, Mar./Apr. 2005. [David]
  2. B. Sinharoy, R. N. Kalla, J. M. Tendler, R. J. Eickemeyer, J. B. Joyner, POWER5 system microarchitecture, IBM Journal of Research & Development, Vol. 49, No. 4/5, July/Sep. 2005. [Amerjyok]
  3. Pablo Abad, Valentin Puente, Pablo Prieto, and Jose Angel Gregorio, Rotary Router: An Efficient Architecture for CMP Interconnection Networks. The 34th International Symposium on Computer Architecture, 2007. [Chris K]
  4. Kim, J., Nicopoulos, C., Park, D., Das, R., Xie, Y., Narayanan, V., Yousif, M. S., and Das, C. R. 2007. A novel dimensionally-decomposed router for on-chip communication in 3D architectures. SIGARCH Comput. Archit. News 35, 2 (Jun. 2007). [Giurupoasad]
  5. Kumar, R., Zyuban, V., and Tullsen, D. M. 2005. Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads and Scaling. SIGARCH Comput. Archit. News 33, 2 (May. 2005), 408-419. [Ramamoorthi]

Accelerator based multiprocessing

  1. "Introduction to the Cell Multiprocessor," by J. Kahle, et al., IBM Journal of Research and Development, Vol. 49, No. 4/5, July/Sept. 2005, pp. 589-604. [John]
  2. Kapasi, U. J., Rixner, S., Dally, W. J., Khailany, B., Ahn, J. H., Mattson, P., and Owens, J. D. 2003. Programmable Stream Processors. Computer 36, 8 (Aug. 2003), 54-62. [Benjamin]
  3. Tarditi, D., Puri, S., and Oglesby, J. 2006. Accelerator: using data parallelism to program GPUs for general-purpose uses. In Proceedings of the 12th international Conference on Architectural Support For Programming Languages and Operating Systems (San Jose, California, USA, October 21 - 25, 2006). [Ao-Ping]
  4. Yang, X., Yan, X., Xing, Z., Deng, Y., Jiang, J., and Zhang, Y. 2007. A 64-bit stream processor architecture for scientific applications. In Proceedings of the 34th Annual international Symposium on Computer Architecture (San Diego, California, USA, June 09 - 13, 2007). ISCA '07. [Yang]

Memory

  1. Leverich, J., Arakida, H., Solomatnikov, A., Firoozshahian, A., Horowitz, M., and Kozyrakis, C. 2007. Comparing memory systems for chip multiprocessors. In Proceedings of the 34th Annual international Symposium on Computer Architecture (San Diego, California, USA, June 09 - 13, 2007). ISCA '07. [Peng]
  2. Muralimanohar, N. and Balasubramonian, R. 2007. Interconnect design considerations for large NUCA caches. In Proceedings of the 34th Annual international Symposium on Computer Architecture (San Diego, California, USA, June 09 - 13, 2007). ISCA '07. [Chun-Yi]
  3. Li, F., Nicopoulos, C., Richardson, T., Xie, Y., Narayanan, V., and Kandemir, M. 2006. Design and Management of 3D Chip Multiprocessors Using Network-in-Memory. In Proceedings of the 33rd Annual international Symposium on Computer Architecture (June 17 - 21, 2006). [Karl]
  4. Ali-Reza Adl-Tabatabai, Christos Kozyrakis, and Bratin Eswaran Saha (Dec 2006). Unlocking Concurrency: Multicore Programming with Transactional Memory. In: ACM Queue, 4(10):24-33. [Chris P]
  5. Da Wang, Yuanjiang Xie, Yu Hu, Huawei Li, and Xiaowei Li, "Hierarchical fault tolerance memory architecture with 3-dimension interconnect," TENCON 2007 - 2007 IEEE Region 10 Conference, pp. 1-4, Oct. 30 - Nov. 2, 2007. [Jacob]

Power

  1. T. Mudge, "Power: A First Class Design Constraint for Future Architectures", High Performance Computer Conference, Dec. 2000. [Jinsik]
  2. Diniz, B., Guedes, D., Meira, W., and Bianchini, R. 2007. Limiting the power consumption of main memory. In Proceedings of the 34th Annual international Symposium on Computer Architecture (San Diego, California, USA, June 09 - 13, 2007). ISCA '07. [David]
  3. Wonyoung Kim, Meeta Gupta, Gu-Yeon Wei, and David Brooks, System Level Analysis of Fast, Per-Core DVFS using On-Chip Switching Regulators, HPCA 2008. [Amerjyok]
  4. Ranganathan, P., Leech, P., Irwin, D., and Chase, J. 2006. Ensemble-level Power Management for Dense Blade Servers. SIGARCH Comput. Archit. News 34, 2 (May. 2006) [Chris K]
  5. Xiaorui Wang, Ming Chen, Cluster-level Feedback Power Control for Performance Optimization, HPCA 2008. [Giurupoasad]

Thermal

  1. Mesa-Martinez, F. J., Nayfach-Battilana, J., and Renau, J. 2007. Power model validation through thermal measurements. In Proceedings of the 34th Annual international Symposium on Computer Architecture (San Diego, California, USA, June 09 - 13, 2007). ISCA '07. [Ramamoorthi]
  2. Lin, J., Zheng, H., Zhu, Z., David, H., and Zhang, Z. 2007. Thermal modeling and management of DRAM memory systems. In Proceedings of the 34th Annual international Symposium on Computer Architecture (San Diego, California, USA, June 09 - 13, 2007). ISCA '07. [John]
  3. Donald, J. and Martonosi, M. 2006. Techniques for Multicore Thermal Management: Classification and New Exploration. SIGARCH Comput. Archit. News 34, 2 (May. 2006), 78-88. [Benjamin]
  4. Luiz Ramos, Ricardo Bianchini, C-Oracle: Predictive Thermal Management for Data Centers, HPCA 2008. [Ao-Ping]

CS 5504 Help

C references/help
Unix references/help
UNIX text editor help
Vi refs [vi primer #1] [vi primer #2] [vi primer #3] [vi primer #4] [vi faqs]
Emacs refs [emacs primer #1] [emacs primer #2] [emacs primer #3] [emacs manual]
Pico refs [pico primer]
Religious discussions [holy wars] [neutral] [pro emacs] [pro vi]
About the Unix time command [time]

CS 5504 Supplemental Readings

Moore's Law Articles
Spectrum 2000 article
Forget Moore's Law
Intel Battles Moore's Law
Memory
Hitting the memory wall: Implications of the obvious, Wulf and McKee, Computer Architecture News, 23(1), 1995.
The cache performance and optimizations of blocked algorithms, M. S. Lam, E. E. Rothberg, and M. E. Wolf, ASPLOS IV, April 1991.
What Every Programmer Should Know About Memory, Ulrich Drepper, Red Hat Inc., November 21, 2007.
LMBench References
Optimizing Application Performance: A Case Study Using LMBench, M.T. Maxwell and K.W. Cameron, ACM Crossroads Student Magazine, 8(5), September 2002.
LMBench Website
Power Benchmark
SPEC launches standardized energy efficiency benchmark
AMD Beats Intel in Quad-Core Server Power Efficiency
Beware of rigged CPU efficiency study
IPC vs CPI
http://findarticles.com/p/articles/mi_qa3751/is_199705/ai_n8775087