Machine Learning

Instructor	Debswapna Bhattacharya (dbhattacharya@vt.edu)
Class meets	Tuesday and Thursday 3:30 pm - 4:45 pm at Pamplin Hall 30
Teaching Assistants	Arman Riasi (armanriasi@vt.edu), Ishtiaque Ahmed Khan (ishtiaqueahmedk@vt.edu)
Office Hours	Debswapna Bhattacharya: Tuesday and Thursday 1:00 pm - 2:00 pm at Torgersen 3120B
	Arman Riasi: Monday 2:00 pm - 3:00 pm and Wednesday 4:00 pm - 5:00 pm via Zoom https://virginiatech.zoom.us/my/arman324
	Ishtiaque Ahmed Khan: Tuesday and Thursday 11:15 am - 12:15 pm via Zoom https://virginiatech.zoom.us/j/82752141230?pwd=VKMsjhF0B8dj8sbVdftc0sBbN1z9aA.1
Staff Mailing List	cs-4824-ece-4424-f24-staff-g@vt.edu
Piazza	https://piazza.com/vt/fall2024/cs4824ece4424/home
Canvas	CS 4824: https://canvas.vt.edu/courses/195940
	ECE 4424: https://canvas.vt.edu/courses/196035

Description

Welcome to CS 4824/ECE 4424: Machine Learning! This is an exciting time to be studying Machine Learning, which has evolved into one of the most successful and widely applicable sets of techniques across a range of domains (vision, language, speech, biology, robotics), leading to many breakthroughs.
This course will expose students to a wide range of topics in Machine Learning covering their intuitions, mathematical foundations, analyses, and applications. Homework assignments include hands-on experiments with various learning algorithms, and a larger course project gives students a chance to dig deeper into an area of their choice.

Topics

Basics of Statistical Learning

Loss functions, MLE, MAP, Bayesian estimation, bias-variance tradeoff, overfitting, regularization, cross-validation

Supervised Learning

Decision Trees, Naïve Bayes, Logistic Regression, Linear Regression, Kernels and Kernel Regression, Support Vector Machines, Neural Networks

Unsupervised Learning

EM, Clustering

Graphical Models

Bayesian Networks, Hidden Markov Models

Deep Learning

Convolutional Neural Networks, Recurrent Neural Networks, Attention and Transformer Networks

Advanced Topics

Generative AI, Autoencoders, Generative Adversarial Networks, Diffusion Probabilistic Models

Prerequisites

Probability and Statistics

Distributions, densities, marginalization, moments, typical distributions.

Calculus and Linear Algebra

Matrix multiplication, eigenvalues, positive semi-definiteness, multivariate derivatives.

Algorithms

Dynamic programming, basic data structures, complexity.

Programming

This is a demanding class in terms of programming skills. HWs will involve a mix of Python and libraries. You are free to choose any programming language for the project.

Ability to deal with abstract mathematical concepts.

Grading

Homeworks: 60%
Midterm: 10%
Final: 20%
Pop Quiz: 10%

Each student's final letter grade for the course is determined from the grading breakdown above, using the ceiling of the final percentage of points earned (e.g., 89.2% rounds up to 90%). The grade ranges are as follows:

A: 90%-100%
B: 80%-89%
C: 70%-79%
D: 60%-69%
F: Below 60%
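
As a rough illustration of how the weights and the ceiling rule above combine, the following Python sketch computes a letter grade from component percentages. The function name, dictionary keys, and sample scores are hypothetical, and the rounding guard is an assumption; the course staff's actual computation may differ.

```python
import math

# Weights taken from the grading breakdown above (fractions of the course grade).
WEIGHTS = {"homeworks": 0.60, "midterm": 0.10, "final": 0.20, "pop_quiz": 0.10}

# Lower cutoffs for each letter grade, from the grade ranges above.
CUTOFFS = ((90, "A"), (80, "B"), (70, "C"), (60, "D"))


def letter_grade(scores):
    """Map component percentages (0-100) to a letter grade.

    `scores` is a hypothetical dict with the same keys as WEIGHTS.
    """
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    # Round first to guard against floating-point noise, then take the ceiling.
    rounded_up = math.ceil(round(total, 6))
    for cutoff, letter in CUTOFFS:
        if rounded_up >= cutoff:
            return letter
    return "F"


# Hypothetical student: the weighted total is 89.5, whose ceiling is 90.
example = {"homeworks": 88, "midterm": 92, "final": 90, "pop_quiz": 95}
print(letter_grade(example))  # prints "A"
```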

Textbooks
None required.

Optional reference books (freely available online):

Machine Learning: a Probabilistic Perspective, Kevin Murphy, MIT Press, 2012
Pattern Recognition and Machine Learning, Christopher Bishop, Springer, 2006
The Elements of Statistical Learning, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, Springer, 2009
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press, 2016
Deep Learning: Foundations and Concepts, Christopher Bishop, Hugh Bishop, Springer, 2024

Schedule

Note: This schedule is tentative and subject to change. All deliverables are due by 11:59 PM EST on the listed date.

Week / Day	Date	Topic	Lecture Notes	Readings and Handouts
1 / T	Aug 27	Introduction and Administrivia	[Slides]
1 / R	Aug 29	Function Approximation	[Slides] [Annotated slides]	Bishop: Ch 14.4
2 / T	Sep 3	Decision Trees	[Slides] [Annotated slides]
2 / R	Sep 5	Probability and Estimation	[Slides] [Annotated slides]	Bishop: Ch 1 thru 1.2.3; Bishop: Ch 2 thru 2.2; Pop Quiz 1
3 / T	Sep 10	MLE and MAP	[Slides] [Annotated slides]	HW 1 Out
3 / R	Sep 12	Naïve Bayes	[Slides] [Annotated slides]	Murphy Ch 3.5
4 / T	Sep 17	Gaussian Naïve Bayes	[Slides] [Annotated slides]
4 / R	Sep 19	Logistic Regression	[Slides] [Annotated slides]	HW 1 Due
5 / T	Sep 24	Gradient-based Optimization	[Slides] [Annotated slides]	HW 2 Out
5 / R	Sep 26	Generative vs. Discriminative Classifiers	[Slides] [Annotated slides]	Mitchell Ch 3
6 / T	Oct 1	Linear Regression	[Slides] [Annotated Slides]	HW 2 Due
6 / R	Oct 3	Perceptron	[Slides] [Annotated Slides]	Bishop: Ch 4.1.7
7 / T	Oct 8	Neural Networks I	[Slides] [Annotated Slides]	Bishop: Ch 5.1 thru 5.3.2
7 / R	Oct 10	Neural Networks II	[Slides] [Annotated Slides]	Project Proposal Due
Fall Break
8 / T	Oct 15	Kernels	[Slides] [Annotated Slides]	Bishop: Ch 6 Intro
8 / R	Oct 17	No Class. Football game. Go Hokies!
9 / T	Oct 22	Kernel Perceptron	[Slides] [Annotated Slides]	Bishop: Ch 6.1 thru 6.2; HW 3 Out; Pop Quiz 3
9 / R	Oct 24	Support Vector Machines	[Slides] [Annotated Slides]	Bishop: Ch 7.1
10 / T	Oct 29	Graphical models	[Slides] [Annotated Slides]	Bishop: Ch 8.1; An introduction to graphical models by Kevin P. Murphy
10 / R	Oct 31	Expectation Maximization	[Slides] [Annotated Slides]	HW 3 Due
11 / T	Nov 5	Clustering	[Slides] [Annotated Slides]	Bishop: Ch 9.1 thru 9.2
11 / R	Nov 7	Deep Neural Networks I	[Slides] [Annotated Slides]	Goodfellow et al.: Ch 6; Project Midway Progress Due
12 / T	Nov 12	Deep Neural Networks II	[Slides] [Annotated Slides]	Goodfellow et al.: Ch 7, 8
12 / R	Nov 14	Convolutional Neural Networks	[Slides] [Annotated Slides]	Goodfellow et al.: Ch 9; HW 4 Out
13 / T	Nov 19	Recurrent Neural Networks	[Slides] [Annotated Slides]	Goodfellow et al.: Ch 10
13 / R	Nov 21	Attention and Transformers	[Slides] [Annotated Slides]	Vaswani et al., Attention is All You Need, NeurIPS, 2017; HW 4 Due
Thanksgiving Break
14 / T	Dec 3	Project Final Report Review	N/A	Sample Answers for Project Report
14 / R	Dec 5	Autoencoders	[Slides] [Annotated Slides]	Goodfellow et al.: Ch 14; Pop Quiz 5
15 / T	Dec 10	GAN and Diffusion Conclusion	[Slides] [Annotated Slides]	Goodfellow et al.: Ch 20; Deep Learning: Foundations and Concepts by Bishop and Bishop: Ch 20; Project Final Report Due

Assignments

Homeworks
Students are expected to work individually on 4 HWs throughout the semester. HWs will involve hands-on implementation and analysis, covering various topics that complement and supplement the lecture topics. HWs will involve a mix of Python and libraries and are to be submitted electronically via Canvas.

Project
The course project is meant for students to (1) gain experience implementing machine learning models; and (2) try machine learning on problems that interest them. You are encouraged to try out interesting applications of machine learning in various domains such as vision, NLP, speech, computational biology, etc. The project must be done individually and during this semester (i.e., no double counting).

Take a look at some project ideas and feel free to use them as templates for planning, but you are not obligated to adhere to them.

The first deliverable (10% of course grade) is a project proposal that is due on October 10. The project proposal should identify the problem, outline your preliminary approach, and propose the metrics for evaluation. It should also discuss a proposed plan containing a breakdown of various tasks and important project milestones. These milestones should be a prediction for planning purposes, but you are not obligated to adhere to them precisely. Your proposal should list at least three recent, relevant papers you will read and understand as background. The project proposal must be written using the following guidelines:

standard 8.5" x 11" page size
11 point or higher font, except text that is part of an image
Times New Roman font for all text, Cambria Math font for equations, Symbol font for non-alphabetic characters (it is recommended that equations and symbols be inserted as an image)
1" margins on all sides, no text inside 1" margins (no header, footer, name, or page number)
No less than single-spacing (approximately 6 lines per inch)
Do not use line spacing options such as "exactly 11 point", that are less than single spaced

The project proposal must be between 2 and 3 pages long and submitted electronically via Canvas in PDF format only. The page limit includes all references, citations, charts, figures, and images.

Project Proposal Grading (50 points)

The course staff will follow the National Science Foundation (NSF)-style evaluation metrics to review and score your project proposal as Excellent (5 points), Very Good (4 points), Good (3 points), Fair (2 points), and Poor (1 point). Two reviews will be sought, each reviewing and scoring the proposals, and the (sum of points x 5) will be your final score for the project proposal.
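
The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the input validation are assumptions, not the course staff's actual tooling.

```python
def proposal_score(review_a, review_b):
    """Combine two reviewer ratings (1 = Poor ... 5 = Excellent) into a score out of 50.

    Hypothetical helper: implements (sum of points x 5) as described above.
    """
    for rating in (review_a, review_b):
        if rating not in range(1, 6):
            raise ValueError("each rating must be an integer from 1 to 5")
    return (review_a + review_b) * 5


# One reviewer rates the proposal Very Good (4), the other Excellent (5).
print(proposal_score(4, 5))  # prints 45
```

With two reviews of at most 5 points each, the maximum is (5 + 5) x 5 = 50 points, matching the 50-point total.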

The final deliverable (20% of course grade) is a project report that is due on December 10 (i.e., on the last day of classes). The final project report should describe the project outcomes in a self-contained manner. Your final project report must be between 5 and 6 pages, prepared with the CVPR template and structured like a paper from a computer vision conference, and submitted electronically via Canvas. Please use this template so we can fairly judge all student projects without worrying about altered font sizes, margins, etc. The submitted PDF can link to supplementary materials including but not limited to code, an open-access software package on GitHub, a project webpage, videos, and other supplementary material. The final PDF project report should completely address all of the points in the rubric described below.

Project Report Grading Rubric (100 points)

Note: We have adapted the following list for our rubric based on questions for evaluating research projects proposed by former DARPA director George H. Heilmeier and recently used by Dhruv Batra for teaching the Deep Learning course at Georgia Tech.

Introduction / Background / Motivation:

(10 points) What did you try to do? What problem did you try to solve? Articulate your objectives using absolutely no jargon.
(5 points) How is it done today, and what are the limits of current practice?
(5 points) Who cares? If you are successful, what difference will it make?

Approach:

(10 points) What did you do exactly? How did you solve the problem? Why did you think it would be successful? Is anything new in your approach?
(5 points) What problems did you anticipate? What problems did you encounter? Did the very first thing you tried work?

Experiments and Results:

(10 points) How did you measure success? What experiments were used? What were the results, both quantitative and qualitative? Did you succeed? Did you fail? Why?

Availability:

(5 points) Is your code available? Did you use an open-source license to release your code?
(10 points) How do you plan to disseminate your method? Are the findings available via freely accessible project website and/or GitHub?

Reproducibility:

(10 points) How can others reproduce your results? Are training, validation, and test data freely provided?
(5 points) Are model parameters fully reproducible?

In addition, 25 more points will be distributed based on:

(10 points) Appropriate use of figures / tables / visualizations. Are the ideas presented with appropriate illustration? Are the results presented clearly; are the important differences illustrated?
(5 points) Overall clarity. Is the manuscript self-contained? Can a peer who has also taken Machine Learning understand all of the points addressed above? Is sufficient detail provided?
(10 points) Finally, points will be distributed based on your understanding of how your project relates to Machine Learning. Here are some questions to think about:

What was the structure of your problem? How did the structure of your model reflect the structure of your problem?
What parts of your model had learned parameters (e.g., convolution layers) and what parts did not (e.g., post-processing classifier probabilities into decisions)?
What representations of input and output did the model expect? How was the data pre/post-processed?
What was the loss function?
Did the model overfit? How well did the approach generalize?
What hyperparameters did the model have? How were they chosen? How did they affect performance? What optimizer was used?
What Machine Learning framework did you use?
What existing code or models did you start with and what did those starting points provide?

An intermediate deliverable (not graded but mandatory submission) is a midway project progress check due on November 7. The midway project progress should be in the same format as the project proposal and discuss the progress made and any changes to the original plan. The midway project progress should also contain an updated breakdown of the tasks and the final project milestones. The final project report may not be graded if the midway project progress check is not submitted. If you are struggling to make progress in the project, this would be an ideal time to seek help.

Class participation and Pop Quiz
Students are strongly encouraged to attend all the lectures (exceptions are allowed due to medical reasons or emergencies) and expected to engage in the discussion during the lectures and participate in Q&A. Please inform the course staff via email if you cannot make it to the class. Students are also expected to be actively engaged in class-related discussion on Piazza so that other students may benefit from your questions and our answers. While no attendance will be taken, there will be in-class pop quizzes (10% of course grade) requiring your class presence and overall engagement in the classroom.

Note: Students' first point of contact is Piazza (so that other students may benefit from your questions and our answers). If you have a personal matter, create a private Piazza post or send an email to the course staff.

Policies

Late policy for deliverables
Late homework policy is as follows:

Full credit when due.
Half credit next 48 hours.
Zero credit after that.
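
A minimal Python sketch of this late policy follows. The function name is hypothetical, and treating a submission exactly at the deadline or exactly 48 hours later as still within each window is an assumption; exceptions for medical reasons or emergencies are not modeled.

```python
from datetime import datetime, timedelta


def late_credit_fraction(due, submitted):
    """Return the fraction of homework credit earned under the late policy.

    Hypothetical helper: boundary times are assumed to be inclusive.
    """
    if submitted <= due:
        return 1.0  # full credit when submitted by the deadline
    if submitted <= due + timedelta(hours=48):
        return 0.5  # half credit within the next 48 hours
    return 0.0      # zero credit after that


# Example with an 11:59 PM deadline and a submission the following afternoon.
due = datetime(2024, 9, 19, 23, 59)
print(late_credit_fraction(due, datetime(2024, 9, 20, 15, 0)))  # prints 0.5
```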

Avoid invoking penalties by starting early and seeking extra help. No penalties for medical reasons or emergencies.
Note that late submissions are NOT allowed (i.e., NOT graded) for the project proposal, the midway progress check, or the final report.

Regrading requests
Requests for regrading due to grading errors must be submitted to the course staff via email within one week of the release of grades.

Services for Students with Disabilities (SSD) accommodations
Any student who has been confirmed by the University as having special needs for learning must notify the instructor during the first week of classes. Such students are encouraged to work with The Office of Services for Students with Disabilities to help coordinate accessibility arrangements.

Academic integrity
The Undergraduate Honor Code pledge that each member of the university community agrees to abide by states: "As a Hokie, I will conduct myself with honor and integrity at all times. I will not lie, cheat, or steal, nor will I accept the actions of those who do."

Students enrolled in this course are responsible for abiding by the Honor Code. A student who has doubts about how the Honor Code applies to any assignment is responsible for obtaining specific guidance from the course instructor before submitting the assignment for evaluation. Students are strongly encouraged to consult their faculty members regarding the use of any outside materials as the misuse of these sources may constitute a violation of the Honor Code. Ignorance of the rules does not exclude any member of the University community from the requirements and expectations of the Honor Code. For additional information about the Honor Code, please visit: https://www.honorsystem.vt.edu/

All assignments submitted shall be considered “graded work” and all aspects of your coursework are covered by the Honor Code. All projects and homework assignments are to be completed individually in this course unless otherwise specified. All written work must be written without help from other sources or people, except for the course instructor, the course TAs, and Student Success Center tutors. It is a violation of the Honor Code in this course to receive help from any other source, including online tutoring sites (including but not limited to Chegg, CourseHero, or GroupMe), or generative AI tools (including but not limited to ChatGPT, GitHub Copilot, and Microsoft Copilot).

The Academic Integrity expectations for Hokies are the same in an online class as they are in an in-person class. Hokies are expected to meet the academic integrity standards at Virginia Tech at all times.

Commission of any of the following acts shall constitute academic misconduct. This listing is not, however, exclusive of other acts that may reasonably be said to constitute academic misconduct. Clarification is provided for each definition with some examples of prohibited behaviors in the Undergraduate Honor Code Manual:

Cheating: Cheating includes the intentional use of unauthorized materials, information, notes, study aids or other devices or materials in any academic exercise, or attempts thereof.
Plagiarism: Plagiarism includes the copying of the language, structure, programming, computer code, ideas, and/or thoughts of another and passing off the same as one's own original work, or attempts thereof.
Falsification: Falsification includes the statement of any untruth, either verbally or in writing, with respect to any element of one's academic work, or attempts thereof.
Fabrication: Fabrication includes making up data and results, and recording or reporting them, or submitting fabricated documents, or attempts thereof.
Multiple Submission: Multiple submission involves the submission for credit – without authorization from the instructor receiving the work – of substantial portions of any work (including oral reports) previously submitted for credit at any academic institution, or attempts thereof.
Complicity: Complicity includes intentionally helping another to engage in an act of academic misconduct, or attempts thereof.
Violation of University, College, Departmental, Program, Course, or Faculty Rules: The violation of any University, College, Departmental, Program, Course, or Faculty Rules relating to academic matters that may lead to an unfair academic advantage by the student violating the rule(s).

Note that all electronic work submitted for this course is archived and subjected to automatic plagiarism detection and cheating analysis.
If you have questions or are unclear about what constitutes academic misconduct on an assignment, please speak with the instructor. We take the Honor Code seriously in this course. The normal sanction we will recommend for a violation of the Honor Code is an F* sanction as your final course grade. The F represents failure in the course. The “*” identifies a student who has failed to uphold the values of academic integrity at Virginia Tech. A student who receives a sanction of F* as their final course grade shall have it documented on their transcript with the notation “FAILURE DUE TO ACADEMIC HONOR CODE VIOLATION.” You would be required to complete an education program administered by the Honor System in order to have the “*” and notation “FAILURE DUE TO ACADEMIC HONOR CODE VIOLATION” removed from your transcript. The “F” however would be permanently on your transcript.

Principles of community
Because the course will include in-class discussions, we will adhere to the Virginia Tech Principles of Community.

Student well-being support
Supporting the mental health and well-being of students in our class is of high priority to us and Virginia Tech. If you are feeling overwhelmed academically, having trouble functioning, or are worried about a friend, please reach out to any of the following offices:

Cook Counseling:
- 540-231-6557 to schedule an appointment and/or 24/7 crisis support
- ucc.vt.edu for more information
Dean of Students Office:
- 540-231-3787 for general advice
- 540-231-6411 for after-hours crisis
- dos.vt.edu for more information
Hokie Wellness:
- hokiewellness.vt.edu for more information about health and wellness workshops and consultations
Services for Students with Disabilities:
- 540-231-3788 or ssd.vt.edu for more information about accommodations and other disability-related supports

Student Success Center:
- The Student Success Center helps students develop the skills needed to accomplish their academic goals and become self-directed learners. Their free services include individual and group tutoring, peer academic coaching, a Seminar Series on Academic Success, and more. Students can book appointments through Navigate. For instructions and more information, please visit www.studentsuccess.vt.edu.

For a full listing of campus resources check out well-being.vt.edu.

Please also feel free to speak with the instructor. We will make an effort to work with you; we care about you.

Technical support
Technical: For technical support assistance regarding any problems with Canvas, Zoom, or e-mail, please contact 4Help. For technical support issues related to Web-CAT or CodeWorkout, send an e-mail request to webcat@vt.edu providing specific details of the problem or issue. For questions related to programming assignments or homework, ask questions on the class general discussion area on Canvas, where one of the instructors or TAs can provide answers.
Canvas privacy policy: http://www.canvaslms.com/policies/intl-privacy.