Data Analytics II
CS/STAT 5526 Spring 2017



This course will provide an overview of a variety of advanced topics within data analytics. It will cover a breadth of techniques and concepts. It is the second in a sequence of data analytics courses and will build on mastery of the topics covered in the first course.

Class meets Wednesdays from 4:00 PM to 6:45 PM in Durham 261 and NVC R206.

The course homepage is or equivalently

We will share files from a Google Drive directory, accessible here. You must be logged into a Virginia Tech Google Apps account to access these files.

Topics and Goals

Having successfully completed this course, the student will be able to:


Prerequisite: students must have taken CS/STAT 5525 Data Analytics I.

Please speak with the instructor if you are concerned about your background. Note: If any student needs special accommodations because of any disabilities, please contact the instructor during the first week of classes.

Reading and Materials

This class will be taught using a combination of materials available on the web. We will not be using a textbook for this course.

Grading Breakdown

Based on the grading breakdown above, each student's final grade for the course will be determined by the final percentage of points earned. The grade ranges are as follows:

A   93.3%–100%
A-  90.0%–93.3%
B+  86.6%–90.0%
B   83.3%–86.6%
B-  80.0%–83.3%
C+  76.6%–80.0%
C   73.3%–76.6%
C-  70.0%–73.3%
D+  66.6%–70.0%
D   63.3%–66.6%
D-  60.0%–63.3%
F   0.0%–60.0%
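As a rough illustration of how the ranges above translate a final percentage into a letter grade, here is a minimal Python sketch. The names GRADE_FLOORS and letter_grade are hypothetical, not part of any course tooling. Because adjacent ranges share an endpoint (93.3% appears in both A and A-), the sketch assumes each lower bound is inclusive; the syllabus does not state how boundary cases are resolved.

```python
# Lower bounds copied from the grade ranges in the syllabus.
# ASSUMPTION: each lower bound is inclusive (e.g. exactly 93.3% counts as an A),
# since the listed ranges overlap at their endpoints.
GRADE_FLOORS = [
    (93.3, "A"), (90.0, "A-"),
    (86.6, "B+"), (83.3, "B"), (80.0, "B-"),
    (76.6, "C+"), (73.3, "C"), (70.0, "C-"),
    (66.6, "D+"), (63.3, "D"), (60.0, "D-"),
]


def letter_grade(percent: float) -> str:
    """Return the letter grade for a final percentage between 0 and 100."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    # Floors are sorted from highest to lowest, so the first match wins.
    for floor, letter in GRADE_FLOORS:
        if percent >= floor:
            return letter
    return "F"
```

For example, letter_grade(88.0) falls in the 86.6%–90.0% range and returns "B+".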

Format and Attendance

The class will center around discussion. We will work through the reading materials together. The goal of each class is to understand the ideas presented in the assigned reading as fully as possible. We will work through examples, work together to clarify points of confusion, and discuss the wider implications of the topics from the reading. To make sure this experience works for everyone, class attendance is mandatory. I will take attendance and penalize unexcused absences at my discretion.

Summaries and Questions

For each assigned reading, you will submit a writeup in your own words that contains two major components: a summary and at least two questions.

The summary should be 1–2 paragraphs describing the main ideas presented in the text. You should cover the takeaway points, not the fine details.

You will then write (at least) two questions. The first will be a clarification question. What idea in the text was unclear to you? What do you want to understand better? For example, you might write, "Why is Lemma 3, which is presented without proof, guaranteed to be true?" In theory, clarification questions should have answers that we can attempt to find together as a class.

The second question you will include is a discussion question. You might ask about research ideas, how the technique we studied can be applied, or why a mathematical concept is important. In theory, a discussion question may not have a true answer, but we can speculate and brainstorm about it during our discussion.

You should submit your summary and questions on Canvas as posts in the discussion thread for the class session. These threads are set so you cannot see other posts until you post your own summary and questions.

Class and Topic Schedule

The class schedule is available here.


For the class project, you will conduct research with the goal of producing findings worthy of publication at a conference. The project should be done in groups of 2–4 students and should feature a novel application of data analytics. You are welcome to combine this project with any other research you are working on, and you may include collaborators outside the class as long as you are doing a substantial proportion of the research yourself.

You will first write a 1-page proposal for your project, due before Spring Break. You are highly encouraged to begin brainstorming about topics as soon as possible. You will then write a 6–10 page paper on your findings, due at the end of the semester, reporting your contribution, background material, evaluation (experiments and/or analysis), and conclusions. Details on the proposal and paper are available at ./project_deliverables.html.

Academic Integrity

The tenets of the Virginia Tech Graduate Honor Code will be strictly enforced in this course, and all assignments shall be subject to the stipulations of the Graduate Honor Code. For more information on the Graduate Honor Code, please refer to the GHS Constitution at

This course will have a zero-tolerance policy regarding plagiarism or other forms of cheating. Your homework assignments must be your own work, and any external source of code, ideas, or language must be cited to give credit to the original source. I will not hesitate to report incidents of academic dishonesty to the graduate school.

Principles of Community

Because the course will include in-class discussions, we will adhere to Virginia Tech's Principles of Community. The first two principles are most relevant:

The remaining principles are also important and we will take them seriously as a class.

Disclaimer: This syllabus details the plans for the course, which are subject to change. I will make sure any changes are clearly announced, and they will always be intended for your benefit.

For visitors outside the course: You are welcome to use the course materials for educational purposes. Do not sell any of this content.