CS 5614: Spring 2017
(Big) Data-Management


This is the age of big data, with numerous scientific and commercial applications. This course covers techniques and systems for big data management and mining. We will start from foundational topics in Relational Database Systems and move on to newer systems, such as Hadoop, that are capable of handling massive datasets. We will also discuss specific data mining and machine learning algorithms designed for analyzing very large amounts of data.

Why do we do data analytics?

Course Information

Textbooks and Resources

The first one is required, and the second one is recommended (especially if you are not familiar with databases):


Handouts and Practice Problems


Note: Solutions will be posted on Piazza.

Schedule (tentative)

For lecture slides and readings, go here.
  1. Introduction
  2. Relational Database Systems
  3. Big Data Technologies (MapReduce and the new software stack)
  4. Streams
  5. Recommendation Systems
  6. Large Scale Machine Learning
  7. Graph Mining


We thank Amazon's AWS in Education grant program for generously providing support for Amazon Web Services.