CS 6604: Fall 2013
Data Mining Large Networks
and Time-Series



For more details, see Slides 54-62 in here.

Late grading, questions and requests

Honor Code

This is a graduate-level class, so I expect students to want to learn rather than google for answers. The purpose of problem sets in this class is to help you think about the material, not just to produce the right answers. Because we often reuse problem set questions from previous years, and these are covered by papers and webpages, I expect students not to copy, refer to, or look at those solutions in preparing their answers.

That said, the VT honor code is in effect for every aspect of this class. You are expected to do your own work. No one may give you answers to homework. You are allowed to work on project assignments with the members of your group. In other words, students are encouraged to communicate about general principles of the course, but all assigned homework must be done on an individual basis. You may not exchange any code or solutions, either in pieces or in their entirety, by any electronic means or hard copy.


Since the project is the main work for this course, it has to be substantial. It can be done in groups of 2-3. The project can be:

The project cannot be only a survey. We will also follow the no 'double-dipping' policy.


The deliverables include:
  1. Project Proposal: 15%, Due Date: Oct 10.
  2. Project Milestone Report: 10%, Due Date: Nov 10.
  3. Final Report: 20%, Due Date: Dec. 2 or Dec. 4 (TBD).
  4. Final Presentation/Poster in class: TBD.
All the write-ups should be in the ACM SIG format, in either LaTeX (preferred) or Word.

Project Proposal

The proposal should contain a detailed survey of the related work (at least 6-8 papers, outside of the required class reading list) and identify the strengths and weaknesses of the papers and how the weaknesses may be addressed. You should think about how these papers are interrelated and, at the same time, different from each other. You should not just copy the abstract of any paper: that would be plagiarism.

The proposal should then focus on describing the proposed research directions and questions. How precisely do you plan to pursue them? What methods/data do you plan to use? For a useful guide to what a proposal should answer, see Heilmeier's Catechism. In addition to the survey, the proposal should contain at least some amount of each of the following types of content:
Your proposal should be self-contained. For example, don't just say: "We plan to implement John Doe's Foo-graph algorithm [Doe2001], and we will study its performance with our approach." Instead, you should briefly review the key ideas in the references, and describe clearly the alternatives that you will be examining.

The proposal should be 3-4 pages in the given format, with pictures if they seem useful (anything beyond 4 pages won't be read). Check the grammar and syntax: there will be a small penalty for each typo or grammar error, so please do not submit without a spell check. Include the names and email addresses of the group members. Submit a hard copy in class, and also mail the PDF to the instructor with subject header 'CS 6604: Project Proposal'. The group members should be cc'ed. Name the PDF like 'LASTNAME1-LASTNAME2-LASTNAME3-proposal.pdf'.

Project Milestone Report

Think of this as a draft of your final report but without your major results. We expect that you have completed 30% of the project. Provide a complete picture of your project even if certain key parts have not yet been implemented/solved. Include the parts of your project which have been completed so far, such as:

The milestone report should be 4-5 pages in the given format, with pictures if useful. Mail the PDF to the instructor by 5 pm, Nov. 10, with subject header 'CS 6604: Project Milestone'. The group members should be cc'ed. Name the PDF like 'LASTNAME1-LASTNAME2-LASTNAME3-milestone.pdf'. Also submit the hard copy in class on Monday, Nov. 11. Please keep your graded proposal (phase 1), and attach a copy of it to the hard copy of your milestone report.

Project Final Report

This should be a detailed description of what you did, your results, and what you have learned and/or concluded from your work.
  1. Write-up: A minimum of 5 pages and a hard maximum of 8 pages in the given format.
    • [2%] Introduction/Motivation.
    • [3%] Problem Definition.
    • [5%] Related work and Survey.
    • Proposed Method
      • [10%] Intuition: why should it be better than the state of the art?
      • [25%] Description: your approach, algorithms, models. Be as clear as you can (as otherwise we won't understand what you are trying to do).
    • Experiments/Results
      • [5%] List of questions your experiments are designed to answer, and a description of your testbed.
      • [30%] Details of your experiments, observations, findings. Make sure you also interpret and explain your observations.
    • [5%] Conclusion and Discussion. Feel free to mention any avenues of future work here.
  2. Software: [10%] packaging, documentation, and portability. The goal is to provide enough material, so that other people can use it and continue your work. Create a tar.gz file which contains:
    • A concise, short README.txt file, corresponding to the "user's manual". This file should describe the package in a few paragraphs, how to install it, how to use it, and how to run a demo.
    • A DOC directory, with your write-up and your presentation slides. All your code should be in a SRC directory.
    • Make sure that your package includes only the absolutely necessary set of files! Do NOT just make a 20MB core-dump of all your files and submit!
  3. Web-page: [5%] Create a webpage that contains the title of the project, the names of the members (preferably with portrait pictures), and links to a PDF version of your write-up, your software tarball, and your presentation slides. Also include a short summary of your project and its results on the webpage.
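
The software package described in item 2 could be assembled along the following lines. This is only a minimal sketch: the function name build_package, the example project folder name, and the choice to nest everything under a single top-level folder inside the archive are illustrative assumptions, not course requirements. It bundles only README.txt, DOC and SRC, so stray files in the project folder are left out.

```python
import tarfile
from pathlib import Path

# Items the guidelines ask for: a README.txt, a DOC directory, and a SRC directory.
REQUIRED_ITEMS = ["README.txt", "DOC", "SRC"]


def build_package(project_dir, archive_path):
    """Bundle README.txt, DOC/ and SRC/ from project_dir into a tar.gz file.

    build_package is a hypothetical helper; placing the contents under one
    top-level folder named after the project is an assumption of this sketch.
    """
    project_dir = Path(project_dir).resolve()

    # Fail early if any required item is absent.
    missing = [name for name in REQUIRED_ITEMS
               if not (project_dir / name).exists()]
    if missing:
        raise FileNotFoundError(f"missing required items: {missing}")

    with tarfile.open(str(archive_path), "w:gz") as tar:
        for name in REQUIRED_ITEMS:
            # Only the required items are added; directories are added
            # recursively, and anything else in project_dir is skipped.
            tar.add(str(project_dir / name),
                    arcname=f"{project_dir.name}/{name}")
    return archive_path
```

For example, build_package("myproject", "myproject.tar.gz") would write an archive containing myproject/README.txt, myproject/DOC and myproject/SRC, assuming a folder named myproject with those three items exists.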
HOW TO SUBMIT: Please note the times, as all are hard deadlines, with no extensions or slip-days allowed.

Project Final Presentation

The project presentations will be on Dec 4 and Dec 9, in class. The sign-up link has been posted on Piazza. Each team gets 15 minutes, and time limits will be strictly enforced. Plan for a maximum of 12 minutes for the presentation, and 3 minutes for questions and the transition to the next team. Practice your timing and delivery: giving a good talk is hard! Grading scheme:

All students are expected to attend these two lectures, and to be prepared to ask (tough!) questions of the other project groups.