Spring 2011 Data and Information Ph.D.
Qualifying Examination
Exam Available January 8, 2011
Examining Faculty
- Edward Fox
- Chang-Tien Lu (Primary Contact)
- T. M. Murali
Philosophy of Examination
- Since students vary in their abilities regarding written and oral communication,
and since doctoral students are expected to have some skill in each medium,
students will explain their solutions both in writing and orally. Solutions
will be graded on the clarity achieved by these two modes of expression
taken together.
- Students are expected to have studied all works in the reading list. Students
are also expected to acquire any prerequisite or background knowledge required
to understand those works.
- Students are expected to understand those works at the level of a doctoral
student who has taken the equivalent of courses such as CS5604 Information Storage and Retrieval, CS5614 Database Management Systems, and CS5984 Introduction to Data Mining.
- Students are expected to be able to understand a real situation/context/problem
in the information/data area, to be able to synthesize/apply the findings
of multiple papers from the reading list to such problems, and to be able
to formulate an answer outlining how they would approach and solve that problem.
Process and Format
- The examination includes a take-home examination that is expected to be administered
at the beginning of 2011.
- At the beginning of the examination period, all students will receive a
document that contains three questions.
- By the end of the examination period, each student must turn in a written
solution to one of those questions, i.e., the student must choose one out
of three. Solutions are expected to be no longer than 10
pages (excluding references), in 10-point or larger type, using the IEEE two-column format.
- Also at this time, each student must turn in a PowerPoint presentation or
equivalent that will be used for an oral explanation of the written solution.
Oral explanations, lasting no longer than 30 minutes, will
be scheduled as soon after the end of the exam week as feasible, using VTEL
or equivalent as needed to ensure coverage by students and/or faculty in either
Blacksburg or N. Virginia.
- Written solutions might be expected to have the following approximate format
(although detailed guidelines will be provided during the exam):
- a motivation section making clear the context of the problem/situation
- a clear statement of the problem in terms of concepts and terminology
in the information/data area, that addresses the situation/context
- a review of related literature, drawn mostly from multiple relevant
works in the reading list
- a statement of how the problem can be approached
- a description of the approach to solve the problem
It is important that any assumptions made be clearly stated in the written
solution.
- Oral presentations must follow what is given in the previously turned-in
PowerPoint file or equivalent. They must be completed within a 30-minute period,
in which roughly 25 minutes are for presentation and 5 minutes for answering
questions posed by faculty examiners.
- Each solution will be graded by at least 2 faculty members. The area committee
will then assign each student a combined grade, based on all faculty input,
on a scale of 0-3, as called for by GPC policies.
Tentative Schedule
- 12/6 (Monday), 2010: Complete Reading List Available.
- 1/8 (Saturday), 2011: Written Examination Available.
- 1/21 (Friday) 5PM, 2011: Written Examination Due.
- 1/27 (Thursday) 4PM, 2011: PowerPoint Presentation File Due.
- 1/28 - 2/4 (Friday), 2011: Oral Examination.
- 2/14 (Monday), 2011: Exam Results due to GPC.
Oral Examination Schedule (NVC 351, BB KWII 1110)
- (1) 2/4 Friday 9:30 - 10:05AM: Eric Fouh Mbindi
- (2) 2/4 Friday 10:10 - 10:45AM: Yating Wang
- (3) 2/4 Friday 2:15 - 2:50PM: Sunshin Lee
- (4) 2/4 Friday 2:55 - 3:30PM: Sumit Shah
- (5) 2/4 Friday 3:40 - 4:15PM: Naren Sundaravaradan
- (6) 2/4 Friday 4:20 - 4:55PM: K S M Tozammel Hossain
Reading List (Available 12/6)
(Note: Some of the hyperlinks below lead to web pages maintained by the
respective publishers. You may or may not be able to download the articles directly
from these web pages - this depends on the host computer from which the access
is made. To access the articles, we recommend that you go through the VT-subscribed
ACM Digital Library or IEEE Xplore interface.)
- Andreas Paepcke, Chen-Chuan K. Chang, Terry Winograd, Héctor García-Molina, Interoperability for Digital Libraries Worldwide, Communications of the ACM, Volume 41 Issue 4, April 1998.
- Yves Petinot, C. Lee Giles, Vivek Bhatnagar, Pradeep B. Teregowda, Hui Han, Isaac Councill, Service Applications: A Service-oriented Architecture for Digital Libraries, Proceedings of the 2nd International Conference on Service Oriented Computing, November 2004. Publisher: ACM Press.
- Greg Janée, James Frew, Digital Libraries for Spatial Data: The ADEPT Digital Library Architecture, Proceedings of the 2nd ACM/IEEE-CS Joint Conference on Digital Libraries, July 2002. Publisher: ACM Press.
- Michael G. Christel, David B. Winkler, C. Roy Taylor, Multimedia Abstractions for a Digital Video Library, Proceedings of the Second ACM International Conference on Digital Libraries, July 1997. Publisher: ACM Press.
- Petros Maniatis, Mema Roussopoulos, T. J. Giuli, David S. H. Rosenthal, Mary Baker, The LOCKSS Peer-to-Peer Digital Preservation System, ACM Transactions on Computer Systems (TOCS), Volume 23, Issue 1, February 2005.
- Xifeng Yan, X. Jasmine Zhou, Jiawei Han, Mining Closed Relational Graphs with Connectivity Constraints, Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-05), pp. 324-333.
- Deepayan Chakrabarti, Christos Faloutsos, Graph Mining: Laws, Generators, and Algorithms, ACM Computing Surveys, Vol. 38, March 2006, Article 2.
- Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, Philip S. Yu, GraphScope: Parameter-free Mining of Large Time-evolving Graphs, Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-07), pp. 687-696.
- Apostolos Papadopoulos, Apostolos Lyritsis, Yannis Manolopoulos, SkyGraph: An Algorithm for Important Subgraph Discovery in Relational Graphs, Data Mining and Knowledge Discovery, Vol. 17, No. 1, pp. 57-76, August 2008.
- Venu Satuluri, Srinivasan Parthasarathy, Scalable Graph Clustering Using Stochastic Flows: Applications to Community Discovery, Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-09), pp. 737-746.
- Volker Gaede, Oliver Günther, Multidimensional Access Methods (PS), ACM Computing Surveys, Vol. 30, No. 2, June 1998. (Slide)
- Mohamed F. Mokbel, Thanaa M. Ghanem, and Walid G. Aref, Spatio-temporal Access Methods (PS), IEEE Data Engineering Bulletin, 26(2):40-49, June 2003. (Tutorial: Location-aware Query Processing and Optimization, MDM2007)
- Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, Hybrid Index Structures for Location-based Web Search, Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155-162, 2005.
- Steven Schockaert, Martine De Cock, Neighborhood Restrictions in Geographic IR, Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 167-174, 2007.
- Wei Chen, Yajun Wang, Siyu Yang, Efficient Influence Maximization in Social Networks, Proceedings of the 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 199-207, 2009.
- Yu-Ru Lin, Jimeng Sun, Paul Castro, Ravi Konuru, Hari Sundaram and Aisling Kelliher, MetaFac: Community Discovery via Relational Hypergraph Factorization, Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 527-535, 2009.
- Lu Liu, Jie Tang, Jiawei Han, Meng Jiang, and Shiqiang Yang, Mining Topic-level Influence in Heterogeneous Networks, Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 199-208, 2010.
- Yu Wang, Gao Cong, Guojie Song, Kunqing Xie, Community-based Greedy Algorithms for Mining Top-K Influential Nodes in Mobile Social Networks, Proceedings of the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 1039-1048, 2010.