Spring 2004 Data and Information Ph.D.
Qualifying Examination
Examining Faculty
- Athman Bouguettaya
- Edward Fox
- Chang-Tien Lu
Philosophy of Examination
- Since students vary in their abilities regarding written and oral communication,
and since doctoral students are expected to have some skill in each mode,
students will explain their solutions both in writing and orally. Solutions
will be graded on their clarity across the two modes of expression taken
together.
- Students are expected to have studied all works in the reading list. Students
are also expected to acquire any prerequisite or background knowledge required
to understand the works in the reading list.
- Students are expected to understand those works at the level of a doctoral
student who has taken the equivalent of courses such as CS5604 and CS5614.
- Students are expected to be able to understand a real situation/context/problem
in the information/data area, to be able to synthesize/apply the findings
of multiple papers from the reading list to such problems, and to be able
to formulate an answer outlining how they would approach and solve that problem.
Process and Format
- The examination will be a take-home examination and is expected to be administered
at the beginning of 2004.
- At the beginning of the examination period, all students will receive a
document that contains three questions.
- By the end of the examination period, each student must turn in a written
solution to one of those questions, i.e., the student must choose one out
of two. It is expected that the solutions (to both questions, taken together)
will be no longer than about 15 double-spaced typed pages (excluding references)
in 11-point type or larger. Specific details about format and length will be provided
along with the questions.
- Also at this time, each student must turn in a PowerPoint presentation or
equivalent that will be used for an oral explanation of the written solution.
Oral explanations, lasting no longer than 30 minutes, will be scheduled as
soon after the end of the exam week as feasible, using VTEL or equivalent
as needed to ensure coverage by students and/or faculty in either Blacksburg
or N. Virginia.
- Written solutions might be expected to have the following approximate format
(although detailed guidelines will be provided during the exam):
  - a motivation section making clear the context of the problem/situation
  - a clear statement of the problem in terms of concepts and terminology
    in the information/data area, that addresses the situation/context
  - a review of related literature, drawn mostly from multiple relevant
    works in the reading list
  - a statement of how the problem can be approached
  - a description of the approach to solve the problem
It is important that any assumptions made be clearly stated in the written
solution.
- Oral presentations must follow what is given in the previously turned-in
PowerPoint file or equivalent. They must be completed within a 30 minute period,
in which roughly 20 minutes are for presentation and 10 minutes for answering
questions posed by faculty examiners.
- Each solution will be graded by at least 2 faculty members. The area committee
will then assign each student a combined grade, based on all faculty input,
on a scale of 0-3, as called for by GPC policies.
Presentation Tips
Oral Examination Schedule
- 2/25(W) 9:00 - 9:40AM (NVC 113, VT Dur 463): Choudhry M. Zaki Malik
- 2/25(W) 9:40 - 10:20AM (NVC 113, VT Dur 463): Qi Yu
- 2/25(W) 10:20 - 11:00AM (NVC 113, VT Dur 463): Xumin Liu
- 2/25(W) 12:00 - 12:40PM (NVC 113, VT Dur 463): Mohammad Salman Akram
- 2/26(Th) 2:00 - 2:40PM (NVC 113, VT Bur 123A): Xiaoyan Yu
- 2/26(Th) 2:40 - 3:20PM (NVC 113, VT Bur 123A): Bing Liu
Reading List
(Note: Some of the hyperlinks below lead to web pages maintained by the
respective publishers. You may or may not be able to download the articles directly
from these web pages - this depends on the host computer from which the access
is made. To access the articles, we recommend that you go through the VT-subscribed
ACM Digital Library or IEEE Xplore interface).
- A. Arasu, J. Cho, H. Garcia-Molina, A. Paepcke, and S. Raghavan, Searching
the Web, ACM Transactions on Internet Technology, Vol. 1, No. 1,
pages 2-42, Aug 2001.
- S.A. McIlraith, T.C. Son, and H. Zeng, Semantic
Web Services, IEEE Intelligent Systems, Vol. 16, No. 2, March-April
2001.
- J.G. Hayes, E. Peyrovian, S. Sarin, M.T. Schmidt, K.D. Swenson, and R. Weber,
Workflow Interoperability Standards for the Internet, IEEE Internet Computing,
Vol. 4, No. 3, pages 37-45, May-June 2000.
- A. Tsalgatidou and T. Pilioura, An Overview of Standards
and Related Technology in Web Services, Distributed and Parallel Databases,
Vol. 12, No. 2, pages 135-162, Sep-Nov 2002.
- D. Kossmann, The State of the Art in Distributed Query Processing,
ACM Computing Surveys,
Vol. 32, No. 4, pages 384-421, December 2000.
- A. Sheth and J. Larson, Federated
Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases,
ACM Computing Surveys, Vol. 22, No. 3, pages 183-236, Sep 1990.
- Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom,
Models and Issues in Data Stream Systems,
ACM Symposium on Principles of Database Systems (PODS), pages 1-16,
2002.
- S. Shekhar, P. Schrater, W. Raju, and W. Wu, Spatial
Contextual Classification and Prediction Models for Mining Geospatial Data,
IEEE Transactions on Multimedia, Vol. 4, No. 2, pages 174-188, 2002.
- Sudipto Guha, Adam Meyerson, Nina Mishra, Rajeev Motwani, Liadan O'Callaghan,
Clustering Data Streams: Theory
and Practice, IEEE Transactions on Knowledge and Data Engineering,
Vol. 15, No. 3, pages 515-528, May/June 2003.
- W.G. Teng, M.S. Chen, and P.S. Yu, A
Regression-Based Temporal Pattern Mining Scheme for Data Streams, Proceedings
of the 29th International Conference on Very Large Data Bases, pages 93-104,
2003.
- H. Wang, W. Fan, P.S. Yu, and J. Han, Mining
Concept-Drifting Data Streams Using Ensemble Classifiers, Proceedings
of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, pages 226-235, 2003.