Virginia Tech CS Ph.D. Qualifier Exam

Spring 2026 Data/ML/AI Qualifier

This page is under construction and subject to change.


Committee

  • Chris Thomas (Chair)
  • Tu Vu
  • Pinar Yanardag
  • Anuj Karpatne

Instructions

  • At the beginning of the examination period, all students who do not receive a waiver will be added to a Canvas site that contains an assignment with instructions.
  • By the end of the examination period, each student must submit on Canvas a written solution and a recorded presentation addressing the exam questions. Do not e-mail your solutions.
  • Each submission will be graded by at least two faculty members. The area committee will then assign each student a combined grade, based on all faculty input, on a scale of Pass/Fail/Waive, as called for by GPC policies.

Early Withdrawal Policy

A student registered for the PhD qualifier exam may withdraw at any time before the early withdrawal deadline, which is 2/5/2026. After this date, withdrawal is prohibited. Students with questions about this policy should contact the exam chair directly.


Academic Integrity

Discussions among students about the papers identified for the exam are acceptable up until the date the exam questions are released. Once the exam questions are released, we expect all such discussions to cease, as students are required to work entirely on their own to answer the qualifier questions. This examination is conducted under the University’s Graduate Honor System. Students are encouraged to draw from papers other than those listed in the exam to the extent that this strengthens their arguments. However, the answers submitted must represent the sole and complete work of the student submitting the answers. Material substantially derived from other works, whether published in print or found on the web, must be explicitly and fully cited. Note that your grade will be more strongly influenced by the arguments you make than by the arguments you quote or cite.


Exam Schedule

  • 12/26/2025: Qualifier registration opens (department-wide)
  • 1/12/2026: Students register for the qualifier exam (last day to register)
  • 1/27/2026: Qualifier waiver decisions
  • 1/27/2026: Release of reading lists
  • 2/3/2026: Release of exam questions
  • 3/3/2026: Students submit the written solutions and oral recordings on Canvas
  • 3/16/2026: Qualifier result decisions
  • 3/27/2026: Results released to students

Reading Lists

The reading lists below cover various topics in the area of data and information. You may choose any one of these lists for your exam. You are expected to significantly expand on your selected list while preparing your written solution. You are also welcome to create your own reading list on a topic relevant to data and information that is not listed here, but that reading list must be approved in writing by your research advisor by 2/1/2026. When submitting your solution, you should attach the e-mail from your advisor approving your reading list as a PDF.

The reading list and qualifying exam topic are not necessarily intended to be your dissertation topic, although you are welcome to make the two overlap if desired. Rather, you will be expected to reason about, write about, conduct a literature search on, and present this topic to demonstrate your ability to conduct doctoral research.

Multimodal Machine Learning

  1. Radford, A., Kim, J. W., Hallacy, C., et al. (2021). Learning Transferable Visual Models From Natural Language Supervision. ICML 2021 (PMLR 139). arXiv:2103.00020. [PMLR] [PMLR PDF] [arXiv] [arXiv PDF]

  2. Li, J., Selvaraju, R. R., Gotmare, A. D., Joty, S., Xiong, C., & Hoi, S. C. H. (2021). Align before Fuse: Vision and Language Representation Learning with Momentum Distillation. NeurIPS 2021. arXiv:2107.07651. [NeurIPS] [NeurIPS PDF] [arXiv] [arXiv PDF]

  3. Peng, X., Wei, Y., Deng, A., Wang, D., & Hu, D. (2022). Balanced Multimodal Learning via On-the-Fly Gradient Modulation. CVPR 2022. arXiv:2203.15332. [CVF] [CVF PDF] [arXiv] [arXiv PDF]

  4. Khattak, M. U., Rasheed, H., Maaz, M., Khan, S., & Khan, F. S. (2023). MaPLe: Multi-Modal Prompt Learning. CVPR 2023. arXiv:2210.03117. [CVF] [CVF PDF] [arXiv] [arXiv PDF]

  5. Li, J., Li, D., Savarese, S., & Hoi, S. (2023). BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. ICML 2023 (PMLR 202). arXiv:2301.12597. [PMLR] [PMLR PDF] [arXiv] [arXiv PDF]

  6. Zhai, X., Mustafa, B., Kolesnikov, A., & Beyer, L. (2023). Sigmoid Loss for Language Image Pre-Training. ICCV 2023. arXiv:2303.15343. [CVF] [CVF PDF] [arXiv] [arXiv PDF]

  7. Lu, J., Clark, C., Lee, S., et al. (2024). Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action. CVPR 2024. arXiv:2312.17172. [CVF] [CVF PDF] [arXiv] [arXiv PDF]


Computer Vision

  1. Edstedt, J., Sun, Q., Bökman, G., Wadenbäck, M., & Felsberg, M. (2024). RoMa: Robust Dense Feature Matching. CVPR 2024. arXiv:2305.15404. [CVF] [CVF PDF] [arXiv] [arXiv PDF]

  2. Wang, S., Leroy, V., Cabon, Y., Chidlovskii, B., & Revaud, J. (2024). DUSt3R: Geometric 3D Vision Made Easy. CVPR 2024. arXiv:2312.14132. [CVF] [CVF PDF] [arXiv] [arXiv PDF]

  3. Cabon, Y., Stoffl, L., Antsfeld, L., Csurka, G., Chidlovskii, B., Revaud, J., & Leroy, V. (2025). MUSt3R: Multi-view Network for Stereo 3D Reconstruction. CVPR 2025. arXiv:2503.01661. [CVF] [CVF PDF] [arXiv] [DOI]

  4. Barron, J. T., Mildenhall, B., Verbin, D., Srinivasan, P. P., & Hedman, P. (2023). Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields. ICCV 2023. arXiv:2304.06706. [CVF] [CVF PDF] [arXiv] [arXiv PDF]

  5. Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Transactions on Graphics (TOG). arXiv:2308.04079. [arXiv] [arXiv PDF]

  6. Wang, H., Wang, J., & Agapito, L. (2023). Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM. CVPR 2023. arXiv:2304.14377. [CVF] [CVF PDF] [arXiv] [arXiv PDF]

  7. Matsuki, H., Murai, R., Kelly, P. H. J., & Davison, A. J. (2024). Gaussian Splatting SLAM. CVPR 2024. arXiv:2312.06741. [CVF] [CVF PDF] [arXiv] [arXiv PDF]


Large Language Models & NLP

  1. Zhang, A. L., Kraska, T., & Khattab, O. (2025). Recursive Language Models. arXiv:2512.24601. [arXiv] [PDF]
  2. Tandon, A., Dalal, K., Li, X., et al. (2025). End-to-End Test-Time Training for Long Context. arXiv:2512.23675. [arXiv] [PDF]
  3. DeepSeek-AI, Guo, D., Yang, D., et al. (2025). DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arXiv:2501.12948. [arXiv] [PDF]
  4. Muennighoff, N., Yang, Z., Shi, W., et al. (2025). s1: Simple test-time scaling. arXiv:2501.19393. [arXiv] [PDF]
  5. Betley, J., Tan, D., Warncke, N., et al. (2025). Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs. arXiv:2502.17424. [arXiv] [PDF]
  6. Ouyang, S., Yan, J., Hsu, I.-H., et al. (2025). ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory. arXiv:2509.25140. [arXiv] [PDF]

Generative Models

  1. Chefer, H., Alaluf, Y., Vinker, Y., Wolf, L., & Cohen-Or, D. (2023). Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models. arXiv:2301.13826. [arXiv] [PDF]
  2. Po, R., Yang, G., Aberman, K., & Wetzstein, G. (2023). Orthogonal Adaptation for Modular Customization of Diffusion Models. arXiv:2312.02432. [arXiv] [PDF]
  3. Hertz, A., Mokady, R., Tenenbaum, J., Aberman, K., Pritch, Y., & Cohen-Or, D. (2022). Prompt-to-Prompt Image Editing with Cross Attention Control. arXiv:2208.01626. [arXiv] [PDF]
  4. Huang, X., Li, Z., He, G., Zhou, M., & Shechtman, E. (2025). Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion. arXiv:2506.08009. [arXiv] [PDF]
  5. Qu, L., Zhang, H., Liu, Y., et al. (2024). TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation. arXiv:2412.03069. [arXiv] [PDF]
  6. Assran, M., Duval, Q., Misra, I., et al. (2023). Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture. arXiv:2301.08243. [arXiv] [PDF]

Knowledge-Guided Machine Learning

  1. Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707. [DOI]
  2. Li, Z., Kovachki, N., Azizzadenesheli, K., et al. (2020). Fourier Neural Operator for Parametric Partial Differential Equations. arXiv:2010.08895. [arXiv] [PDF]
  3. Lu, L., Jin, P., & Karniadakis, G. E. (2019). DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv:1910.03193. [arXiv] [PDF]
  4. Huang, J., Yang, G., Wang, Z., & Park, J. J. (2024). DiffusionPDE: Generative PDE-Solving Under Partial Observation. NeurIPS 2024. arXiv:2406.17763. [arXiv] [PDF]
  5. Herde, M., Raonić, B., Rohner, T., et al. (2024). Poseidon: Efficient Foundation Models for PDEs. arXiv:2405.19101. [arXiv] [PDF]
  6. Elhamod, M., Khurana, M., Manogaran, H. B., et al. (2023). Discovering Novel Biological Traits From Images Using Phylogeny-Guided Neural Networks. KDD 2023. arXiv:2306.03228. [arXiv] [PDF]
  7. Karpatne, A., Jia, X., & Kumar, V. (2024). Knowledge-guided Machine Learning: Current Trends and Future Prospects. arXiv:2403.15989. [arXiv] [PDF]

Grading Scale

The exam is graded on a scale of Pass/Fail/Waive, as called for by GPC policies.


Use of Generative AI Tools

Students may use generative AI tools in a limited way as a writing aid for their qualifier submission, consistent with common research-publication practices. Specifically, you may use such tools for spelling/grammar fixes, copyediting for clarity and tone, and rewording of text, provided that the intellectual content, such as problem understanding, technical choices, assumptions, arguments, interpretations of the literature, and conclusions, remains entirely your own work. Any factual claims, technical statements, or ideas drawn from external sources must still be cited in the usual way. Using generative AI tools does not remove or reduce your obligation to provide complete and accurate citations, and you remain fully responsible for the correctness and originality of everything you submit, as if you had written it without such tools.

Generative AI tools may not be used to generate or substantially shape the substantive solution itself, including (but not limited to) producing proposed methods or experiments, selecting or justifying modeling/design decisions, drafting technical arguments, synthesizing related work in lieu of your own reading, summarizing papers you have not personally read, creating “proofs of work,” or paraphrasing others’ writing to avoid citation. To ensure transparency, each submission must include a brief disclosure stating whether AI tools were used and, if so, which tool(s) and for what limited purpose. Students should be prepared to explain and defend their work if asked, and the committee may request clarification or supporting materials consistent with the Graduate Honor System. Misuse of AI or failure to disclose permitted use will be treated as an academic integrity violation and may result in a Fail grade for the qualifier exam and further action under university policy.