Raquib Bin Yousuf, PhD — AI/ML Researcher & Applied ML Engineer

Latest

News

2026

Defended the PhD Dissertation

Successfully defended my PhD dissertation, titled Improving LLM Reasoning and Retrieval for Structured and Complex Information Spaces, at Virginia Tech. Many thanks to my advisor, Dr. Naren Ramakrishnan, and the committee members: Dr. Chang-Tien Lu, Dr. Chris North, Dr. Xuan Wang, and Dr. Sathappan Muthiah. Read my dissertation here.

2026

Paul E. Torgersen Graduate Student Research Excellence Award finalist

Selected as a PhD finalist for Virginia Tech’s Paul E. Torgersen Graduate Student Research Excellence Award. See the awardees list for more information.

2026

Metadata-aware RAG work accepted at ECIR 2026

Utilizing Metadata for Better Retrieval-Augmented Generation was accepted to the 48th European Conference on Information Retrieval. Read the paper here.

2025

Product provenance verification paper accepted at AAAI 2026

Research on data valuation for product provenance verification was accepted to the AAAI Conference on Artificial Intelligence. Read the paper here.

2024

Best Paper at IEEE BigData 2024

LLM Augmentations to Support Analytical Reasoning over Multiple Documents received the Best Paper Award.

Impact Highlights

70% average relative improvement over text-only retrieval baselines using metadata-aware dual-encoder methods for RAG.
86% schema-alignment reliability in a human-in-the-loop agentic data analysis system used across hundreds of newsroom sessions.
25% classification improvement and 50% generation improvement from memory-augmented LLM architectures for multi-document reasoning.
1–6% of advertised context capacity was enough to reveal memory-drift onset in a graph-based long-context LLM benchmark.
70% reduction in synthetic tabular data rule violations through permutation-aware generation and uncertainty-guided fine-tuning methods.
11% improvement over prior scientific information extraction baselines and 26% higher accuracy in salient task and method extraction from scholarly documents.
59 wood products assessed, 260+ tons of allegedly illegal timber identified, and 9+ enforcement investigations supported through ML-assisted provenance workflows.

Research & Technical Expertise

LLMs & Generative AI: Large Language Models, Prompt Engineering, Retrieval-Augmented Generation, Model Context Protocol (MCP), Long-context Reasoning, LLM Evaluation & Quality Assessment, Reward Modeling, Synthetic Data Generation, Quantization, Model Serving & Deployment.
Model Adaptation & Learning: Fine-tuning, SFT, RLHF, PEFT, LoRA/QLoRA, Representation Learning, Recommender Systems, Supervised Learning, Feature Engineering, Explainability, Graph-based Modeling, Transfer Learning, Data Valuation, Uncertainty-aware Learning.
Systems & Applications: Search and Retrieval, Embeddings & Vector Databases, Information Extraction, Structured Data Reasoning, Forecasting, Spatiotemporal Modeling, Data Engineering, Evaluation Pipelines, Human-in-the-loop Systems, Decision-support Workflows.
Frameworks & Tools: PyTorch, Transformers, PEFT, TRL, LangChain, OpenAI, Claude, Gemini, Elasticsearch, FastAPI, AWS SageMaker, AWS EC2, GCP Compute Engine, Azure, scikit-learn, spaCy, NetworkX, Pandas, NumPy, Streamlit, Docker, Git.
Programming: Python, Java, C++, MATLAB, R, SQL/NoSQL.

Selected Projects

Selected work highlighting system design, evaluation, deployment context, and measurable outcomes.

RAGMate — Metadata-Aware Retrieval for RAG

Designed metadata-aware dual-encoder retrieval methods that incorporate structured disambiguation signals into embedding and ranking objectives.
Improved retrieval performance by 70% on average over text-only baselines in retrieval-augmented generation settings.
Developed in collaboration with Vectorize.io, with attention to practical retrieval concerns including schema ambiguity, metadata use, source grounding, ranking quality, and evaluation.

Speculatores: Memory-Augmented RAG for Multi-Document Reasoning

Built memory-augmented architectures for LLM-based reasoning across multiple documents, using persistent context representations and cross-document evidence linking.
Combined retrieval, structured memory, and generation to support analytical reasoning over long, distributed information spaces.
Achieved 25% relative improvement in classification performance and 50% improvement in generation quality for multi-document analytical reasoning tasks.

MemoryDrift: Benchmarking Long-Context Reliability in LLMs

Developed a graph-based benchmark to evaluate whether LLMs can maintain stable structured memory as context length increases.
Studied how models induce, update, and preserve graph-like representations over long contexts, exposing reliability failures that are difficult to detect with standard long-context tests.
Revealed memory-drift onset at only 1–6% of advertised context capacity, showing that nominal context length can substantially overstate reliable reasoning capacity.

DataWeave — Human–LLM System for Structured Data Analysis

Built an interactive human–LLM system for exploratory structured data analysis and analytical reasoning over complex datasets.
Designed workflows for schema understanding, user-guided analysis, grounded generation, and iterative refinement.
Reached 86% schema-alignment reliability in a human-in-the-loop agentic structured data analysis system used across hundreds of newsroom sessions.
Developed in collaboration with The Chronicle of Higher Education as a deployed system and research platform for studying AI-assisted data analysis.

Newsroom LLM Systems — The Washington Post

Fine-tuned and evaluated LLMs on AWS EC2 and SageMaker for newsroom applications including subheadline generation, summarization, and question answering.
Contributed to the early development of the Ask the Post chatbot, advising on RAG design, grounding, and evaluation in collaboration with newsroom stakeholders.
Presented LLM evaluation, fine-tuning, and model development work to engineering leadership and newsroom audiences.
Focused on editorial quality, reliability, answer grounding, and model behavior in newsroom use cases.

Product Provenance Verification — ML for Supply-Chain Traceability

Led development of ML systems for product provenance verification, combining probabilistic modeling, data valuation, and spatiotemporal reasoning for compliance workflows.
Supported assessment of 59 wood products, identification of 260+ tons of allegedly illegal timber, and 9+ enforcement investigations.
Developed in collaboration with World Forest ID on real-world traceability problems involving noisy data, uncertain labels, and regulatory decision-support needs.
Contributed to research on optimizing product provenance verification using data valuation methods.

Migration Forecasting — Policy-Relevant Predictive ML

Built large-scale forecasting pipelines for migration patterns and land border encounters using applied ML, statistical modeling, and spatiotemporal data.
Developed real-time and policy-facing predictive workflows designed for decision support under uncertainty.
Worked on projects funded by government and research partners, emphasizing scalable pipelines, evaluation, and decision-support relevance.

Scientific Information Extraction — Domain-Adapted Transformers

Designed a full-text scientific information extraction system using domain-adapted transformer models and task-specific representation learning objectives.
Achieved 11% improvement over prior baselines and 26% higher accuracy in salient task and method extraction from scholarly documents.
Worked with collaborators at CSET, Georgetown University, to extract structured signals from scholarly documents for science-of-science analysis.

Data-Centric LLM Fine-Tuning & Synthetic Table Generation

Introduced permutation-aware tabular data generation methods to reduce invalid synthetic data generation by LLMs.
Reduced synthetic table rule violations by 70% through structured generation constraints and evaluation.
Contributed to Fisher information-guided regularization for language model fine-tuning, improving generalization in low-data regimes across 9/10 GLUE tasks with no added computational overhead.

Software, Systems & Open Source

DataWeave — Interactive human–LLM system for exploratory structured data analysis.
Speculatores — Memory-augmented RAG framework for multi-document reasoning.
RAGMate — Metadata-aware retrieval methods for retrieval-augmented generation.
MemoryDrift — Benchmark for analyzing memory drift in long-context LLM reasoning.

Publications

Peer-reviewed papers, manuscripts under review, and extended abstracts. See Google Scholar for citation details and updates.

LLMs, RAG, Structured Reasoning & Information Extraction

Utilizing Metadata for Better Retrieval-Augmented Generation
Raquib Bin Yousuf, Shengzhe Xu, Mandar Sharma, Andrew Neeser, Chris Latimer, Naren Ramakrishnan.
Proceedings of the 48th European Conference on Information Retrieval (ECIR 2026). Accepted.
LLM Augmentations to Support Analytical Reasoning over Multiple Documents
Raquib Bin Yousuf, Nicholas Defelice, Mandar Sharma, Shengzhe Xu, Naren Ramakrishnan.
Proceedings of the IEEE International Conference on Big Data, 2024. Best Paper Award.
Can an LLM Induce a Graph? Investigating Memory Drift and Context Length
Raquib Bin Yousuf, Aadyant Khatri, Shengzhe Xu, Mandar Sharma, Naren Ramakrishnan.
Proceedings of the IEEE International Conference on Knowledge Graph (ICKG), 2025.
Information Guided Regularization for Fine-tuning Language Models
Mandar Sharma, Nikhil Muralidhar, Shengzhe Xu, Raquib Bin Yousuf, Naren Ramakrishnan.
Proceedings of the 1st Conference on Language Modeling (COLM), 2024.
Why LLMs Are Bad at Synthetic Table Generation (and what to do about it)
Shengzhe Xu, Cho-Ting Lee, Mandar Sharma, Raquib Bin Yousuf, Nikhil Muralidhar, Naren Ramakrishnan.
Structured Knowledge for LLMs Workshop at ACM KDD, 2025.
DataWeave: Interactive Human–LLM Systems for Exploratory Structured Data Analysis
Raquib Bin Yousuf, et al. Under review, 2026.
Schema-Aware Harnesses for Tabular Reasoning with Language Models
Eunice Son, Raquib Bin Yousuf, et al. Under review, 2026.
What Should Search Retrieve Now?
Eunice Son, Raquib Bin Yousuf, et al. The 3rd Search Futures Workshop, ECIR 2026.
Lessons from Deep Learning Applied to Scholarly Information Extraction: What Works, What Doesn’t, and Future Directions
Raquib Bin Yousuf, Subhodip Biswas, Kulendra Kumar Kaushal, James Dunham, Rebecca Gelles, Sathappan Muthiah, Nathan Self, Patrick Butler, Naren Ramakrishnan.
Data-driven Science of Science Workshop at ACM KDD, 2022.

Applied ML, Forecasting & Decision Support

Optimizing Product Provenance Verification using Data Valuation Methods
Raquib Bin Yousuf, Hoang Anh Just, Shengzhe Xu, Brian Mayer, Victor Deklerck, Jakub Truszkowski, John C. Simeone, Jade Saunders, Chang-Tien Lu, Ruoxi Jia, Naren Ramakrishnan.
Proceedings of the AAAI Conference on Artificial Intelligence, 2026. Accepted.
Chasing the Timber Trail: Machine Learning to Reveal Harvest Location Misrepresentation
Shailik Sarkar, Raquib Bin Yousuf, Linhan Wang, Brian Mayer, Thomas Mortier, Victor Deklerck, Jakub Truszkowski, John C. Simeone, Marigold Norman, Jade Saunders, Chang-Tien Lu, Naren Ramakrishnan.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2025.
A Probabilistic Approach to Estimating Timber Harvest Location
Jakub Truszkowski, Roi Maor, Raquib Bin Yousuf, Subhodip Biswas, Caspar Chater, Peter Gasson, Scot McQueen, Marigold Norman, Jade Saunders, John Simeone, Naren Ramakrishnan, Alexandre Antonelli, Victor Deklerck.
Ecological Applications, 35(1): e3077, 2025.
Forecasting Migration Patterns and Land Border Encounters
Raquib Bin Yousuf, Shengzhe Xu, Patrick Butler, Brian Mayer, Nathan Self, David Mares, Naren Ramakrishnan.
Proceedings of the IEEE International Conference on Big Data, 2024.
Mining Developer Questions about Major Web Frameworks
Zakaria Mehrab, Raquib Bin Yousuf, Ibrahim Asadullah Tahmid, Rifat Shahriyar.
International Conference on Web Information Systems and Technologies (WEBIST), 2018.

Experience

Virginia Tech Transportation Institute — AI Researcher 2026–Present

Build data-grounding layers for heterogeneous traffic and mobility data by standardizing schemas, metadata, and data-access patterns.
Develop LLM-based traffic analysis tools that answer natural-language questions through grounded querying, analysis, and summaries.
Create benchmarks to evaluate and improve the reliability of AI tools over complex traffic datasets and external context.

The Washington Post — Machine Learning Intern 2023

Fine-tuned and evaluated LLMs on AWS EC2 and SageMaker to explore newsroom applications including subheadline generation, summarization, and question answering.
Contributed to early development of the Ask the Post chatbot, advising on RAG design and evaluation in collaboration with newsroom stakeholders.
Presented LLM evaluation, fine-tuning, and model development for newsroom applications to engineering leadership and newsroom audiences.

Virginia Tech, Sanghani Center for AI — Graduate Research Assistant 2019–2025

Built LLM, applied ML, NLP, retrieval, and forecasting systems for data-aware reasoning and decision support across funded research projects. Selected work:

Developed memory-augmented architectures for LLM-based multi-document reasoning with persistent context representation and cross-document evidence linking.
Proposed metadata-aware retrieval methods for RAG, incorporating structured disambiguation signals into embedding and ranking objectives in collaboration with Vectorize.io.
Built human-in-the-loop agentic data analysis workflows for real-world newsroom use cases in collaboration with The Chronicle of Higher Education.
Introduced permutation-aware tabular generation and Fisher information-guided regularization for LLM fine-tuning to improve data efficiency and generalization.
Designed full-text scientific information extraction systems using domain-adapted transformer models and task-specific representation learning.
Led development of ML systems for product provenance verification and large-scale migration forecasting, deploying regulatory and real-time pipelines used in policy and compliance settings, including provenance work in collaboration with World Forest ID.
Contributed to funded research proposals for projects supported by DARPA, NSF, and external partners, including technical section writing.

Eastern University Bangladesh — Lecturer 2018

Led advanced programming and digital logic design courses and managed labs, exams, student supervision, and administrative responsibilities.
Served on academic and administrative committees and contributed to departmental teaching and curriculum activities.

Talks & Presentations

LLM Augmentations to Support Analytical Reasoning over Multiple Documents — IEEE BigData 2024. Best Paper
Can an LLM Induce a Graph? Investigating Memory Drift and Context Length — IEEE ICKG 2025.
Lessons from Deep Learning Applied to Scholarly Information Extraction — Data-driven Science of Science Workshop at ACM KDD 2022.
AI-based Traffic Data Analysis Tool — Presented to City of Alexandria traffic authorities as part of a Virginia Tech Smart Mobility Lab user-story workshop, 2026.
LLM Evaluation, Fine-tuning, and Model Development for Newsroom Applications — Presented to engineering leadership and newsroom audiences at The Washington Post, 2023.
LLM Systems for Newsroom Applications — Internal technical presentation at The Washington Post.

Teaching & Communication

Lecturer, Eastern University Bangladesh — Advanced Programming and Digital Logic Design, 2018.
Graduate Teaching Assistant, Virginia Tech — Object-Oriented Programming, Software Design & Data Structures, and Social Media Analytics, 2019–2022.
Developed and delivered instructional materials on large language models and generative AI for undergraduate coursework at Virginia Tech.
Created instructional and presentation materials to support early-stage integration of generative AI into departmental curricula.
Presented research and AI-focused materials in meetings with collaborators and external stakeholders.

Awards & Service

Best Paper Award — IEEE International Conference on Big Data, 2024.
Paul E. Torgersen Graduate Student Research Excellence Award, PhD Finalist — Virginia Tech, 2026.
Dean’s List — Bangladesh University of Engineering and Technology, 2015–2017.
Conference Travel Grants — Virginia Tech, ACM KDD 2022, IEEE BigData 2024, IEEE ICKG 2025.
Reviewer — Conference on Language Modeling (COLM); IEEE Transactions on Big Data.
Mentored undergraduate and junior graduate researchers on ML and LLM projects, contributing to peer-reviewed publications.
Departmental service — Served on academic and administrative committees at Eastern University Bangladesh.

Education

Ph.D. in Computer Science — Virginia Tech 2026

Advisor: Naren Ramakrishnan
Dissertation: Improving LLM Reasoning and Retrieval for Structured and Complex Information Spaces
M.S. in Computer Science — Virginia Tech 2022
B.S. in Computer Science — Bangladesh University of Engineering and Technology (BUET) 2017

Bio

I am an AI/ML researcher and applied ML engineer working on language models, retrieval, representation learning, information extraction, forecasting, recommender systems, and data-centric learning. My work centers on building reliable systems that reason over structured and unstructured data, with attention to grounding, evaluation, adaptation, and practical deployment.

I recently completed my PhD in Computer Science at Virginia Tech, where I developed data-aware AI systems for reasoning and decision-making over complex real-world data. I build systems that operate over multi-source, structured, unstructured, and noisy information, with an emphasis on grounding, uncertainty, evaluation, and practical deployment.

I have worked on AI and ML systems for generative AI, search, newsroom applications, supply-chain traceability, agricultural product verification, migration forecasting, scientific information extraction, and structured-data analysis. Across these projects, I focus on system design, robust evaluation, scalable pipelines, and translating research ideas into useful tools for analysts, researchers, policy teams, and domain experts.
