About

  • I have been a PhD candidate in Computer Science at Virginia Tech since August 2022, advised by Prof. Eugenia Rho. I received my Master's degree in Biostatistics from Southern Medical University in Guangzhou, China, in 2017, advised by Prof. Pingyan Chen. I am broadly interested in natural language processing (NLP) and computational social science, with a focus on human-computer interaction (HCI), particularly developing and evaluating interactive AI systems that assist people. I have also conducted research in epidemiology and public health, including pandemic surveillance, risk assessment, transmission modeling, and field investigations.

Resume

Education

  1. PhD Candidate in Computer Science

    Virginia Tech
    Aug 2022 ➔ Present
  2. Master of Medicine in Biostatistics

    Southern Medical University
    Sep 2014 ➔ Jul 2017
  3. Bachelor of Science in Applied Statistics

    Southern Medical University
    Sep 2010 ➔ Jul 2014

Experience

  1. PhD Candidate

    Virginia Tech
    Aug 2022 ➔ Present
    • Statistical Modeling (R): Led large-scale quantitative research on online user behavior, surveying 1,100+ participants and employing advanced statistical modeling in R (e.g., Structural Equation Modeling, Mixed-Effects Models) to analyze motivations and barriers related to social media counterspeech
    • Small-Sample NLP Evaluation: Developed high-performance NLP models under limited-data conditions, achieving ~87% accuracy in predicting counterspeech efficacy. This involved engineering transformer-based models (e.g., RoBERTa) and evaluating them against state-of-the-art LLMs (GPT, Claude, Gemini) on a specialized dataset of 600+ curated text samples
    • Research Publication Impact: Published key findings from this research in premier Human-Computer Interaction (HCI) academic venues, including ACM ToCHI (journal) and CSCW (conference)
    • R
    • Python
    • NLP
    • HuggingFace
    • Statistical Modeling
    • HCI Research
  2. Assistant Research Fellow

    Guizhou Center for Disease Control and Prevention
    Nov 2019 ➔ Jul 2022
    • Early COVID-19 Research: Pioneered early-phase COVID-19 epidemiological analysis as one of the first researchers worldwide to estimate the incubation period using Accelerated Failure Time models and the effective reproduction number via the Wallinga & Teunis method
    • Workflow Engineering & Automation: Engineered an automated R and SQL workflow for ingesting, cleaning, and transforming weekly surveillance data covering a population of 3.8M, culminating in automatically generated epidemiological summaries with key public health metrics
    • Text Data Processing: Enhanced data extraction capabilities by fine-tuning a compact NLP model (bert-small) to parse and quantify information from textual epidemiological reports, standardizing inputs for an Excel-based centralized data repository
    • Pandemic Risk Assessment: Developed and validated an Autoregressive Distributed Lag (ARDL) model in R to forecast incidence rates, leveraging 177 weeks of internet search trend data (correlating with 173K pediatric hand, foot, and mouth disease (HFMD) cases); the model was integrated into the provincial early warning system
    • R
    • SQL
    • Epidemiological Modeling
    • NLP
    • Survival Analysis
    • Public Health
  3. Software Engineer

    Contract Research Organization of CLT Inc
    Jul 2017 ➔ Jun 2019
    • System Design & Development: Led design and development as chief programmer for core clinical trial systems (Central Randomization & EDC), defining database architecture (MySQL), data validation rules, and system functionalities using PHP, JavaScript, HTML/CSS
    • Central Randomization System: Engineered a web-based Central Randomization System deployed in 20+ multi-center clinical trials, featuring allocation concealment algorithms tailored to trial-specific statistical needs and enabling automated data export to SAS datasets
    • Electronic Data Capture System: Developed a scalable Electronic Data Capture (EDC) SaaS platform managing clinical trial data for 100K+ patients, significantly improving real-time data access and data management efficiency for research teams
    • PHP
    • JavaScript
    • MySQL
    • HTML
    • CSS
    • SaaS
  4. Research Intern

    National Clinical Research Center for Kidney Disease
    Nov 2015 ➔ Feb 2017
    • CRF Design & Data Standardization: Contributed to a large-scale chronic kidney disease cohort study by co-designing Case Report Forms (CRFs), focusing on rigorous data standardization (defining variable formats, units, and ranges) to ensure high-quality data collection for subsequent statistical analysis
    • Cohort Study
    • Study Design
    • Biostatistics
    • Data Standardization

Publications

Perceiving and countering hate: The role of identity in online responses

Authors

Kaike Ping, James Hawdon, Eugenia H Rho

CSCW '25

Behind the Counter: Exploring Motivations and Barriers of Online Counterspeech Writing

Authors

Kaike Ping, Anisha Kumar, Xiaohan Ding, Eugenia H Rho

ACM Transactions on Computer-Human Interaction

Epidemiologic Characteristics of COVID-19 in Guizhou Province, China

Authors

Kaike Ping, Mingyu Lei, Yun Gou, Zhongfa Tao, Guanghai Yao, Can Hu, Qin Tao, Zhiting Zou, Dingming Wang, Shijun Li, and Yan Huang

The Journal of Infection in Developing Countries

A multi-level benchmark for causal language understanding in social media discourse

Authors

Xiaohan Ding, Kaike Ping, Buse Çarık, Eugenia Rho

EMNLP '25

Designing Human-AI Collaboration to Support Learning in Counterspeech Writing

Authors

Xiaohan Ding, Kaike Ping, Uma Sushmitha Gunturi, Buse Carik, Sophia Stil, Lance T Wilhelm, Taufiq Daryanto, James Hawdon, Sang Won Lee, Eugenia H Rho

2025 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)

Exploring large language models through a neurodivergent lens: use, challenges, community-driven workarounds, and concerns

Authors

Buse Carik, Kaike Ping, Xiaohan Ding, Eugenia H Rho

Proceedings of the ACM on Human-Computer Interaction

BioDolphin as a comprehensive database of lipid–protein binding interactions

Authors

Li-Yen Yang, Kaike Ping, Yunan Luo, Andrew C McShan

Communications Chemistry

A New Preprocedure Risk Score for Predicting Contrast-Induced Acute Kidney Injury

Authors

Chongyang Duan, Yingshu Cao, Yong Liu, Lizhi Zhou, Kaike Ping, Ming T. Tan, Ning Tan, Jiyan Chen, and Pingyan Chen

Canadian Journal of Cardiology

Initial Clinical Characteristics of 146 Patients with COVID-19 Reported in Guizhou Province, China: A Survival Analysis

Authors

Yun Gou, Kaike Ping, Mingyu Lei, Chun Yu, Ying Tao, Can Hu, Zhongfa Tao, Zhiting Zou, Weijia Jiang, Shijun Li, Li Zhuang, Zhaobin Liu, and Yan Huang

The Journal of Infection in Developing Countries

Crowdsourcing assessment of maternal blood multi-omics for predicting gestational age and preterm birth

Authors

Adi L. Tarca, Bálint Ármin Pataki, Roberto Romero, The DREAM Preterm Birth Prediction Challenge Consortium (including Kaike Ping), et al.

Cell Reports Medicine

Cervical Rotatory Manipulation Decreases Uniaxial Tensile Properties of Rabbit Atherosclerotic Internal Carotid Artery

Authors

Shaoqun Zhang, Ji Qi, Lei Zhang, Chao Chen, Shubhro Mondal, Kaike Ping, Yili Chen, and Yikai Li

Evidence-Based Complementary and Alternative Medicine