Publications in reverse chronological order. CVPR, ECCV, and ICCV are the top conferences in computer vision. ACL, EMNLP, and NAACL are the top conferences in natural language processing. NeurIPS, ICLR, and ICML are the top conferences in machine learning more broadly.
2024
ACL
MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking
Ting-Chih Chen, Chia-Wei Tang, and Christopher Thomas
In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Fact-checking real-world claims often requires reviewing multiple multimodal documents in order to assess the claim’s truthfulness, a highly laborious and time-consuming task. In this paper, we present a summarization model crafted to generate claim-specific summaries useful for fact-checking from multimodal multi-document datasets. The model takes inputs in the form of documents, images, and a claim, with the objective of assisting in fact-checking tasks. We introduce a dynamic perceiver-based model that is able to handle inputs from multiple modalities of arbitrary lengths. To train our model, we leverage a novel reinforcement learning-based entailment objective in order to generate summaries that provide evidence distinguishing between different truthfulness labels. To assess the efficacy of our approach, we conduct experiments on both an existing benchmark as well as a new dataset of multi-document claims which we contribute. Our approach outperforms the SOTA approach by 4.6% in the claim verification task on the MOCHEG dataset and demonstrates strong performance on our new Multi-News-Fact-Checking dataset.
@inproceedings{chen-etal-2024-metasumperceiver,title={{M}eta{S}um{P}erceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking},author={Chen, Ting-Chih and Tang, Chia-Wei and Thomas, Christopher},editor={Ku, Lun-Wei and Martins, Andre and Srikumar, Vivek},booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},year={2024},address={Bangkok, Thailand},publisher={Association for Computational Linguistics},url={https://aclanthology.org/2024.acl-long.474},pages={8742--8757},}
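For readers curious how an entailment-based reward of the kind described in the abstract above can be computed, here is a minimal sketch (not the paper's code): it scores a candidate summary against a claim with an off-the-shelf NLI model, the sort of signal a policy-gradient update could maximize. The checkpoint, label handling, and reward definition are illustrative assumptions.

```python
# Hypothetical sketch, not the authors' implementation: an entailment-based reward
# for a generated summary. The checkpoint ("roberta-large-mnli") and the reward
# definition are assumptions for illustration only.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
model.eval()

def entailment_reward(summary: str, claim: str) -> float:
    """Reward = probability that the summary (premise) entails the claim (hypothesis)."""
    inputs = tokenizer(summary, claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    label2idx = {label.lower(): idx for idx, label in model.config.id2label.items()}
    return probs[label2idx["entailment"]].item()

# A REINFORCE-style update would weight the summarizer's log-likelihood of a
# sampled summary by (entailment_reward - baseline).
print(entailment_reward(
    "Officials approved the measure in 2021, according to all three documents.",
    "The measure was approved in 2021."))
```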
CVPR
Semantic Shield: Defending Vision-Language Models Against Backdooring and Poisoning via Fine-grained Knowledge Alignment
Alvi Md Ishmam, and Christopher Thomas
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
@inproceedings{ishmam2024semantic,title={Semantic Shield: Defending Vision-Language Models Against Backdooring and Poisoning via Fine-grained Knowledge Alignment},author={Ishmam, Alvi Md and Thomas, Christopher},booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},pages={24820--24830},year={2024},}
AAAI
Beyond Grounding: Extracting Fine-Grained Event Hierarchies across Modalities
Hammad Ayyubi, Christopher Thomas, Lovish Chum, and 8 more authors
In Proceedings of the AAAI Conference on Artificial Intelligence, 2024
@inproceedings{ayyubi2024beyond,title={Beyond Grounding: Extracting Fine-Grained Event Hierarchies across Modalities},author={Ayyubi, Hammad and Thomas, Christopher and Chum, Lovish and Lokesh, Rahul and Chen, Long and Niu, Yulei and Lin, Xudong and Feng, Xuande and Koo, Jaywon and Ray, Sounak and others},booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},volume={38},number={16},pages={17664--17672},year={2024},}
2023
ACL
Enhanced Chart Understanding via Visual Language Pre-training on Plot Table Pairs
Mingyang Zhou, Yi Fung, Long Chen, and 3 more authors
In Findings of the Association for Computational Linguistics: ACL 2023, 2023
Building cross-modal intelligence that can understand charts and communicate the salient information hidden behind them is an appealing challenge in the vision and language (V+L) community. The capability to uncover the underlying table data of chart figures is critical to automatic chart understanding. We introduce ChartT5, a V+L model that learns how to interpret table information from chart images via cross-modal pre-training on plot table pairs. Specifically, we propose two novel pre-training objectives: Masked Header Prediction (MHP) and Masked Value Prediction (MVP) to equip the model with different skills to interpret the table information. We have conducted extensive experiments on chart question answering and chart summarization to verify the effectiveness of the proposed pre-training strategies. In particular, on the ChartQA benchmark, our ChartT5 outperforms the state-of-the-art non-pretraining methods by over 8%.
@inproceedings{zhou-etal-2023-enhanced,title={Enhanced Chart Understanding via Visual Language Pre-training on Plot Table Pairs},author={Zhou, Mingyang and Fung, Yi and Chen, Long and Thomas, Christopher and Ji, Heng and Chang, Shih-Fu},editor={Rogers, Anna and Boyd-Graber, Jordan and Okazaki, Naoaki},booktitle={Findings of the Association for Computational Linguistics: ACL 2023},year={2023},address={Toronto, Canada},publisher={Association for Computational Linguistics},url={https://aclanthology.org/2023.findings-acl.85},doi={10.18653/v1/2023.findings-acl.85},pages={1314--1326},}
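As a rough illustration of the Masked Value Prediction idea sketched in the abstract above (a sketch under assumed token and serialization conventions, not ChartT5's actual preprocessing), one can flatten a chart's underlying table and hide some cells behind a sentinel token, leaving the hidden values as reconstruction targets:

```python
# Hypothetical illustration of building Masked Value Prediction targets; the sentinel
# token, flattening scheme, and masking rate are assumptions, not the paper's code.
import random

MASK_TOKEN = "<mask_value>"

def build_mvp_example(table, mask_prob=0.3, seed=1):
    """table: list of (header, value) pairs recovered from the plot's data table."""
    rng = random.Random(seed)
    pieces, targets = [], []
    for header, value in table:
        if rng.random() < mask_prob:
            pieces.append(f"{header} : {MASK_TOKEN}")
            targets.append(str(value))  # the model must reconstruct this hidden value
        else:
            pieces.append(f"{header} : {value}")
    return " | ".join(pieces), targets

flat, targets = build_mvp_example([("2019", 4.1), ("2020", 5.3), ("2021", 6.0)])
print(flat)     # table serialization with some values replaced by the sentinel
print(targets)  # the hidden values that serve as prediction targets
```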
2022
ECCV
Fine-Grained Visual Entailment
Christopher Thomas, Yipeng Zhang, and Shih-Fu Chang
In Proceedings of the European Conference on Computer Vision, 2022
@inproceedings{thomas2022fine,title={Fine-Grained Visual Entailment},author={Thomas, Christopher and Zhang, Yipeng and Chang, Shih-Fu},booktitle={Proceedings of the European Conference on Computer Vision},pages={398--416},year={2022},}
Community implications for gun violence prevention during co-occurring pandemics; a qualitative and computational analysis study
Desmond U. Patton, Nathan Aguilar, Aviv Y. Landau, and 9 more authors
Preventive Medicine, 2022
This study provides insight into New York City residents’ perceptions about violence after the outbreak of Coronavirus disease (COVID-19) based on information from communities in New York City Housing Authority (NYCHA) buildings. In this novel analysis, we used focus group and social media data to confirm or reject findings from qualitative interviews. We first used data from 69 in-depth, semi-structured interviews with low-income residents and community stakeholders to further explore how violence impacts New York City’s low-income residents of color, as well as the role of city government in providing tangible support for violence prevention during co-occurring health (COVID-19) and social (anti-Black racism) pandemics. Residents described how COVID-19 and the Black Lives Matter movement impacted safety in their communities while offering direct recommendations to improve safety. Residents also shared recommendations that indirectly improve community safety by addressing long term systemic issues. As the recruitment of interviewees was concluding, researchers facilitated two focus groups with 38 interviewees to discuss similar topics. In order to assess the degree to which the themes discovered in our qualitative interviews were shared by the broader community, we developed an integrative community data science study which leveraged natural language processing and computer vision techniques to study text and images on public social media data of 12 million tweets generated by residents. We joined computational methods with qualitative analysis through a social work lens and design justice principles to most accurately and holistically analyze the community perceptions of gun violence issues and potential prevention strategies. Findings indicate valuable community-based insights that elucidate how the co-occurring pandemics impact residents’ experiences of gun violence and provide important implications for gun violence prevention in a digital era.
@article{PATTON2022107263,title={Community implications for gun violence prevention during co-occurring pandemics; a qualitative and computational analysis study},journal={Preventive Medicine},pages={107263},year={2022},issn={0091-7435},doi={https://doi.org/10.1016/j.ypmed.2022.107263},url={https://www.sciencedirect.com/science/article/pii/S0091743522003127},author={Patton, Desmond U. and Aguilar, Nathan and Landau, Aviv Y. and Thomas, Chris and Kagan, Rachel and Ren, Tianai and Stoneberg, Eric and Wang, Timothy and Halmos, Daniel and Saha, Anish and Ananthram, Amith and McKeown, Kathleen},keywords={Gun violence, COVID-19, Black lives matter, Defund the police, Social media, Qualitative and computational analysis},}
CVPRW
Emphasizing Complementary Samples for Non-Literal Cross-Modal Retrieval
Christopher Thomas, and Adriana Kovashka
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022
@inproceedings{thomas2022emphasizing,title={Emphasizing Complementary Samples for Non-Literal Cross-Modal Retrieval},author={Thomas, Christopher and Kovashka, Adriana},booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},pages={4632--4641},year={2022},}
arXiv
Multimodal Event Graphs: Towards Event Centric Understanding of Multimodal World
Hammad A Ayyubi, Christopher Thomas, Lovish Chum, and 7 more authors
@article{ayyubi2022multimodal,title={Multimodal Event Graphs: Towards Event Centric Understanding of Multimodal World},author={Ayyubi, Hammad A and Thomas, Christopher and Chum, Lovish and Lokesh, Rahul and Niu, Yulei and Lin, Xudong and Chen, Long and Koo, Jaywon and Ray, Sounak and Chang, Shih-Fu},journal={arXiv preprint arXiv:2206.07207},year={2022}}
TPAMI
Learning to Overcome Noise in Weak Caption Supervision for Object Detection
Mesut Erhan Unal, Keren Ye, Mingda Zhang, and 5 more authors
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022
@article{unal2022learning,title={Learning to Overcome Noise in Weak Caption Supervision for Object Detection},author={Unal, Mesut Erhan and Ye, Keren and Zhang, Mingda and Thomas, Christopher and Kovashka, Adriana and Li, Wei and Qin, Danfeng and Berent, Jesse},journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},year={2022},publisher={IEEE}}
EMNLP
Weakly-Supervised Temporal Article Grounding
Long Chen, Yulei Niu, Brian Chen, and 6 more authors
In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP2022), 2022
@inproceedings{chen2022weakly,title={Weakly-Supervised Temporal Article Grounding},author={Chen, Long and Niu, Yulei and Chen, Brian and Lin, Xudong and Han, Guangxing and Thomas, Christopher and Ayyubi, Hammad and Ji, Heng and Chang, Shih-Fu},booktitle={Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP2022)},year={2022}}
2021
ACL
InfoSurgeon: Cross-Media Fine-grained Information Consistency Checking for Fake News Detection
Yi Fung, Christopher Thomas, Revanth Reddy, and 6 more authors
In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021
@inproceedings{fung2021infosurgeon,title={InfoSurgeon: Cross-Media Fine-grained Information Consistency Checking for Fake News Detection},author={Fung, Yi and Thomas, Christopher and Reddy, Revanth and Polisetty, Sandeep and Ji, Heng and Chang, Shih-Fu and McKeown, Kathleen and Bansal, Mohit and Sil, Avi},booktitle={Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL)},year={2021},}
IJCV
Predicting Visual Political Bias Using Webly Supervised Data and an Auxiliary Task
Christopher Thomas, and Adriana Kovashka
International Journal of Computer Vision, 2021
@article{thomas2021predicting,title={Predicting Visual Political Bias Using Webly Supervised Data and an Auxiliary Task},author={Thomas, Christopher and Kovashka, Adriana},journal={International Journal of Computer Vision},volume={129},number={11},pages={2978--3003},year={2021},publisher={Springer US}}
EMNLP
Joint Multimedia Event Extraction from Video and Article
Brian Chen, Xudong Lin, Christopher Thomas, and 5 more authors
In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP2021) Findings, 2021
@inproceedings{chen2021joint,title={Joint Multimedia Event Extraction from Video and Article},author={Chen, Brian and Lin, Xudong and Thomas, Christopher and Li, Manling and Yoshida, Shoya and Chum, Lovish and Ji, Heng and Chang, Shih-Fu},booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP2021) Findings},year={2021}}
2020
ECCV
Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval
Christopher Thomas, and Adriana Kovashka
In Proceedings of the European Conference on Computer Vision (ECCV), 2020
@inproceedings{thomas2020preserving,title={Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval},author={Thomas, Christopher and Kovashka, Adriana},booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},year={2020},}
Modeling Visual Rhetoric and Semantics in Multimedia
Christopher Thomas
PhD thesis, University of Pittsburgh, 2020
@phdthesis{thomas2020modeling,title={Modeling Visual Rhetoric and Semantics in Multimedia},author={Thomas, Christopher},year={2020},school={University of Pittsburgh}}
arXiv
Learning to Transfer Visual Effects from Videos to Images
Christopher Thomas, Yale Song, and Adriana Kovashka
@article{thomas2020learning,title={Learning to Transfer Visual Effects from Videos to Images},author={Thomas, Christopher and Song, Yale and Kovashka, Adriana},journal={arXiv preprint arXiv:2012.01642},year={2020}}
2019
NeurIPS
Predicting the politics of an image using webly supervised data
Christopher Thomas, and Adriana Kovashka
In Advances in Neural Information Processing Systems (NeurIPS 2019), 2019
@inproceedings{thomas2019predicting,title={Predicting the politics of an image using webly supervised data},author={Thomas, Christopher and Kovashka, Adriana},booktitle={Advances in Neural Information Processing Systems (NeurIPS 2019)},year={2019},}
2018
BMVC
Persuasive faces: generating faces in advertisements
Christopher Thomas, and Adriana Kovashka
In Proceedings of the British Machine Vision Conference, 2018
@inproceedings{thomas2018persuasive,title={Persuasive faces: generating faces in advertisements},author={Thomas, Christopher and Kovashka, Adriana},booktitle={Proceedings of the British Machine Vision Conference},year={2018},}
ACCV
Artistic object recognition by unsupervised style adaptation
Christopher Thomas, and Adriana Kovashka
In Asian Conference on Computer Vision, 2018
@inproceedings{thomas2018artistic,title={Artistic object recognition by unsupervised style adaptation},author={Thomas, Christopher and Kovashka, Adriana},booktitle={Asian Conference on Computer Vision},pages={460--476},year={2018},organization={Springer, Cham}}
2017
CVPR
Automatic understanding of image and video advertisements
Zaeem Hussain, Mingda Zhang, Xiaozhong Zhang, and 5 more authors
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017
@inproceedings{hussain2017automatic,title={Automatic understanding of image and video advertisements},author={Hussain, Zaeem and Zhang, Mingda and Zhang, Xiaozhong and Ye, Keren and Thomas, Christopher and Agha, Zuha and Ong, Nathan and Kovashka, Adriana},booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},pages={1705--1715},year={2017},}
2016
CVPR
Seeing Behind the Camera: Identifying the Authorship of a Photograph
Christopher Thomas, and Adriana Kovashka
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016
@inproceedings{thomas2016seeing,title={Seeing Behind the Camera: Identifying the Authorship of a Photograph},author={Thomas, Christopher and Kovashka, Adriana},booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},year={2016},}
arXiv
OpenSALICON: An Open Source Implementation of the SALICON Saliency Model
Christopher Thomas
@article{thomas2016opensalicon,title={Opensalicon: An open source implementation of the salicon saliency model},author={Thomas, Christopher},journal={arXiv preprint arXiv:1606.00110},year={2016}}
CVPRW
A Visual Attention Algorithm Designed for Coupled Oscillator Acceleration
Christopher Thomas, Adriana Kovashka, Donald Chiarulli, and 1 more author
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016
@inproceedings{thomas2016visual,title={A Visual Attention Algorithm Designed for Coupled Oscillator Acceleration},author={Thomas, Christopher and Kovashka, Adriana and Chiarulli, Donald and Levitan, Steven},booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops},pages={10--18},year={2016}}
2015
arXiv
Hand Posture’s Effect on Touch Screen Text Input Behaviors: A Touch Area Based Study
Christopher Thomas, and Brandon Jennings
@article{thomas2015hand,title={Hand Posture's Effect on Touch Screen Text Input Behaviors: A Touch Area Based Study},author={Thomas, Christopher and Jennings, Brandon},journal={arXiv preprint arXiv:1504.02134},year={2015}}
SEKE
Application of Slow Intelligence Framework for Smart Pet Care System Design
Shi-Kuo Chang, Wen-Hui Chen, Wen-Chyi Lin, and 1 more author
In Software Engineering and Knowledge Engineering (SEKE 2015), 2015
@inproceedings{chang2015application,title={Application of Slow Intelligence Framework for Smart Pet Care System Design},author={Chang, Shi-Kuo and Chen, Wen-Hui and Lin, Wen-Chyi and Thomas, Christopher Lee},booktitle={Software Engineering and Knowledge Engineering (SEKE 2015)},year={2015}}
2014
INLG
TBI-Doc: Generating patient & clinician reports from brain imaging data
Pamela Jordan, Nancy Green, Christopher Thomas, and 1 more author
In Proceedings of the 8th International Natural Language Generation Conference (INLG), 2014
@inproceedings{jordan2014tbi,title={TBI-Doc: Generating patient \& clinician reports from brain imaging data},author={Jordan, Pamela and Green, Nancy and Thomas, Christopher and Holm, Susan},booktitle={Proceedings of the 8th International Natural Language Generation Conference (INLG)},pages={143--146},year={2014}}
Student Response Analysis
Sean Myers, Timothy Parenti, and Chris Thomas
Technical Report, University of Pittsburgh Department of Computer Science, 2014
@misc{myers2014student,title={Student Response Analysis},author={Myers, Sean and Parenti, Timothy and Thomas, Chris},journal={Technical Report - University of Pittsburgh Department of Computer Science},year={2014}}