Muhammad Ali Gulzar
Assistant Professor in Computer Science
I am an assistant professor in the Computer Science Department at Virginia Tech. I am also an Amazon Visiting Academic at Amazon Web Services. I received my Ph.D. in Computer Science from the University of California, Los Angeles, where I was a Google Ph.D. Fellow from 2017 to 2020.
My research vision is to build systems that improve developer productivity through automated debugging and testing for applications in emerging domains, including data-intensive software such as dataflow programs, ML/AI applications, and scientific analysis software such as computational notebooks. Toward these broader goals, I redesign existing software productivity tools for emerging applications in three areas: (1) automated tracking-code localization techniques in web applications, (2) re-engineering testing and debugging for data-intensive applications, and (3) advancing current testing and debugging practices in federated learning applications.
My past work has focused on interactive and automated debugging for Apache Spark, symbolic execution based test generation for dataflow programs, and performance debugging in Apache Spark.
News
My student, Haddi, co-authored the 2024 Web Almanac’s Privacy Chapter.
Our papers on rare-path coverage and evidence-based tech hiring are accepted to SANER 2025.
Our work on using neuron provenance to identify responsible clients in FL is accepted to ICSE 2025. Congrats, Waris!
Our work on web ads decreasing the accessibility of web pages is accepted to ICSE 2025. Congrats, Haddi!
Our work on blocking JS tracking functions received the ACM CCS 2024 Distinguished Artifact Award. Congrats, Haddi!
I received the 2024-25 Amazon-VT Award for our work on Semantic Cache for LLMs.
Our project on transparency and accessibility issues in web ads engineering is funded by CCI.
Our work on auto-generating privacy-enhancing JS surrogates is accepted to CCS 2024!
Our proposal on integrated forensic logging is funded by 4-VA.
New work on Natural Test Generation with Symbolic Execution is accepted at FSE 2024.
Older news
Publications
2025
- [ICSE 2025] Accessibility Issues in Ad-Driven Web Applications. The 47th IEEE/ACM International Conference on Software Engineering. 2025
- [ICSE 2025] TraceFL: Interpretability-Driven Debugging in Federated Learning via Neuron Provenance. The 47th IEEE/ACM International Conference on Software Engineering. 2025
- [SANER 2025] A Metric for Measuring the Impact of Rare Paths on Program Coverage. The IEEE International Conference on Software Analysis, Evolution and Reengineering. 2025
- [SANER 2025] Improving Evidence-Based Tech Hiring with GitHub-Supported Resume Matching. The IEEE International Conference on Software Analysis, Evolution and Reengineering. 2025
2024
- The ACM International Conference on the Foundations of Software Engineering. 2024
- The ACM International Conference on the Foundations of Software Engineering. 2024
- [CCS 2024] How Do Visually Impaired Users Navigate Accessibility Challenges in an Ad-Driven Web. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security. Poster Track. 2024
2023
- [ASE 2023] NaturalFuzz: Natural Input Generation for Big Data Analytics. The 38th IEEE/ACM International Conference on Automated Software Engineering. 2023
- ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2023
- The ACM/IEEE 45th International Conference on Software Engineering 2023
- Proceedings on Privacy Enhancing Technologies Symposium 2023
- [SE4SafeML 2023] FedDefender: Backdoor Attack Defense in Federated Learning. Proceedings of the 1st International Workshop on Dependability and Trustworthiness of Safety-Critical Systems with Machine Learned Components. 2023
2022
- [ASE 2022] Detecting Build Conflicts in Software Merge for Java Programs via Static Analysis. The 37th IEEE/ACM International Conference on Automated Software Engineering. 2022
- [ACL 2022] Sibylvariant Transformations for Robust Text Classification. In 60th Annual Meeting of the Association for Computational Linguistics. 2022. 16 Pages.
2021
- [SoCC 2021] OptDebug: Fault-Inducing Operation Isolation for Dataflow Applications. In The 12th ACM Symposium on Cloud Computing. 2021. 13 Pages. 30% Acceptance Rate
- [IMC 2021] TrackerSift: Untangling Mixed Tracking and Functional Web Resources. In Proceedings of the 2021 ACM Internet Measurement Conference. 2021. 8 Pages. 27.9% Acceptance Rate
- [HiPS 2021] Towards a Serverless Bioinformatics Cyberinfrastructure Pipeline. In Proceedings of the 1st Workshop on High Performance Serverless Computing. 2021. 8 Pages. Workshop Paper.
2020
- [SoCC 2020] Influence-Based Provenance for Dataflow Applications with Taint Propagation. In The 11th ACM Symposium on Cloud Computing. 2020. 12 Pages. Full Paper. 24.4% Acceptance Rate
- [ASE 2020] BigFuzz: Efficient Fuzz Testing for Data Analytics using Framework Abstraction. In The 35th IEEE/ACM International Conference on Automated Software Engineering. 2020. 12 Pages. Full Paper. 22.5% Acceptance Rate
- In The 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2020. 12 Pages. Full Paper. 28.0% Acceptance Rate
- In 2020 IEEE/ACM 42nd International Conference on Software Engineering. 2020. 13 Pages. Full Paper. 20.9% Acceptance Rate
- [ICSE Demo 2020] BigTest: Symbolic Execution Based Systematic Test Generation Tool for Apache Spark. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Companion Proceedings. 2020. 4 Pages. Demonstration Paper. 33.3% Acceptance Rate
2019
- [ESEC/FSE 2019] White-box Testing of Big Data Analytics with Complex User-defined Functions. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2019. 12 Pages. Full Paper. 24.4% Acceptance Rate
- [SoCC 2019] PerfDebug: Performance Debugging of Computation Skew in Dataflow Systems. In Proceedings of the 2019 Symposium on Cloud Computing. 2019. 12 Pages. Full Paper. 24.8% Acceptance Rate
- [ICSE SEIP 2019] Perception and Practices of Differential Testing. In Proceedings of the 41st International Conference on Software Engineering: Software Engineering in Practice. 2019. 10 Pages. Full Paper. 22.2% Acceptance Rate
2018
- [VLDB Journal 2018] Adding Data Provenance Support to Apache Spark. The VLDB Journal. 2018. 21 Pages. VLDB Journal Paper.
- [ESEC/FSE Demo 2018] BigSift: Automated Debugging of Big Data Analytics in Data-intensive Scalable Computing. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2018. 4 Pages. Demonstration Paper. 38.8% Acceptance Rate
2017
- [SoCC 2017] Automated Debugging in Data-intensive Scalable Computing. In Proceedings of the 2017 Symposium on Cloud Computing. 2017. 15 Pages. Full Paper. 23.6% Acceptance Rate
- [SIGMOD Demo 2017] Debugging Big Data Analytics in Spark with BigDebug. In Proceedings of the 2017 ACM International Conference on Management of Data. 2017. 4 Pages. Demonstration Paper. 34% Acceptance Rate
2016
- [ICSE 2016] BigDebug: Debugging Primitives for Interactive Big Data Processing in Spark. In 2016 IEEE/ACM 38th International Conference on Software Engineering. 2016. 12 Pages. Full Paper. 19.1% Acceptance Rate
- [SoCC 2016] Optimizing Interactive Development of Data-Intensive Applications. In Proceedings of the Seventh ACM Symposium on Cloud Computing. 2016. 13 Pages. Full Paper. 25.1% Acceptance Rate
- [HotCloud 2016] Interactive Debugging for Big Data Analytics. In 8th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 16). 2016. 7 Pages. Workshop Paper. 30.8% Acceptance Rate
- [ESEC/FSE Demo 2016] BigDebug: Interactive Debugger for Big Data Analytics in Apache Spark. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering. 2016. 5 Pages. Demonstration Paper. 40.1% Acceptance Rate
2015
Funding
No news so far...