I am an Assistant Professor of Software Engineering in the Department of Computer Science at Virginia Tech. You can find me in my office at 2217 Knowledge Works II.

My research focuses on software development productivity and software quality.

I use software evolution analysis and program analysis to create practical, efficient, and human-friendly techniques and tools that provide automatic support for all stages of software development. My research interests also include mining of software repositories, program comprehension, and software visualization.

### I’m looking for students!

Do you want to do research in software engineering? I am actively recruiting students at all levels to do research in my lab.

## News

2019 Our SIGCSE ‘19 paper received the Second Best Paper Award.
2018 Our FSE ‘18 paper received an ACM Distinguished Paper Award.
2018 Paper accepted to SIGCSE ‘19: “Assessing Incremental Testing Practices and Their Impact on Project Outcomes”
2018 Paper accepted to FSE ‘18: “The Impact of Regular Expression Denial of Service (ReDoS) in Practice: an Empirical Study at the Ecosystem Scale”
2018 Poster accepted to ICSE ‘18: “Poster: Understanding and Leveraging Developer Inexpertise”
2018 Paper accepted to MSR ‘18 Mining Challenge Track: “The Hidden Cost of Code Completion: Understanding the Impact of the Recommendation-list Length on its Efficiency”
2017 Paper accepted to ICSE ‘17: “Fuzzy Fine-grained Code-history Analysis”
2017 Paper accepted to MSR ‘17 Mining Challenge Track: “An Empirical Study of Activity, Popularity, Size, Testing, and Stability in Continuous Integration”

## Join My Lab

### Are you interested in software engineering research?

I am actively recruiting students. My research interests include, among other topics, software development productivity, software quality, mining of software repositories, program comprehension, and software visualization.

If you are interested in working with me, send me an email with the following information:

• Why do you want to do research with me? Why are you interested in doing research? Do you share my research interests? Did any of my papers inspire you with some new ideas?

• Are you interested in graduate school? Do you want to pursue a Ph.D., a Master’s thesis, or do you just want to pursue an interesting research project?

• What are your skills? Send me your C.V. Are you a great programmer? Send me something you coded. Are you a strong writer? Send me a paper that you wrote.

## Biography

Francisco Servant is an Assistant Professor in the Department of Computer Science at Virginia Tech. His research focuses on software development productivity and software quality.

Francisco uses software evolution analysis and program analysis to create practical, efficient, and human-friendly techniques and tools that provide automatic support for all stages of software development. His research interests include software development productivity, software quality, mining of software repositories, program comprehension, and software visualization. He has published articles in these areas at top software engineering conferences (e.g., ICSE, FSE, ASE) and he has performed research for large technology companies, such as Microsoft Research and DreamWorks Animation.

Francisco received a Ph.D. in Software Engineering from the University of California, Irvine, under the supervision of James A. Jones. He also holds an M.S. in Information and Computer Sciences from the same university, where he was advised by André van der Hoek. Francisco obtained his B.S. in Computer Science from the University of Granada in Spain.

## Research

### Assessing Incremental Testing Practices and Their Impact on Project Outcomes

Ayaan M. Kazerouni, Clifford A. Shaffer, Stephen H. Edwards, and Francisco Servant, “Assessing Incremental Testing Practices and Their Impact on Project Outcomes”. Proceedings of the 2019 ACM SIGCSE Technical Symposium on Computer Science Education (SIGCSE ‘19), Minneapolis, MN, USA, March 2019, to appear. SIGCSE ‘19 Second Best Paper Award.

We present a family of novel metrics for assessing testing practices in increments of software development work, thus allowing early feedback before a software project is finished. Our metrics measure the balance and sequence of effort spent writing software tests in a work increment. We performed an empirical study using our metrics to evaluate the test-writing practices of 157 advanced undergraduate students, and the relationship between those practices and project outcomes, over multiple projects for a whole semester. We found that projects in which more testing effort was spent per work session tended to be more semantically correct and to have higher code coverage. The percentage of method-specific testing effort spent before production code did not contribute to semantic correctness, and had a negative relationship with code coverage. These novel metrics will enable educators to give students early, incremental feedback about their testing practices as they work on their software projects.

### The Impact of Regular Expression Denial of Service (ReDoS) in Practice: an Empirical Study at the Ecosystem Scale

James C. Davis, Christy A. Coghlan, Francisco Servant, and Dongyoon Lee, “The Impact of Regular Expression Denial of Service (ReDoS) in Practice: an Empirical Study at the Ecosystem Scale”. Proceedings of the 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ‘18), Lake Buena Vista, FL, USA, November 2018, pp. 246–256. Acceptance Rate: 19%. ACM SIGSOFT Distinguished Paper Award.

In this paper, we empirically study three major aspects of ReDoS that have, surprisingly, been hitherto unexplored: the incidence of ReDoS in practice, how ReDoS can be prevented, and how vulnerable regexes can be fixed. After studying the ecosystems of two of the most popular programming languages — JavaScript and Python — we found that ReDoS vulnerabilities are a larger threat in practice than might have been guessed. We detected thousands of vulnerabilities affecting over 10,000 modules across diverse application domains. We also found that conventional-wisdom ReDoS anti-patterns are mostly necessary, but not sufficient, to signal vulnerable regexes. Finally, we found that developers favor revising the regex over truncating input or developing a custom parser, regardless of whether they had been shown examples of all three fix strategies. These findings show that careful attention should be paid to addressing ReDoS vulnerabilities, because they affect a large number of projects and we still have only limited techniques to prevent and fix them.
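As a concrete illustration (not an example from the paper), the sketch below shows a classically vulnerable regex alongside the fix strategy the abstract mentions most developers favor: revising the regex. The specific patterns and input are illustrative assumptions.

```python
import re
import time

# A classic vulnerable pattern: the nested quantifier in (a+)+ lets a
# backtracking engine try exponentially many ways to partition a run of
# 'a's when the overall match fails.
vulnerable = re.compile(r'^(a+)+$')

# The "revise the regex" fix strategy, applied here by removing the
# redundant outer quantifier; both regexes accept exactly the same strings.
revised = re.compile(r'^a+$')

attack = 'a' * 22 + '!'  # almost-matching input triggers heavy backtracking

start = time.perf_counter()
assert vulnerable.match(attack) is None  # exponential time in the run length
slow = time.perf_counter() - start

start = time.perf_counter()
assert revised.match(attack) is None     # linear time
fast = time.perf_counter() - start

print(f"vulnerable: {slow:.3f}s, revised: {fast:.6f}s")
```

Python's `re` module, like JavaScript's built-in engine, uses backtracking, which is why both ecosystems studied in the paper are exposed to this class of vulnerability.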

### POSTER: Understanding and Leveraging Developer Inexpertise

Lykes Claytor, Francisco Servant, “POSTER: Understanding and Leveraging Developer Inexpertise”. Proceedings of the 40th International Conference on Software Engineering (ICSE 2018), Poster Track, Gothenburg, Sweden, May 2018, pp. 404–405.

Existing work in modeling developer expertise assumes that developers reflect their expertise in their contributions and that such expertise can be analyzed to provide support for developer tasks. However, developers also make contributions in which they reflect their inexpertise such as by making mistakes in their code. We refine the hypotheses of the expertise-identification literature by proposing developer inexpertise as a factor that should be modeled to automate support for developer tasks.

### The Hidden Cost of Code Completion: Understanding the Impact of the Recommendation-list Length on its Efficiency

Xianhao Jin, Francisco Servant, “The Hidden Cost of Code Completion: Understanding the Impact of the Recommendation-list Length on its Efficiency”. Proceedings of the 15th International Conference on Mining Software Repositories (MSR 2018), Mining Challenge Track, Gothenburg, Sweden, May 2018, pp. 70–73.

Automatic code completion is a useful and popular technique that software developers use to write code more effectively and efficiently. However, while the benefits of code completion are clear, its cost is not yet well understood. We hypothesize the existence of a hidden cost of code completion, which mostly impacts developers when code completion techniques produce long recommendations. We study this hidden cost of code completion by evaluating how the length of the recommendation list affects other factors that may cause inefficiencies in the process. We study how common long recommendations are, whether they often provide low-ranked correct items, whether they take longer to assess, and whether they were more prevalent when developers did not select any item in the list. In our study, we observe evidence for all these factors, confirming the existence of a hidden cost of code completion.

### Fuzzy Fine-grained Code-history Analysis

Francisco Servant, James A. Jones, “Fuzzy Fine-grained Code-history Analysis”. Proceedings of the 39th International Conference on Software Engineering (ICSE 2017), Buenos Aires, Argentina, May 2017, pp. 746–757. Acceptance Rate: 16%

[Download replication package] Please cite this paper if you find the implementation useful.

The true evolutionary lineage of code is often complex, subjective, and ambiguous. As such, existing techniques are predisposed to both overestimate and underestimate true evolutionary lineage. Our new technique addresses these issues by providing a more expressive model of code evolution — the fuzzy history graph — and a novel multi-revision code-history analysis — fuzzy history slicing — representing code lineage as a continuous (i.e., fuzzy) metric rather than a discrete (i.e., absolute) one. We found that fuzzy history slicing provides a tunable balance of precision and recall, an overall improvement in accuracy over existing code-evolution models, and improved accuracy in code-history analysis tasks.

### An Empirical Study of Activity, Popularity, Size, Testing, and Stability in Continuous Integration

Aakash Gautam, Saket Vishwasrao, Francisco Servant, “An Empirical Study of Activity, Popularity, Size, Testing, and Stability in Continuous Integration”. Proceedings of the 14th International Conference on Mining Software Repositories (MSR 2017), Mining Challenge Track, Buenos Aires, Argentina, May 2017, pp. 495–498.

A good understanding of the practices followed by software development projects can positively impact their success — particularly for attracting talent and on-boarding new members. In this paper, we perform a cluster analysis to classify software projects that follow continuous integration in terms of their activity, popularity, size, testing, and stability. Based on this analysis, we identify and discuss four different groups of repositories, each with distinct characteristics that separate it from the other groups. With this new understanding, we encourage open source projects to acknowledge and advertise their preferences according to these defining characteristics, so that they can recruit developers who share similar values.

### Supporting Bug Investigation using History Analysis

Francisco Servant, “Supporting Bug Investigation using History Analysis”. Proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering, Doctoral Symposium Track (ASE 2013), Silicon Valley, California, November 2013, pp. 754–757.

In this paper, I propose a line of research based on an automated technique to support bug investigation using a novel analysis of the history of the source code. During the bug-fixing process, developers spend a large amount of manual effort investigating a bug in order to answer a series of questions about it. This line of research aims to support developers in answering the following questions about a bug: Who is the most suitable developer to fix the bug? Where is the bug located? When was the bug inserted? And why was the bug inserted?

### Chronos: Visualizing Slices of Source-Code History

Francisco Servant, James A. Jones, “Chronos: Visualizing Slices of Source-Code History”. Proceedings of the 1st IEEE Working Conference on Software Visualization, Tool Track (VISSOFT 2013), Eindhoven, Netherlands, September 2013, pp. 1–4.

In this paper, we present Chronos, a tool that enables the querying, exploration, and discovery of historical change events to source code. Unlike traditional Revision-Control-System tools, Chronos allows queries across any subset of the code, down to the line level, which can be contiguous or disparate, even across multiple files. In addition, Chronos provides change history across all historical versions (i.e., it is not limited to a pairwise “diff”). Chronos supports pattern recognition and discovery, and a low-level view that supports semantic comprehension for tasks such as reverse engineering and identifying design rationale.

### History Slicing: Assisting Code-Evolution Tasks

Francisco Servant, James A. Jones, “History Slicing: Assisting Code-Evolution Tasks”. Proceedings of the 20th International Symposium on Foundations of Software Engineering (FSE 2012), Research Triangle Park, NC, USA, November 2012, pp. 43:1–43:11. Acceptance Rate: 16.9%

[Download implementation] Please cite this paper if you find the implementation useful.

Many software-engineering tasks require developers to understand the history and evolution of source code. In this paper, we present an approach called history slicing that can automatically identify a minimal number of code modifications, across any number of revisions, for any arbitrary segment of source code at fine granularity. We also present our implementation of history slicing, Chronos, which includes a novel visualization of the entire evolution of the code of interest. Finally, we provide two experiments showing that history slicing offered drastic improvements over conventional techniques in three ways: (1) the amount of information that developers needed to examine and trace was reduced by up to three orders of magnitude; (2) the correctness of developers attempting to solve software-maintenance tasks more than doubled; and (3) the time to complete these software-maintenance tasks was almost halved.
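To make the core idea concrete, here is a toy approximation of line-level history tracing using exact diffs (Python's difflib): given all versions of a file, it walks one line of the final version backwards and reports which revisions changed it. This is a simplification for illustration only; the paper's technique and Chronos operate on real revision histories at much larger scale.

```python
import difflib

def history_slice(versions, line_index):
    """Trace the line at `line_index` (an index into the final version)
    backwards through `versions` (oldest first, each a list of lines),
    returning the revisions that introduced or modified it.
    Toy sketch -- not the paper's algorithm."""
    touched = []
    idx = line_index
    # Walk backwards through consecutive version pairs.
    for rev in range(len(versions) - 1, 0, -1):
        old, new = versions[rev - 1], versions[rev]
        matcher = difflib.SequenceMatcher(a=old, b=new)
        mapped = None
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if j1 <= idx < j2:
                if tag == 'equal':
                    mapped = i1 + (idx - j1)  # unchanged; map to old index
                else:
                    touched.append(rev)       # line changed in this revision
                break
        if mapped is None:
            break  # line originated (or was rewritten) here; stop tracing
        idx = mapped
    return touched

versions = [
    ['a = 1', 'b = 2'],
    ['a = 1', 'b = 3'],           # revision 1 edits the second line
    ['a = 1', 'b = 3', 'c = 4'],  # revision 2 appends a line
]
print(history_slice(versions, 1))  # prints [1]
```

Even this toy version shows the payoff the abstract describes: a developer asking "when did this line change?" inspects only the revisions returned, rather than paging through every diff in the history.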

### WhoseFault: Automatic Developer-to-Fault Assignment through Fault Localization

Francisco Servant, James A. Jones, “WhoseFault: Automatic Developer-to-Fault Assignment Through Fault-Localization”. Proceedings of the 34th International Conference on Software Engineering (ICSE 2012), Zurich, Switzerland, June 2012, pp. 36–46. Acceptance Rate: 21%

This paper describes a new technique, which automatically selects the most appropriate developers for fixing the fault represented by a failing test case, and provides a diagnosis of where to look for the fault. To our knowledge, this technique is the first to assign developers to execution failures without the need for textual bug reports. Our results show that 81% of the time, WhoseFault ranked the developer who actually fixed the fault within its top three suggestions. We also show that our technique improved on a baseline technique by between 4% and 40%. Finally, we explore the influence of each of the three components of our technique on its results.

### History Slicing

Francisco Servant, James A. Jones, “History Slicing”. Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011), Lawrence, Kansas, USA, November 2011, pp. 452–455. Acceptance Rate: 37%

Multiple software engineering tasks, such as program understanding or assessment of developer expertise, benefit from using the history of source code. However, determining the evolution of a set of lines of code is a manual and time-consuming process. This paper presents a model of this process and an approach for automating it. We also present preliminary results showing initial evidence that our automated technique can be several orders of magnitude faster than a manual approach, and can require developers to examine up to two orders of magnitude less code when extracting such histories.

### CASI: Preventing Indirect Conflicts through a Live Visualization

Francisco Servant, James A. Jones, André van der Hoek, “CASI: Preventing Indirect Conflicts through a Live Visualization”. Proceedings of the Workshop on Cooperative and Human Aspects of Software Engineering (CHASE 2010), Cape Town, South Africa, May 2010, pp. 39–46.

Indirect conflicts are produced when multiple developers make parallel changes to source code entities that depend on each other. CASI informs developers in real time of the changes taking place in a software project and the source code entities influenced by them. It visualizes this influence together with directionality and severity information to help developers decide whether a particular situation represents an indirect conflict.

### POSTER: Lighthouse - A Coordination Platform Based on Emerging Design

Tiago Proenca, Francisco Servant, Nilmax Moura, André van der Hoek. Poster & Demo, ISR Research Forum 2009, Irvine, California, USA, June 2009.

Despite the fact that software development is an inherently collaborative activity, developers spend a great deal of their time working in isolation on their own parts of the system. In these situations, developers are unaware of parallel changes being made by others, often resulting in conflicts. One approach to dealing with this issue is conflict resolution: after changes have been checked in, developers must use tools to resolve conflicts and retest the code to ensure its correctness. Unfortunately, this process becomes more difficult the longer conflicts go undetected. To address these issues, we propose a conflict avoidance approach that helps find conflicts as soon as they occur. Lighthouse is an Eclipse plug-in that brings this approach to developers by utilizing a concept called emerging design, an up-to-date design representation of the code, alerting developers to potentially conflicting implementation changes as they occur.

## Service

### 2019

• PC member, International Conference on Software Engineering (ICSE), NIER Track

### 2018

• Reviewer, NSF Panel
• Reviewer, IEEE Software
• Reviewer, Journal of Systems and Software (JSS)
• Reviewer, IEEE Transactions on Software Engineering journal (TSE)
• PC member, International Symposium on the Foundations of Software Engineering (FSE), NIER Track
• PC member, International Conference on Software Maintenance and Evolution (ICSME), Artifacts track
• PC member, International Conference on Software Maintenance and Evolution (ICSME)
• PC member, International Conference on Program Comprehension (ICPC), Industry Track
• PC member, International Conference on Program Comprehension (ICPC), Tool-Demonstration Track

### 2017

• Reviewer, IEEE Transactions on Software Engineering journal (TSE)

### 2016

• Reviewer, FRQNT Panel
• Reviewer, IEEE Transactions on Software Engineering journal (TSE)

### 2015

• Reviewer, Journal of Internet Services and Applications (JISA)
• Reviewer, Journal of Systems and Software (JSS)

### 2014

• External reviewer, International Symposium on the Foundations of Software Engineering (FSE)
• External reviewer, International Conference on Software Engineering (ICSE)

### 2013

• Reviewer, Central European Journal of Computer Science (CEJCS)
• External reviewer, International Conference on Software Engineering (ICSE)
• External reviewer, Working Conference on Software Visualization, Tool Track (VISSOFT)
• External reviewer, Working Conference on Software Visualization, NIER Track (VISSOFT)

### 2012

• PC member, International Conference on Program Comprehension (ICPC), Tool-Demonstration Track
• PC member, International Working Conference on Mining Software Repositories (MSR), Mining Challenge Track
• External reviewer, International Conference on Software Engineering (ICSE)