About me

I'm a scientific computer scientist. I like to solve interesting problems with software. My principal interests lie in program correctness (bugs!) and programmability.

From 2012 to 2015 I was a full-time software tester at IBM. I worked on IBM's General Parallel File System (GPFS), now rebranded as IBM Spectrum Scale. I focused on the ways in which a parallel file system can fail, with an emphasis on data validation. I'm still employed by IBM and return for summer work.

In the fall of 2015 I started working on a PhD in Computer Science at Virginia Tech. My advisor is Dr. Dongyoon Lee, and we study a variety of systems and security problems, most recently those in event-driven systems like Node.js applications.

Research


Research interests


During my years as a software tester it became increasingly clear to me that the struggles of developers to write correct code are important. My research interests revolve around the core issues underlying software development:

  • "It's hard to write software": the creation of tools that help novices and experts alike write programs.
  • "It's hard to write correct software": the creation of tools to automatically identify correctness issues in programs, e.g. data races.
  • "It's hard to write secure software": the creation of tools to automatically identify security flaws in programs.

Projects


Node.fz


In Node.fz, my first research project, I study bugs in event-driven programming environments, with a focus on the Node.js server-side JavaScript framework. There are two main classes of Node.js developers: "web guys" without a lot of experience on the server side, and "systems guys" without a lot of experience in JavaScript. Members of both classes struggle to write correct Node.js programs due to their knowledge gaps. The project covers bug patterns in Node.js modules and applications, and tools to help Node.js developers improve the quality of their code. We have a paper accepted to EuroSys 2017, link forthcoming.


Adventures in Linux Scheduling


In Spring 2017 I took Dr. Pierre Olivier's course in Linux kernel programming. I worked on a project to deploy an alternative scheduler with Xinwei Fu and Jingoo Han. It was a hierarchical multi-level feedback queue, and our feedback mechanism was so fine-grained that the system wasn't particularly usable. You can amuse yourself with our report here.

Cloud computing comparison


In Spring 2016 I took Dr. Ali Butt's course in cloud computing. I worked on a project to compare cloud service providers (AWS, Google, and Azure) with Uday Ananth and Ayaan Kazerouni. We performed a qualitative study of usability, reliability, and customer service, and a quantitative study of node performance. You can read our report here.

Invention


Detection of File Corruption in a Distributed File System


While at IBM, a colleague and I filed a US patent on the detection of file corruption in a distributed file system. As software testers, we were responsible for identifying everything that was wrong with GPFS, across the entire scope of the product. We opened defects against poor command-line interfaces, inadequate error messages, inappropriate syslog verbosity, poor performance, and everything else you can imagine. Our favorite defects, however, related to correctness errors.


When you write data ABC into a file system, you expect to read data ABC back. In a file system as complex as GPFS, however, it's not uncommon early on in the test cycle to get something else instead. You might get AB, in which case the file system has inappropriately truncated your data. You might get ABC old data, in which case the file system probably left "old" data in the file and failed to wipe it properly before giving it to you. You might even get AXC, in which case a write X to another file has mysteriously ended up in your file. In all of these cases it's appropriate to open a defect under the (highly-visible) category of Silent Data Corruption, and once reported these defects are addressed promptly; IBM is not in the habit of releasing code that corrupts its customers' data.
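The three symptoms above can be told apart mechanically when the test program still holds the expected bytes. The sketch below is only an illustration of those symptoms, not the design in our patent; the function name `classify_mismatch` and its labels are made up for this example, and it assumes the whole file fits in memory.

```python
def classify_mismatch(expected: bytes, actual: bytes) -> str:
    """Label a read-back mismatch with the symptom it most resembles.

    Hypothetical helper for illustration; the labels are not GPFS terminology.
    """
    if actual == expected:
        return "ok"
    # "AB" instead of "ABC": the tail of the data is missing.
    if len(actual) < len(expected) and expected.startswith(actual):
        return "truncated"
    # "ABC old data" instead of "ABC": extra bytes follow the expected data.
    if len(actual) > len(expected) and actual.startswith(expected):
        return "trailing stale data"
    # "AXC" instead of "ABC": same length, but some region holds other bytes.
    if len(actual) == len(expected):
        return "overwritten region"
    return "unclassified"


if __name__ == "__main__":
    for read_back in (b"ABC", b"AB", b"ABC old data", b"AXC"):
        print(read_back, "->", classify_mismatch(b"ABC", read_back))
```

Keeping a full copy of the expected data is only practical for small files, which is one reason test programs usually fall back on something more compact.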


Typically, test programs rely on a checksum to detect file corruption. A checksum is a hash of a file's contents, typically of a fixed length. You can compute checksums in a few different ways, including the coreutils sum, cksum, md5sum, sha1sum, and sha512sum. You might notice a theme in the names. Suppose that you have a 10MB file and you record its cksum every time you make modifications to it. When you read it, you re-calculate the cksum and compare it to the recorded cksum. If there's a mismatch, you've detected (a bug in your test program, or) corruption. Unfortunately, developers need a lot more information than just knowing that "the file content isn't right!" in order to debug a problem like this.
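The record-and-compare loop described above might look like the following minimal sketch. It uses Python's hashlib with SHA-512 (the same digest sha512sum prints) rather than cksum, and the helper names `file_checksum` and `verify` are assumptions made for this example, not part of any GPFS test suite.

```python
import hashlib


def file_checksum(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large file never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, recorded):
    """Re-calculate the checksum and compare it to the recorded one."""
    return file_checksum(path) == recorded
```

A test program would call `file_checksum` after each modification and `verify` after each read. Note that `verify` returns only a boolean: it reports that the content changed, but nothing about where in the file or in what way.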


Once the patent is approved, I'll be happy to fill in the details of our design. Until then, enjoy pondering the problem.

Blog

Like all good graduate students, I maintain a blog on matters technical and personal.

You can visit it here.

Guide for new students

I'm working on a guide for new students.

Systems Reading Group

I run the CS department's Systems Reading Group. We meet weekly over pastries to discuss papers and practice talks.

Education

2015-?

PhD Computer Science (in progress)

Virginia Tech

2012

B.S. Computer Science

Clarkson University

2012

B.S. Mathematics

Clarkson University

Awards

2017

Pratt Fellowship

Departmental Award

Virginia Tech

2014

New Hire of the Month Award

For outstanding contributions to the team

IBM Poughkeepsie, NY

2012

Clarkson Award

Two outstanding graduating students university-wide

Clarkson University

Contact


  • Jamie Davis
  • Knowledge Works II
  • 2202 Kraft Dr.
  • Blacksburg, VA 24060

  • davisjam [AT] vt [DOT] edu