Paper on Artifact Evaluations accepted at FSE 2020

To foster replicable research, many conferences encourage the submission of ‘research artifacts’ along with papers. An artifact is essentially anything that contributed to generating the results presented in a research article. Submitted artifacts are evaluated by dedicated artifact evaluation committees (AECs). I’ve had the pleasure of serving on such AECs myself (ISSTA 2016 and ISSTA 2018), and we also received a positive evaluation for our artifact at ISSTA 2019.

One thing that I’ve been struggling with a bit is what makes a good artifact, and I’ve had long and interesting discussions on this topic with Ben Hermann, who co-chaired the ISSTA 2018 AEC. We realized that our views on artifact quality differed in subtle nuances, which led us to the question of whether perceptions of artifact quality generally differ across AEC members, and what the causes and possible impact of such differing perceptions are.

With the help of Janet Siegmund, who is an expert in qualitative surveys and a fantastic discussion partner when it comes to research methodology in general, we designed a questionnaire and invited past members of AECs at software engineering and programming language conferences to tell us about their perceptions of artifact purposes and quality.

The paper that discusses the results from this survey has just been accepted at FSE 2020, which is the best venue for it that I can imagine. FSE pioneered artifact evaluations in the software engineering community and has been conducting these evaluations for almost a decade.

We thank all the anonymous participants of our study, the anonymous reviewers of our paper, and the AEC members that currently evaluate our research artifact. We hope our paper contributes to the continuous improvement of artifact evaluations and replicable research in general. A preprint of our paper and its research artifact are available here.

TraceSanitizer Paper at DSN 2020

Our paper on sanitizing execution traces from effects of benign execution non-determinism has been accepted at DSN’20.

The paper addresses a problem we frequently came across in Error Propagation Analyses (EPA) using fault injections. In EPA, execution traces are commonly used as an auxiliary oracle: execution traces obtained under fault injection are compared to execution traces from fault-free runs, and if they deviate, that is an indicator of error propagation (for a more detailed discussion of this usage, please see our ASE’17 paper on TrEKer).
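
To make this oracle concrete, here is a minimal Python sketch (the trace format and names are illustrative, not the TrEKer or TraceSanitizer API): a run’s trace is a list of executed-instruction records, and any deviation from the fault-free “golden” trace is flagged as potential error propagation.

```python
def traces_deviate(golden_trace, fi_trace):
    """Return True if the fault-injection trace deviates from the golden run."""
    if len(golden_trace) != len(fi_trace):
        return True
    return any(g != f for g, f in zip(golden_trace, fi_trace))

# Usage: each record could be (thread_id, instruction_id, operand_value).
golden = [(0, "load@12", 42), (0, "store@13", 42)]
faulty = [(0, "load@12", 42), (0, "store@13", 7)]   # corrupted operand
assert traces_deviate(golden, faulty)
```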

This trace comparison does not work reliably under benign execution non-determinism, i.e., when operating systems (OSs), run-times, and libraries have the freedom to alter program execution in order to achieve better performance, as long as doing so does not affect the outcome of the execution. A prominent example of this is thread scheduling: assuming the program is race-free, it does not matter which thread is scheduled for execution at which point in time. The outcome is the same, and the OS can, for instance, prevent threads that are waiting for I/O from blocking the CPU.
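
The following hypothetical snippet illustrates the issue: two fault-free runs of the same race-free program can yield differently interleaved traces, even though the computed result is identical.

```python
import threading

def run_once():
    """Run two worker threads and record a simple event trace."""
    trace, lock = [], threading.Lock()

    def worker(tid):
        for i in range(3):
            with lock:                  # race-free: accesses to trace are ordered
                trace.append((tid, i))  # but the interleaving may vary per run

    threads = [threading.Thread(target=worker, args=(t,)) for t in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return trace

# Two consecutive fault-free runs may produce different event orders,
# so a naive trace comparison would report a spurious deviation.
print(run_once())
print(run_once())
```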

The problem this causes for EPA is that execution traces can deviate even if there is no effect from a fault. Even in consecutive fault-free runs there will be deviations, because the execution order of instructions from different threads can differ. In our paper, we solve this problem for an important class of programs that we term pseudo-deterministic, in which conflicting accesses to shared data (accesses from different threads, at least one of which is a write) always occur in the same order. An earlier approach to the same problem, which we presented at ICST’17, is applicable to a wider range of programs, but (contrary to TraceSanitizer) may lead to false positives in EPA.
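
As an illustration of what pseudo-determinism demands (a simplified sketch, not the paper’s formal definition or check), one can project traces onto per-location access sequences and require these to match across runs, which in particular fixes the order of all conflicting accesses:

```python
def shared_access_order(trace):
    """Per-location access sequences for locations that are written to.

    A trace is assumed to be a list of (thread_id, op, address) records,
    with op being 'R' (read) or 'W' (write).
    """
    seq = {}
    for tid, op, addr in trace:
        seq.setdefault(addr, []).append((tid, op))
    # Only locations with at least one write can host conflicting accesses.
    return {a: s for a, s in seq.items() if any(op == 'W' for _, op in s)}

def looks_pseudo_deterministic(trace_a, trace_b):
    # Conservative over-approximation: identical per-location access order
    # implies that conflicting accesses (different threads, at least one
    # write) occur in the same order in both runs.
    return shared_access_order(trace_a) == shared_access_order(trace_b)
```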

To decide whether a program is pseudo-deterministic, so that TraceSanitizer can be applied, we introduce an automated check based on SMT-solver-supported maximal causal reasoning over fault-free traces. If the check passes, we sanitize the execution traces (from fault-free runs and fault injections) by eliminating the effects of both non-deterministic thread scheduling and dynamic memory allocation.
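
A hypothetical sanitization pass in this spirit could look as follows (the actual TraceSanitizer operates on LLVM-level traces produced by LLFI; the trace format and names here are made up): concrete heap addresses are replaced by per-thread allocation-order IDs, and events are grouped per thread, so that benign differences in scheduling and allocation no longer cause trace deviations.

```python
def sanitize(trace):
    """Sanitize a trace given as (thread_id, op, address) records.

    op is 'alloc', 'R', or 'W'; addresses stem from dynamic allocation.
    """
    addr_ids = {}    # concrete address -> abstract allocation ID
    counters = {}    # per-thread allocation counters
    per_thread = {}
    for tid, op, addr in trace:
        if op == 'alloc':
            n = counters.get(tid, 0)
            counters[tid] = n + 1
            addr_ids[addr] = f"t{tid}_alloc{n}"  # allocation order per thread
        per_thread.setdefault(tid, []).append((op, addr_ids.get(addr, addr)))
    # Group events per thread: scheduling order across threads is dropped.
    return {tid: tuple(evts) for tid, evts in sorted(per_thread.items())}

# Two runs with different schedules and heap addresses sanitize identically.
run1 = [(0, 'alloc', 0x7f01), (0, 'W', 0x7f01), (1, 'alloc', 0x8a10), (1, 'W', 0x8a10)]
run2 = [(1, 'alloc', 0x6c20), (1, 'W', 0x6c20), (0, 'alloc', 0x9b02), (0, 'W', 0x9b02)]
assert sanitize(run1) == sanitize(run2)
```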

The paper can be found here. The TraceSanitizer prototype implementation, which has been developed for LLFI, is available on GitHub.

ISSRE 2020

I have the pleasure of serving on the ISSRE PC again this year. Please submit some great papers for us to review! 🙂 Here are the dates for the research and industry tracks:

Research Track

  • Abstract submission deadline: May 18th
  • Full paper submission deadline: May 25th
  • Author rebuttal period: July 16th-18th
  • Notification to authors: July 30th
  • Camera-ready papers: August 20th

Industry Track

  • Deadline for contributions (full and short papers): July 20th
  • Notification to authors: August 21st
  • Camera-ready papers: August 28th

For more information, please refer to http://2020.issre.net/

Paper on Performance Bugs at ISSRE 2019

Yesterday we presented our paper “Inferring Performance Bug Patterns from Developer Commits” at ISSRE 2019 in Berlin.

“Performance bugs” denote unnecessarily slow-running code and have recently been the focus of many research articles in the software engineering community. In our paper, we report on a manual investigation and classification of 733 performance-bug-inducing developer commits from 13 popular open source projects written in C or C++. In total, we extract 7 common performance bug patterns from these commits and analyze their differences in terms of bug complexity by looking at how long it takes until the bugs are found and fixed, the seniority of the fixing developer, and how many lines the bug-fixing commit comprises.
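
For illustration, the complexity metrics can be thought of along the following lines (a hypothetical sketch; our actual analysis pipeline and data fields differ): given a bug-inducing and a bug-fixing commit, one can derive the time to fix, the fixing developer’s seniority at fix time, and the size of the fixing commit.

```python
from datetime import datetime

def complexity_metrics(inducing, fixing, first_commit_by_author):
    """Derive simple bug-complexity indicators from two commit records."""
    time_to_fix = fixing["date"] - inducing["date"]
    seniority = fixing["date"] - first_commit_by_author[fixing["author"]]
    return {
        "time_to_fix_days": time_to_fix.days,     # how long the bug survived
        "fixer_seniority_days": seniority.days,   # experience of the fixer
        "fix_size_loc": fixing["lines_changed"],  # size of the fixing commit
    }

# Example with made-up commit data:
inducing = {"date": datetime(2017, 3, 1), "author": "alice", "lines_changed": 12}
fixing = {"date": datetime(2018, 6, 5), "author": "bob", "lines_changed": 4}
first_commit = {"bob": datetime(2015, 1, 10)}
print(complexity_metrics(inducing, fixing, first_commit))
```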

We hope that the results from our study will benefit those who are working on performance bug detection and localization techniques. An author copy of our paper can be found here.

The data our study is based on is publicly available on the project’s GitLab site. We appreciate any comments on this work and are happy to support further usage and enhancements of this data set.