ReproZip: Using provenance to support computational reproducibility

Fernando Chirigati, Dennis Shasha, Juliana Freire

Research output: Contribution to conference › Paper › peer-review

Abstract

We describe ReproZip, a tool that makes it easier for authors to publish reproducible results and for reviewers to validate these results. By tracking operating system calls, ReproZip systematically captures detailed provenance of existing experiments, including data dependencies, libraries used, and configuration parameters. This information is combined into a package that can be installed and run in a different environment. Usability is an important goal for ReproZip. Besides simplifying the creation of reproducible results, the system also helps reviewers. Because the package is self-contained, reviewers need not install any additional software to run the experiments. In addition, ReproZip generates a workflow specification for the experiment. Reviewers can execute this specification within a workflow system to explore the experiment and try different configurations, and the provenance kept by the workflow system can facilitate communication between reviewers and authors.

Original language: English (US)
State: Published - 2013
Event: 5th Workshop on the Theory and Practice of Provenance, TaPP 2013 - Lombard, United States
Duration: Apr 2 2013 - Apr 3 2013

Conference

Conference: 5th Workshop on the Theory and Practice of Provenance, TaPP 2013
Country: United States
City: Lombard
Period: 4/2/13 - 4/3/13

ASJC Scopus subject areas

  • Computer Science (all)
