Fine-grained provenance collection over scripts through program slicing

João Felipe Pimentel, Juliana Freire, Leonardo Murta, Vanessa Braganholo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Collecting provenance from scripts is often useful for scientists to explain and reproduce their scientific experiments. However, most existing automatic approaches capture provenance at a coarse grain, for example, the trace of user-defined functions. These approaches lack information about variable dependencies. Without this information, users may struggle to identify which functions actually influenced the results, leading to the creation of false-positive provenance links. To address this problem, we propose an approach that uses dynamic program slicing to gather provenance from Python scripts. By capturing dependencies among variables, it is possible to expose execution paths inside functions and, consequently, to create a provenance graph that accurately represents the function activations and the results they affect.
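The difference between coarse-grained and slice-based provenance can be illustrated with a minimal sketch. This is not the authors' implementation: the trace below is hand-written and hypothetical (a real tool would record it while the script runs), and the names `TRACE`, `backward_slice`, and the activation labels are assumptions made for illustration only. Each trace entry records one executed assignment, the variables it read, and the function activation it belongs to.

```python
# Hypothetical execution trace, in execution order. Each entry is
# (assigned variable, variables read, function activation that ran it).
TRACE = [
    ("raw", [], "load"),
    ("clean", ["raw"], "preprocess"),
    ("log_msg", ["raw"], "describe"),
    ("result", ["clean"], "analyze"),
]


def backward_slice(trace, target):
    """Return, in execution order, the activations that influenced `target`."""
    wanted = {target}  # variables whose defining assignment we still need
    activations = []
    # Walk the trace backward so each variable resolves to its most
    # recent definition before the point where it was used.
    for var, reads, activation in reversed(trace):
        if var in wanted:
            wanted.discard(var)
            wanted.update(reads)
            activations.append(activation)
    return list(reversed(activations))


# Coarse-grained view: every activation that ran is linked to the result.
coarse = [activation for _, _, activation in TRACE]

# Slice-based view: only activations on a dependency path to the result.
fine = backward_slice(TRACE, "result")

print("coarse:", coarse)
print("sliced:", fine)
```

In this sketch the coarse-grained view links `describe` to `result` even though no value it produced reaches `result`, which is the kind of false-positive link the abstract describes; the backward slice omits it.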

Original language: English (US)
Title of host publication: Provenance and Annotation of Data and Processes - 6th International Provenance and Annotation Workshop, IPAW 2016, Proceedings
Editors: Boris Glavic, Marta Mattoso
Publisher: Springer Verlag
Pages: 199-203
Number of pages: 5
ISBN (Print): 9783319405926
State: Published - 2016
Event: 6th International Provenance and Annotation Workshop, IPAW 2016 - McLean, United States
Duration: Jun 7 2016 to Jun 8 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9672
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 6th International Provenance and Annotation Workshop, IPAW 2016
Country/Territory: United States
City: McLean
Period: 6/7/16 to 6/8/16

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
