Towards integrating workflow and database provenance

Fernando Chirigati, Juliana Freire

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

While there has been substantial work on both database and workflow provenance, the two problems have only been examined in isolation. It is widely accepted that the existing models are incompatible. Database provenance is fine-grained and captures changes to tuples in a database. In contrast, workflow provenance is represented at a coarser level and reflects the functional model of workflow systems, which is stateless: each computational step derives a new artifact. In this paper, we propose a new approach to combine database and workflow provenance. We address the mismatch between the different kinds of provenance by using a temporal model which explicitly represents the database states as updates are applied. We discuss how, under this model, reproducibility is obtained for workflows that manipulate databases, and how different queries that straddle the two provenance traces can be evaluated. We also describe a proof-of-concept implementation that integrates a workflow system and a commercial relational database.
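
The idea of a temporal model that keeps each database state as updates are applied can be illustrated with a small sketch. This is not the paper's implementation: the names `VersionedTable`, `StepRecord`, `apply_update`, and `snapshot` are hypothetical, and an in-memory dictionary stands in for the relational database. It only shows, under those assumptions, how tagging a workflow step with the state it read might make the step replayable after later updates.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class VersionedTable:
    """Toy table (hypothetical) that keeps every state instead of overwriting in place."""
    # states[i] holds the table contents at state id i; state 0 is the empty table.
    states: List[Dict[str, Any]] = field(default_factory=lambda: [{}])

    def apply_update(self, key: str, value: Any) -> int:
        """Apply an update to the latest state and return the id of the new state."""
        new_state = dict(self.states[-1])  # copy, so earlier states stay intact
        new_state[key] = value
        self.states.append(new_state)
        return len(self.states) - 1

    def snapshot(self, version: int) -> Dict[str, Any]:
        """Return a copy of the table contents as of the given state id."""
        return dict(self.states[version])


@dataclass
class StepRecord:
    """Workflow-level provenance for one step, tagged with database state ids."""
    step: str
    read_version: int
    written_version: Optional[int]
    output: Any


table = VersionedTable()
trace: List[StepRecord] = []

v1 = table.apply_update("sample_1", 10)
v2 = table.apply_update("sample_2", 32)

# A read-only workflow step: record which database state it saw.
total = sum(table.snapshot(v2).values())
trace.append(StepRecord("sum_values", read_version=v2, written_version=None, output=total))

# A later update creates a new state and leaves the recorded one untouched.
v3 = table.apply_update("sample_1", 99)

# Replaying the step against its recorded state reproduces the original output.
rec = trace[0]
replayed = sum(table.snapshot(rec.read_version).values())
```

In this sketch, a query that spans both traces, such as "which tuples did this artifact depend on?", reduces to looking up `snapshot(rec.read_version)` for the step that produced the artifact.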

Original language: English (US)
Title of host publication: Provenance and Annotation of Data and Processes - 4th International Provenance and Annotation Workshop, IPAW 2012, Revised Selected Papers
Pages: 11-23
Number of pages: 13
State: Published - 2012
Event: 4th International Provenance and Annotation Workshop, IPAW 2012 - Santa Barbara, CA, United States
Duration: Jun 19 2012 to Jun 21 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7525 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 4th International Provenance and Annotation Workshop, IPAW 2012
Country: United States
City: Santa Barbara, CA
Period: 6/19/12 to 6/21/12

Keywords

  • Database Provenance
  • Reproducibility
  • Workflow Provenance

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
