TY - GEN
T1 - Towards integrating workflow and database provenance
AU - Chirigati, Fernando
AU - Freire, Juliana
PY - 2012
Y1 - 2012
N2 - While there has been substantial work on both database and workflow provenance, the two problems have only been examined in isolation. It is widely accepted that the existing models are incompatible. Database provenance is fine-grained and captures changes to tuples in a database. In contrast, workflow provenance is represented at a coarser level and reflects the functional model of workflow systems, which is stateless: each computational step derives a new artifact. In this paper, we propose a new approach to combine database and workflow provenance. We address the mismatch between the different kinds of provenance by using a temporal model which explicitly represents the database states as updates are applied. We discuss how, under this model, reproducibility is obtained for workflows that manipulate databases, and how different queries that straddle the two provenance traces can be evaluated. We also describe a proof-of-concept implementation that integrates a workflow system and a commercial relational database.
AB - While there has been substantial work on both database and workflow provenance, the two problems have only been examined in isolation. It is widely accepted that the existing models are incompatible. Database provenance is fine-grained and captures changes to tuples in a database. In contrast, workflow provenance is represented at a coarser level and reflects the functional model of workflow systems, which is stateless: each computational step derives a new artifact. In this paper, we propose a new approach to combine database and workflow provenance. We address the mismatch between the different kinds of provenance by using a temporal model which explicitly represents the database states as updates are applied. We discuss how, under this model, reproducibility is obtained for workflows that manipulate databases, and how different queries that straddle the two provenance traces can be evaluated. We also describe a proof-of-concept implementation that integrates a workflow system and a commercial relational database.
KW - Database Provenance
KW - Reproducibility
KW - Workflow Provenance
UR - http://www.scopus.com/inward/record.url?scp=84868294169&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84868294169&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-34222-6_2
DO - 10.1007/978-3-642-34222-6_2
M3 - Conference contribution
AN - SCOPUS:84868294169
SN - 9783642342219
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 11
EP - 23
BT - Provenance and Annotation of Data and Processes - 4th International Provenance and Annotation Workshop, IPAW 2012, Revised Selected Papers
T2 - 4th International Provenance and Annotation Workshop, IPAW 2012
Y2 - 19 June 2012 through 21 June 2012
ER -