TY - GEN
T1 - BugDoc
T2 - 2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020
AU - Lourenço, Raoni
AU - Freire, Juliana
AU - Shasha, Dennis
N1 - Publisher Copyright: © 2020 Association for Computing Machinery.
PY - 2020/6/14
Y1 - 2020/6/14
AB - Data analysis for scientific experiments and enterprises, large-scale simulations, and machine learning tasks all entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous outputs, the pipeline may fail to execute or produce incorrect results. Inferring the root cause(s) of such failures is challenging, usually requiring time and much human thought, while still being error-prone. We propose a new approach that makes use of iteration and provenance to automatically infer the root causes and derive succinct explanations of failures. Through a detailed experimental evaluation, we assess the cost, precision, and recall of our approach compared to the state of the art. Our experimental data and processing software is available for use, reproducibility, and enhancement.
KW - provenance
KW - workflow debugging
UR - http://www.scopus.com/inward/record.url?scp=85086247889&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85086247889&partnerID=8YFLogxK
U2 - 10.1145/3318464.3389763
DO - 10.1145/3318464.3389763
M3 - Conference contribution
AN - SCOPUS:85086247889
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 463
EP - 478
BT - SIGMOD 2020 - Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
PB - Association for Computing Machinery
Y2 - 14 June 2020 through 19 June 2020
ER -