TY - JOUR
T1 - BugDoc
T2 - Iterative debugging and explanation of pipeline
AU - Lourenço, Raoni
AU - Freire, Juliana
AU - Simon, Eric
AU - Weber, Gabriel
AU - Shasha, Dennis
N1 - Funding Information:
We thank the Data X-Ray and Explanation Tables authors for sharing their code with us. We are also grateful to Fernando Chirigati, Neel Dey, and Peter Bailis for providing the real-world pipelines. This work has been supported in part by NSF grants IIS-1916505, IIS-2106888, IOS-1339362, MCB-1158273, MCB-1412232, and OAC-1934464; CNPq (Brazil) grant 209623/2014-4; the DARPA D3M program; and NYU WIRELESS. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of funding agencies.
Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2022
Y1 - 2022
N2 - Applications in domains ranging from large-scale simulations in astrophysics and biology to enterprise analytics rely on computational pipelines. A pipeline consists of modules and their associated parameters, data inputs, and outputs, which are orchestrated to produce a set of results. If some modules derive unexpected outputs, the pipeline can crash or lead to incorrect results. Debugging these pipelines is difficult since there are many potential sources of errors, including bugs in the code, input data, software updates, and improper parameter settings. We present BugDoc, a system that automatically infers the root causes and derives succinct explanations of failures for black-box pipelines. BugDoc does so by using provenance from previous runs of a given pipeline to derive hypotheses for the errors, and then iteratively runs new pipeline configurations to test these hypotheses. Besides identifying issues associated with computational modules in a pipeline, we also propose methods for: “opportunistic group testing” to identify portions of data inputs that might be responsible for failed executions (what we call), helping users narrow down the cause of failure; and “selective instrumentation” to determine nodes in pipelines that should be instrumented to improve efficiency and reduce the number of iterations to test. Through a case study of deployed workflows at a software company and an experimental evaluation using synthetic pipelines, we assess the effectiveness of BugDoc and show that it requires fewer iterations to derive root causes and/or achieves higher-quality results than previous approaches.
AB - Applications in domains ranging from large-scale simulations in astrophysics and biology to enterprise analytics rely on computational pipelines. A pipeline consists of modules and their associated parameters, data inputs, and outputs, which are orchestrated to produce a set of results. If some modules derive unexpected outputs, the pipeline can crash or lead to incorrect results. Debugging these pipelines is difficult since there are many potential sources of errors, including bugs in the code, input data, software updates, and improper parameter settings. We present BugDoc, a system that automatically infers the root causes and derives succinct explanations of failures for black-box pipelines. BugDoc does so by using provenance from previous runs of a given pipeline to derive hypotheses for the errors, and then iteratively runs new pipeline configurations to test these hypotheses. Besides identifying issues associated with computational modules in a pipeline, we also propose methods for: “opportunistic group testing” to identify portions of data inputs that might be responsible for failed executions (what we call), helping users narrow down the cause of failure; and “selective instrumentation” to determine nodes in pipelines that should be instrumented to improve efficiency and reduce the number of iterations to test. Through a case study of deployed workflows at a software company and an experimental evaluation using synthetic pipelines, we assess the effectiveness of BugDoc and show that it requires fewer iterations to derive root causes and/or achieves higher-quality results than previous approaches.
UR - http://www.scopus.com/inward/record.url?scp=85125067343&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125067343&partnerID=8YFLogxK
U2 - 10.1007/s00778-022-00733-5
DO - 10.1007/s00778-022-00733-5
M3 - Article
AN - SCOPUS:85125067343
JO - VLDB Journal
JF - VLDB Journal
SN - 1066-8888
ER -