TY - GEN
T1 - LAVA: Large-scale Automated Vulnerability Addition
T2 - 2016 IEEE Symposium on Security and Privacy, SP 2016
AU - Dolan-Gavitt, Brendan
AU - Hulin, Patrick
AU - Kirda, Engin
AU - Leek, Tim
AU - Mambretti, Andrea
AU - Robertson, Wil
AU - Ulrich, Frederick
AU - Whelan, Ryan
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/8/16
Y1 - 2016/8/16
N2 - Work on automating vulnerability discovery has long been hampered by a shortage of ground-truth corpora with which to evaluate tools and techniques. This lack of ground truth prevents authors and users of tools alike from being able to measure such fundamental quantities as miss and false alarm rates. In this paper, we present LAVA, a novel dynamic taint analysis-based technique for producing ground-truth corpora by quickly and automatically injecting large numbers of realistic bugs into program source code. Every LAVA bug is accompanied by an input that triggers it whereas normal inputs are extremely unlikely to do so. These vulnerabilities are synthetic but, we argue, still realistic, in the sense that they are embedded deep within programs and are triggered by real inputs. Using LAVA, we have injected thousands of bugs into eight real-world programs, including bash, tshark, and the GNU coreutils. In a preliminary evaluation, we found that a prominent fuzzer and a symbolic execution-based bug finder were able to locate some but not all LAVA-injected bugs, and that interesting patterns and pathologies were already apparent in their performance. Our work forms the basis of an approach for generating large ground-truth vulnerability corpora on demand, enabling rigorous tool evaluation and providing a high-quality target for tool developers.
UR - http://www.scopus.com/inward/record.url?scp=84987615823&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84987615823&partnerID=8YFLogxK
U2 - 10.1109/SP.2016.15
DO - 10.1109/SP.2016.15
M3 - Conference contribution
AN - SCOPUS:84987615823
T3 - Proceedings - 2016 IEEE Symposium on Security and Privacy, SP 2016
SP - 110
EP - 121
BT - Proceedings - 2016 IEEE Symposium on Security and Privacy, SP 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 May 2016 through 25 May 2016
ER -