TY - JOUR
T1 - A Study of Call Graph Construction for JVM-Hosted Languages
AU - Ali, Karim
AU - Lai, Xiaoni
AU - Luo, Zhaoyi
AU - Lhotak, Ondrej
AU - Dolby, Julian
AU - Tip, Frank
N1 - Publisher Copyright: © 1976-2012 IEEE.
PY - 2021/12/1
Y1 - 2021/12/1
AB - Call graphs have many applications in software engineering, including bug-finding, security analysis, and code navigation in IDEs. However, the construction of call graphs requires significant investment in program analysis infrastructure. An increasing number of programming languages compile to the Java Virtual Machine (JVM), and program analysis frameworks such as WALA and SOOT support a broad range of program analysis algorithms by analyzing JVM bytecode. This approach has been shown to work well when applied to bytecode produced from Java code. In this paper, we show that it also works well for diverse other JVM-hosted languages: dynamically-typed functional Scheme, statically-typed object-oriented Scala, and polymorphic functional OCaml. Effectively, we get call graph construction for these languages for free, using existing analysis infrastructure for Java, with only minor challenges to soundness. This, in turn, suggests that bytecode-based analysis could serve as an implementation vehicle for bug-finding, security analysis, and IDE features for these languages. We present qualitative and quantitative analyses of the soundness and precision of call graphs constructed from JVM bytecodes for these languages, and also for Groovy, Clojure, Python, and Ruby. However, we also show that implementation details matter greatly. In particular, the JVM-hosted implementations of Groovy, Clojure, Python, and Ruby produce very unsound call graphs, due to the pervasive use of reflection, invokedynamic instructions, and run-time code generation. Interestingly, the dynamic translation schemes employed by these languages, which result in unsound static call graphs, tend to be correlated with poor performance at run time.
KW - Call graphs
KW - Clojure
KW - Groovy
KW - JVM
KW - OCaml
KW - Python
KW - Ruby
KW - Scala
KW - Scheme
KW - compilation
KW - static analysis
UR - http://www.scopus.com/inward/record.url?scp=85121686568&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85121686568&partnerID=8YFLogxK
U2 - 10.1109/TSE.2019.2956925
DO - 10.1109/TSE.2019.2956925
M3 - Article
AN - SCOPUS:85121686568
SN - 0098-5589
VL - 47
SP - 2644
EP - 2666
JO - IEEE Transactions on Software Engineering
JF - IEEE Transactions on Software Engineering
IS - 12
ER -