TY - GEN
T1 - Investigating order information in API-usage patterns
T2 - 13th International Conference on Software Technologies, ICSOFT 2018
AU - Çergani, Ervina
AU - Proksch, Sebastian
AU - Nadi, Sarah
AU - Mezini, Mira
N1 - Publisher Copyright: Copyright © 2018 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved.
PY - 2019
Y1 - 2019
N2 - Many approaches have been proposed for learning Application Programming Interface (API) usage patterns from code repositories. Depending on the underlying technique, the mined patterns may (1) be strictly sequential, (2) consider partial order between method calls, or (3) not consider order information. Understanding the trade-offs between these pattern types with respect to real code is important in many applications (e.g., code recommendation or misuse detection). In this work, we present a benchmark consisting of an episode mining algorithm that can be configured to learn all three types of patterns mentioned above. Running our benchmark on an existing dataset of 360 C# code repositories, we empirically study the resulting API usage patterns per pattern type. Our results show practical evidence that not only do partial-order patterns represent a generalized superset of sequential-order patterns, but partial-order mining also finds additional patterns missed by sequence mining, which are used by a larger number of developers across code repositories. Additionally, our study empirically quantifies the importance of the order information encoded in sequential and partial-order patterns for representing correct co-occurrences of code elements in real code. Furthermore, our benchmark can be used by other researchers to explore additional properties of API patterns.
KW - API Usage Pattern Types
KW - Benchmark
KW - Code Repositories
KW - Empirical Evaluation
KW - Events Mining
UR - http://www.scopus.com/inward/record.url?scp=85071441294&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071441294&partnerID=8YFLogxK
U2 - 10.5220/0006839000570068
DO - 10.5220/0006839000570068
M3 - Conference contribution
AN - SCOPUS:85071441294
T3 - ICSOFT 2018 - Proceedings of the 13th International Conference on Software Technologies
SP - 57
EP - 68
BT - ICSOFT 2018 - Proceedings of the 13th International Conference on Software Technologies
A2 - Maciaszek, Leszek
A2 - van Sinderen, Marten
PB - SciTePress
Y2 - 26 July 2018 through 28 July 2018
ER -