A dataset of simplified syntax trees for C#

Sebastian Proksch, Sven Amann, Sarah Nadi, Mira Mezini

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we present a curated collection of 2833 C# solutions taken from GitHub. We encode the data in a new intermediate representation (IR) that facilitates further analysis by restricting the complexity of the syntax tree and by avoiding implicit information. The dataset is intended as a standardized input for research on recommendation systems for software engineering, but it is also useful in many other areas that analyze source code.

Original language: English (US)
Title of host publication: Proceedings - 13th Working Conference on Mining Software Repositories, MSR 2016
Publisher: Association for Computing Machinery, Inc
Pages: 476-479
Number of pages: 4
ISBN (Electronic): 9781450341868
State: Published - May 14 2016
Event: 13th Working Conference on Mining Software Repositories, MSR 2016 - Austin, United States
Duration: May 14 2016 to May 15 2016

ASJC Scopus subject areas

  • Software
  • Information Systems and Management
