Archaeologists often need to date and group artifact types to discern typologies, chronologies, and classifications. For over a century, statisticians have used classification and clustering techniques to infer patterns in data that can be defined by algorithms. In archaeology, linear regression algorithms are often used to date features and sites chronologically, and pattern recognition is used to develop typologies and classifications. However, archaeological data is often expensive to collect, and analyses are often limited by small sample sizes and incomplete datasets. Here we show that recent advances in computation allow archaeologists to use machine learning, which rests on much of the same statistical theory, to address more complex problems using increased computing power and larger, incomplete datasets. This paper approaches the problem of predicting the chronology of archaeological sites through a case study of medieval temples in Angkor, Cambodia. For this study, we have a large dataset of temples with known architectural elements and artifacts; however, fewer than ten percent of the temples in the sample have known dates, and much of the attribute data is incomplete. Our results suggest that the algorithms can predict dates for temples from 821–1150 CE with an average absolute error of 49–66 years. We find that this method surpasses traditional supervised and unsupervised statistical approaches for under-specified portions of the dataset and is a promising new method for anthropological inquiry.