TY - JOUR

T1 - Investigation of the Case-based Reasoning Retrieval Process to Estimate Resources in Construction Projects

AU - García de Soto, Borja

AU - Adey, Bryan T.

N1 - Publisher Copyright:
© 2015 The Authors. Published by Elsevier Ltd.
Copyright:
Copyright 2016 Elsevier B.V., All rights reserved.

PY - 2015

Y1 - 2015

AB - Case-based reasoning (CBR) is a methodology that is seeing increasing use to make predictions during the early phases of a project. It allows estimators to exploit existing knowledge to make predictions that are considerably better than those made without it. Not all CBR implementations are identical, however, and variations in how CBR is done can affect the accuracy of the predictions. One particular area of sensitivity is the retrieval phase, i.e., the way in which the CBR determines the closeness between the new and the existing cases. In this paper, CBR is used to estimate resources for construction projects, and the use of the nearest neighbor technique to determine similarity in the retrieval phase when predicting the construction material quantities (CMQs) in concrete structures is investigated. Two types of distances, i.e., 1) the City-block distance and 2) the Euclidean distance, and four types of weights, based on regression analysis and feature counting, which account for the relative importance of the different parameters, are investigated. The four types of weights used were 1) the adjusted unstandardized coefficients from the regression models, 2) the unadjusted unstandardized coefficients from the regression models, 3) the standardized coefficients from the regression models, and 4) equal weights (i.e., feature counting), in which the weights applied are 1/k, where k is the number of parameters being compared to determine the distance. The mean absolute percentage error (MAPE) was used to evaluate each combination investigated. It was found that, for a similarity threshold of 90%, the CBR methodology using the City-block distance, with the adjusted unstandardized coefficients from the regression analysis models using the transformed (LN) dataset as weights, gave the best results, with a MAPE of 8.16%. The worst results were obtained from the CBR methodology using the Euclidean distance with feature counting weights, with a MAPE of 28.40%.

KW - artificial intelligence

KW - case-based reasoning

KW - preliminary estimates

KW - resource estimates

KW - retrieval process

UR - http://www.scopus.com/inward/record.url?scp=84953282518&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84953282518&partnerID=8YFLogxK

U2 - 10.1016/j.proeng.2015.10.074

DO - 10.1016/j.proeng.2015.10.074

M3 - Conference article

AN - SCOPUS:84953282518

VL - 123

SP - 169

EP - 181

JO - Procedia Engineering

JF - Procedia Engineering

SN - 1877-7058

T2 - 4th Creative Construction Conference, CCC 2015

Y2 - 21 June 2015 through 24 June 2015

ER -