A data-driven predictive model of city-scale energy use in buildings

Constantine E. Kontokosta, Christopher Tull

Research output: Contribution to journal › Article

Abstract

Many cities across the United States have turned to building energy disclosure (or benchmarking) laws to encourage transparency in energy efficiency markets and to support sustainability and carbon reduction plans. In addition to direct peer-to-peer comparisons, the benchmarking data published under these laws have been used as a tool by researchers and policy-makers to study the distribution and determinants of energy use in large buildings. However, these policies cover only a small subset of the building stock in a given city, and thus capture only a fraction of energy use at the urban scale. To overcome this limitation, we develop a predictive model of energy use at the building, district, and city scales using training data from energy disclosure policies and predictors from widely available property and zoning information. We use statistical models to predict the energy use of 1.1 million buildings in New York City using the physical, spatial, and energy use attributes of a subset derived from 23,000 buildings required to report energy use data each year. Linear regression (OLS), random forest, and support vector regression (SVM) algorithms are fit to the city's energy benchmarking data and then used to predict electricity and natural gas use for every property in the city. Model accuracy is assessed and validated at the building level and zip code level using actual consumption data from calendar year 2014. We find that the OLS model performs best when generalizing to the city as a whole, while SVM yields the lowest mean absolute error for predicting energy use within the Local Law 84 (LL84) sample. Our median predicted electric energy use intensity is 71.2 kBtu/sf for office buildings and 31.2 kBtu/sf for residential buildings, with a mean absolute log accuracy ratio of 0.17. Building age is found to be a significant predictor of energy use, with newer buildings (particularly those built since 1991) found to have higher consumption levels than those constructed before 1930. We also find higher electric consumption in office and retail buildings, although the sign is reversed for natural gas. In general, larger buildings use less energy per square foot, while taller buildings with more stories, controlling for floor area, use more energy per square foot. Attached buildings (those with adjacent buildings and a shared party wall) are found to have lower natural gas use intensity. The results demonstrate that electricity consumption can be reliably predicted using actual data from a relatively small subset of buildings, while natural gas use presents a more complicated problem given the bimodal distribution of consumption and infrastructure availability.
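As a rough illustration of the quantities the abstract reports, the sketch below computes an energy use intensity (annual kBtu per square foot) and a mean absolute log accuracy ratio, then uses that ratio to compare two sets of predictions. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the natural logarithm is assumed because the abstract does not state the base, and all numbers are made up for illustration.

```python
import math


def eui_kbtu_per_sf(annual_kbtu, floor_area_sf):
    """Energy use intensity: annual energy (kBtu) divided by floor area (sq ft)."""
    if floor_area_sf <= 0:
        raise ValueError("floor area must be positive")
    return annual_kbtu / floor_area_sf


def mean_abs_log_accuracy_ratio(predicted, actual):
    """Mean of |ln(predicted / actual)| over paired, strictly positive values.

    Assumption: natural log. Over- and under-prediction by the same
    factor are penalized equally, and a perfect prediction scores 0.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(p <= 0 or a <= 0 for p, a in zip(predicted, actual)):
        raise ValueError("all values must be strictly positive")
    total = sum(abs(math.log(p / a)) for p, a in zip(predicted, actual))
    return total / len(predicted)


# Made-up EUI values (kBtu/sf) for three buildings; not data from the paper.
actual = [50.0, 80.0, 30.0]
model_a = [55.0, 72.0, 33.0]   # hypothetical model with small errors
model_b = [100.0, 40.0, 30.0]  # hypothetical model off by a factor of 2 twice

score_a = mean_abs_log_accuracy_ratio(model_a, actual)
score_b = mean_abs_log_accuracy_ratio(model_b, actual)

# The lower score indicates predictions closer to actual consumption.
best_model = "model_a" if score_a < score_b else "model_b"
```

Because the metric works on ratios, it treats a small building and a large building on the same relative footing, which is one reason a ratio-based error can be convenient when consumption spans several orders of magnitude.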

Original language: English (US)
Pages (from-to): 303-317
Number of pages: 15
Journal: Applied Energy
Volume: 197
DOI: https://doi.org/10.1016/j.apenergy.2017.04.005
State: Published - Jul 1 2017

Fingerprint

energy use
Natural gas
Benchmarking
Electricity
energy
Zoning
Tall buildings
Office buildings
Energy policy
Linear regression
Transparency
Energy efficiency
Sustainable development
Availability
Carbon

Keywords

  • Building energy
  • Energy efficiency
  • Energy prediction
  • Machine learning
  • Urban dynamics

ASJC Scopus subject areas

  • Civil and Structural Engineering
  • Energy(all)

Cite this

A data-driven predictive model of city-scale energy use in buildings. / Kontokosta, Constantine E.; Tull, Christopher.

In: Applied Energy, Vol. 197, 01.07.2017, p. 303-317.


@article{f341501513a649ce813ca69196553e8b,
title = "A data-driven predictive model of city-scale energy use in buildings",
keywords = "Building energy, Energy efficiency, Energy prediction, Machine learning, Urban dynamics",
author = "Kontokosta, {Constantine E.} and Christopher Tull",
year = "2017",
month = "7",
doi = "10.1016/j.apenergy.2017.04.005",
volume = "197",
pages = "303--317",
journal = "Applied Energy",
issn = "0306-2619",
publisher = "Elsevier BV",
}

TY - JOUR

T1 - A data-driven predictive model of city-scale energy use in buildings

AU - Kontokosta,Constantine E.

AU - Tull,Christopher

PY - 2017/7/1

Y1 - 2017/7/1

KW - Building energy

KW - Energy efficiency

KW - Energy prediction

KW - Machine learning

KW - Urban dynamics

UR - http://www.scopus.com/inward/record.url?scp=85018463769&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85018463769&partnerID=8YFLogxK

U2 - 10.1016/j.apenergy.2017.04.005

DO - 10.1016/j.apenergy.2017.04.005

M3 - Article

VL - 197

SP - 303

EP - 317

JO - Applied Energy

T2 - Applied Energy

JF - Applied Energy

SN - 0306-2619

ER -