TY - JOUR
T1 - Height Estimation from Single Aerial Images Using a Deep Ordinal Regression Network
AU - Li, Xiang
AU - Wang, Mingyang
AU - Fang, Yi
N1 - Funding Information:
This work was supported by the Ecological Quality Meteorological Monitoring and Evaluation, Mountain Flood Geological Disaster Prevention Meteorological Guarantee Project 2020, Zhejiang Province Climate Center.
Publisher Copyright:
© 2004-2012 IEEE.
PY - 2022
Y1 - 2022
AB - Understanding the 3-D geometric structure of the Earth's surface has been an active research topic in the photogrammetry and remote sensing community for decades, serving as an essential building block for applications such as 3-D digital city modeling, change detection, and city management. Previous studies have extensively investigated height estimation from aerial images based on stereo or multiview image matching. These methods require two or more images taken from different perspectives, together with camera information, to reconstruct 3-D coordinates. In this letter, we address the ambiguous and unsolved problem of height estimation from a single aerial image. Driven by the great success of deep learning, especially deep convolutional neural networks (CNNs), some studies have proposed estimating height from a single aerial image by training a deep CNN model on large-scale annotated data sets. These methods treat height estimation as a regression problem and directly use an encoder-decoder network to regress the height values. We instead propose to divide height values into spacing-increasing intervals and transform the regression problem into an ordinal regression problem, using an ordinal loss for network training. To enable multiscale feature extraction, we further incorporate an Atrous Spatial Pyramid Pooling (ASPP) module to extract features from multiple dilated convolution layers. A postprocessing technique is then designed to combine the predicted height maps of the individual patches into a seamless height map. Finally, we conduct extensive experiments on the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen and Potsdam data sets. Experimental results demonstrate significantly better performance of our method compared to state-of-the-art methods.
KW - Aerial image
KW - convolutional neural networks (CNNs)
KW - digital surface model (DSM)
KW - height estimation
KW - ordinal regression
UR - http://www.scopus.com/inward/record.url?scp=85094557009&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85094557009&partnerID=8YFLogxK
U2 - 10.1109/LGRS.2020.3019252
DO - 10.1109/LGRS.2020.3019252
M3 - Article
AN - SCOPUS:85094557009
SN - 1545-598X
VL - 19
JO - IEEE Geoscience and Remote Sensing Letters
JF - IEEE Geoscience and Remote Sensing Letters
ER -