TY - GEN
T1 - Lessons learned with laser scanning point cloud management in Hadoop HBase
AU - Vo, Anh Vu
AU - Konda, Nikita
AU - Chauhan, Neel
AU - Aljumaily, Harith
AU - Laefer, Debra F.
N1 - Funding Information:
The Hadoop cluster used for the work presented in this paper was provided by allocation TG-CIE170036 - Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562 [30]. The authors would like to thank the staff at the Pittsburgh Supercomputing Center for the truly outstanding technical support provided while setting up the testing. This research also made use of data collected with funding from the European Research Council grant ERC-2012-StG 20111012 “RETURN - Rethinking Tunnelling in Urban Neighbourhoods” Project 307836. The dataset is available from the NYU Spatial Data Repository at https://doi.org/10.17609/N8MQ0N.
Publisher Copyright:
© Springer International Publishing AG, part of Springer Nature 2018.
PY - 2018
Y1 - 2018
N2 - While big data technologies are growing rapidly and benefit a wide range of science and engineering domains, many barriers prevent the remote sensing community from fully exploiting them. To help overcome these barriers, this paper presents the in-depth experience gained when adopting a distributed computing framework, Hadoop HBase, for storage, indexing, and integration of large-scale, high-resolution laser scanning point cloud data. Four data models were conceptualized, implemented, and rigorously investigated to explore the advantageous features of distributed, key-value database systems. In addition, comparing the four models made it possible to reassess several well-known point cloud management techniques, established in traditional computing environments, in the new context of a distributed, key-value database. The four models were derived from two row-key designs and two column structures, thereby demonstrating various considerations in developing a data solution for a high-resolution, city-scale aerial laser scan of a portion of Dublin, Ireland. This paper presents lessons learned from the data model design and its implementation for spatial data management in a distributed computing framework. The study is a step towards full exploitation of powerful emerging computing assets for dense spatio-temporal data.
AB - While big data technologies are growing rapidly and benefit a wide range of science and engineering domains, many barriers prevent the remote sensing community from fully exploiting them. To help overcome these barriers, this paper presents the in-depth experience gained when adopting a distributed computing framework, Hadoop HBase, for storage, indexing, and integration of large-scale, high-resolution laser scanning point cloud data. Four data models were conceptualized, implemented, and rigorously investigated to explore the advantageous features of distributed, key-value database systems. In addition, comparing the four models made it possible to reassess several well-known point cloud management techniques, established in traditional computing environments, in the new context of a distributed, key-value database. The four models were derived from two row-key designs and two column structures, thereby demonstrating various considerations in developing a data solution for a high-resolution, city-scale aerial laser scan of a portion of Dublin, Ireland. This paper presents lessons learned from the data model design and its implementation for spatial data management in a distributed computing framework. The study is a step towards full exploitation of powerful emerging computing assets for dense spatio-temporal data.
KW - Big data
KW - Distributed database
KW - HBase
KW - Hadoop
KW - LiDAR
KW - Point cloud
KW - Spatial data management
UR - http://www.scopus.com/inward/record.url?scp=85049075524&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85049075524&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-91635-4_13
DO - 10.1007/978-3-319-91635-4_13
M3 - Conference contribution
AN - SCOPUS:85049075524
SN - 9783319916347
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 231
EP - 253
BT - Advanced Computing Strategies for Engineering - 25th EG-ICE International Workshop 2018, Proceedings
A2 - Domer, Bernd
A2 - Smith, Ian F.
PB - Springer Verlag
T2 - 25th Workshop of the European Group for Intelligent Computing in Engineering, EG-ICE 2018
Y2 - 10 June 2018 through 13 June 2018
ER -