TY - GEN
T1 - On-Line Big-Data Processing for Visual Analytics with Argus-Panoptes
AU - Vlantis, Panayiotis I.
AU - Delis, Alex
N1 - Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
AB - Analyses with data mining and knowledge discovery techniques are not always successful, as they occasionally yield no actionable results. This is especially true in the Big-Data context, where we routinely deal with complex, heterogeneous, diverse and rapidly changing data. In this context, visual analytics plays a key role in helping both experts and users readily comprehend and better manage analyses carried out on data stored in Infrastructure as a Service (IaaS) cloud services. To this end, humans should play a critical role in continually ascertaining the value of the processed information and are invariably deemed to be the instigators of actionable tasks. The latter is facilitated by sophisticated tools that let humans interface with the data through vision and interaction. When working with Big-Data problems, both the scale and the nature of the data present a barrier to implementing responsive applications. In this paper, we propose a software architecture that seeks to empower Big-Data analysts with visual analytics tools atop large-scale data stored in and processed by IaaS. Our key goal is not only to provide on-line analytic processing but also to give users the facilities to interact effectively with the underlying IaaS machinery. Although we focus on hierarchical and spatiotemporal datasets here, our proposed architecture is general and can be applied to a wide range of application domains. The core design principles of our approach are: (a) on-line processing in the cloud with Apache Spark; (b) integration of interactive programming following the notebook paradigm through Apache Zeppelin; and (c) robust operation when data and/or schema change on the fly. Through experimentation with a prototype of our suggested architecture, we demonstrate not only the viability of our approach but also its value in a use case involving publicly available crime data from the United Kingdom.
KW - Apache Spark
KW - Big-Data processing
KW - IaaS Infrastructures
KW - Interactive programming
KW - Visual analytics
UR - http://www.scopus.com/inward/record.url?scp=85065794362&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065794362&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-19759-9_7
DO - 10.1007/978-3-030-19759-9_7
M3 - Conference contribution
AN - SCOPUS:85065794362
SN - 9783030197582
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 102
EP - 117
BT - Algorithmic Aspects of Cloud Computing - 4th International Symposium, ALGOCLOUD 2018, Revised Selected Papers
A2 - Disser, Yann
A2 - Verykios, Vassilios S.
PB - Springer Verlag
T2 - 4th International Symposium on Algorithmic Aspects of Cloud Computing, ALGOCLOUD 2018
Y2 - 20 August 2018 through 21 August 2018
ER -