TY - GEN
T1 - An urban data profiler
AU - Ribeiro, Daniel Castellani
AU - Vo, Huy T.
AU - Freire, Juliana
AU - Silva, Cláudio T.
PY - 2015/5/18
Y1 - 2015/5/18
N2 - Large volumes of urban data are being made available through a variety of open portals. Besides promoting transparency, these data can bring benefits to government, science, citizens and industry. It is no longer a fantasy to ask "if you could know anything about a city, what do you want to know" and to ponder what could be done with that information. However, the great number and variety of datasets creates a new challenge: how to find relevant datasets. While existing portals provide search interfaces, these are often limited to keyword searches over the limited metadata associated with each dataset, for example, attribute names and textual description. In this paper, we present a new tool, UrbanProfiler, that automatically extracts detailed information from datasets. This information includes attribute types, value distributions, and geographical information, which can be used to support complex search queries as well as visualizations that help users explore and obtain insight into the contents of a data collection. Besides describing the tool and its implementation, we present case studies that illustrate how the tool was used to explore a large open urban data repository.
AB - Large volumes of urban data are being made available through a variety of open portals. Besides promoting transparency, these data can bring benefits to government, science, citizens and industry. It is no longer a fantasy to ask "if you could know anything about a city, what do you want to know" and to ponder what could be done with that information. However, the great number and variety of datasets creates a new challenge: how to find relevant datasets. While existing portals provide search interfaces, these are often limited to keyword searches over the limited metadata associated with each dataset, for example, attribute names and textual description. In this paper, we present a new tool, UrbanProfiler, that automatically extracts detailed information from datasets. This information includes attribute types, value distributions, and geographical information, which can be used to support complex search queries as well as visualizations that help users explore and obtain insight into the contents of a data collection. Besides describing the tool and its implementation, we present case studies that illustrate how the tool was used to explore a large open urban data repository.
KW - Automatic Type Detection
KW - Dataset Analysis
KW - Metadata Extraction
UR - http://www.scopus.com/inward/record.url?scp=84968546467&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84968546467&partnerID=8YFLogxK
U2 - 10.1145/2740908.2742135
DO - 10.1145/2740908.2742135
M3 - Conference contribution
AN - SCOPUS:84968546467
T3 - WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web
SP - 1389
EP - 1394
BT - WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web
PB - Association for Computing Machinery, Inc
T2 - 24th International Conference on World Wide Web, WWW 2015
Y2 - 18 May 2015 through 22 May 2015
ER -