TY - GEN
T1 - Predicting taxi demand at high spatial resolution
T2 - 4th IEEE International Conference on Big Data, Big Data 2016
AU - Zhao, Kai
AU - Khryashchev, Denis
AU - Freire, Juliana
AU - Silva, Claudio
AU - Vo, Huy
PY - 2016
Y1 - 2016
N2 - In big cities, taxi service is imbalanced. In some areas, passengers wait too long for a taxi, while in others, many taxis roam without passengers. Knowledge of where a taxi will become available can help us solve the taxi demand imbalance problem. In this paper, we employ a holistic approach to predict taxi demand at high spatial resolution. We showcase our techniques using two real-world data sets, yellow cabs and Uber trips in New York City, and perform an evaluation over 9,940 building blocks in Manhattan. Our approach consists of two key steps. First, we use entropy and the temporal correlation of human mobility to measure the demand uncertainty at the building block level. Second, to identify which predictive algorithm can approach the theoretical maximum predictability, we implement and compare three predictors: the Markov predictor (a probability-based predictive algorithm), the Lempel-Ziv-Welch predictor (a sequence-based predictive algorithm), and the Neural Network predictor (a predictive algorithm that uses machine learning). The results show that predictability varies by building block and, on average, the theoretical maximum predictability can be as high as 83%. The performance of the predictors also varies: the Neural Network predictor provides better accuracy for blocks with low predictability, and the Markov predictor provides better accuracy for blocks with high predictability. In blocks with high maximum predictability, the Markov predictor is able to predict the taxi demand with 89% accuracy, 11% better than the Neural Network predictor, while requiring only 0.03% of the computation time. These findings indicate that the maximum predictability can be a good metric for selecting prediction algorithms.
AB - In big cities, taxi service is imbalanced. In some areas, passengers wait too long for a taxi, while in others, many taxis roam without passengers. Knowledge of where a taxi will become available can help us solve the taxi demand imbalance problem. In this paper, we employ a holistic approach to predict taxi demand at high spatial resolution. We showcase our techniques using two real-world data sets, yellow cabs and Uber trips in New York City, and perform an evaluation over 9,940 building blocks in Manhattan. Our approach consists of two key steps. First, we use entropy and the temporal correlation of human mobility to measure the demand uncertainty at the building block level. Second, to identify which predictive algorithm can approach the theoretical maximum predictability, we implement and compare three predictors: the Markov predictor (a probability-based predictive algorithm), the Lempel-Ziv-Welch predictor (a sequence-based predictive algorithm), and the Neural Network predictor (a predictive algorithm that uses machine learning). The results show that predictability varies by building block and, on average, the theoretical maximum predictability can be as high as 83%. The performance of the predictors also varies: the Neural Network predictor provides better accuracy for blocks with low predictability, and the Markov predictor provides better accuracy for blocks with high predictability. In blocks with high maximum predictability, the Markov predictor is able to predict the taxi demand with 89% accuracy, 11% better than the Neural Network predictor, while requiring only 0.03% of the computation time. These findings indicate that the maximum predictability can be a good metric for selecting prediction algorithms.
KW - human mobility
KW - limit of predictability
KW - predictive algorithm
KW - spatiotemporal data
KW - taxi demand prediction
UR - http://www.scopus.com/inward/record.url?scp=85015259491&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85015259491&partnerID=8YFLogxK
U2 - 10.1109/BigData.2016.7840676
DO - 10.1109/BigData.2016.7840676
M3 - Conference contribution
AN - SCOPUS:85015259491
T3 - Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
SP - 833
EP - 842
BT - Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
A2 - Ak, Ronay
A2 - Karypis, George
A2 - Xia, Yinglong
A2 - Hu, Xiaohua Tony
A2 - Yu, Philip S.
A2 - Joshi, James
A2 - Ungar, Lyle
A2 - Liu, Ling
A2 - Sato, Aki-Hiro
A2 - Suzumura, Toyotaro
A2 - Rachuri, Sudarsan
A2 - Govindaraju, Rama
A2 - Xu, Weijia
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 5 December 2016 through 8 December 2016
ER -