Modeling bicycle crash costs using big data: A grid-cell-based Tobit model with random parameters

Kun Xie, Kaan Ozbay, Di Yang, Chuan Xu, Hong Yang

Research output: Contribution to journal › Article › peer-review


Bicyclists are among the most vulnerable road users in the urban transportation system. Investigating the factors that contribute to bicycle-related crashes and identifying hotspots are critical for allocating treatment resources efficiently. A grid-cell-based modeling framework was used to incorporate heterogeneous data sources and to explore the overall safety patterns of bicyclists in Manhattan, New York City. A random parameters (RP) Tobit model was developed in a Bayesian framework to relate transportation, land use, and sociodemographic data to bicycle crash costs. A new algorithm was proposed to estimate bicyclist-involved crash exposure using large-scale bicycle ridership data from 2014 to 2016 obtained from Citi Bike, the largest bicycle sharing program in the United States. The proposed RP Tobit model handles left-censored crash cost data and was found to outperform the Tobit model without random parameters by accounting for unobserved heterogeneity across neighborhoods. Results indicated that bicycle ridership, bicycle rack density, subway ridership, taxi trips, bus stop density, population, and ratio of population under 14 were positively associated with bicycle crash cost, whereas residential ratio and median age were negatively associated with it. The RP Tobit model was used to estimate the cell-specific potential for safety improvement (PSI) for hotspot ranking. Using the RP Tobit crash cost model to capture PSI has two advantages: injury severity is accounted for by converting it into unit costs, and the varying effects of certain explanatory variables are captured by the random parameters. The cell-based hotspot identification method can provide a complete, high-resolution risk map for bicyclists. Most locations with high PSIs either had unprotected bicycle lanes or were close to the access points to protected bicycle routes.
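To illustrate what left-censoring means for a crash cost model, the following is a minimal sketch of the log-likelihood of a standard (fixed-parameter) Tobit model censored from below at zero. It is not the authors' implementation: the function name `tobit_loglik`, the feature layout, and the toy numbers are all hypothetical, and the random parameters and Bayesian estimation described in the abstract are deliberately left out.

```python
import math


def normal_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(costs, features, beta, sigma, lower=0.0):
    """Log-likelihood of a left-censored Tobit model (illustrative only).

    costs    : observed crash cost per grid cell (hypothetical data)
    features : one list of explanatory variables per grid cell
    beta     : fixed coefficients, same length as each feature list
    sigma    : standard deviation of the latent error term (> 0)
    lower    : censoring threshold; cells at or below it count as censored
    """
    total = 0.0
    for y, x in zip(costs, features):
        mu = sum(b * xi for b, xi in zip(beta, x))  # latent mean for this cell
        if y <= lower:
            # Censored cell: only know that the latent cost is <= lower.
            total += math.log(normal_cdf((lower - mu) / sigma))
        else:
            # Uncensored cell: ordinary normal density of the observed cost.
            total += normal_logpdf((y - mu) / sigma) - math.log(sigma)
    return total


# Toy example: intercept plus one assumed exposure-like variable.
toy_costs = [0.0, 0.0, 2.5, 4.0]
toy_features = [[1.0, 0.2], [1.0, 0.5], [1.0, 1.5], [1.0, 2.0]]
print(tobit_loglik(toy_costs, toy_features, beta=[0.0, 2.0], sigma=1.0))
```

Cells with zero observed cost contribute the probability that the latent cost falls at or below the threshold, while cells with positive cost contribute a normal density. A random parameters version would additionally let some entries of `beta` vary across cells or neighborhoods. Note that for a very negative standardized value `normal_cdf` can underflow to zero, so a production implementation would use a numerically stable log-CDF.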

Original language: English (US)
Article number: 102953
Journal: Journal of Transport Geography
State: Published - Feb 2021


Keywords

  • Bicycle crashes
  • Big data
  • Hotspot identification
  • Random parameters Tobit model
  • Safety analysis

ASJC Scopus subject areas

  • Geography, Planning and Development
  • Transportation
  • General Environmental Science

