Empirical identification in the mixed logit model: Analysing the effect of data richness

Elisabetta Cherchi, Juan de Dios Ortúzar

Research output: Contribution to journal › Article › peer-review

Abstract

The flexible structure of the mixed logit (ML) model is at the root of the difficulties associated with its estimation. Major problems are parameter identification and the distinction between different substitution patterns. In this paper we focus on the empirical identification problem and investigate the effect of low information richness in the data on the capability of estimating a correct ML model (i.e. with identifiable parameters and free of confounding effects). In particular, we analyse to what extent the empirical identification problem depends on the variability of the data among alternatives, on the degree of heterogeneity of the taste parameters, on the size of the sample and on the number of choice tasks for each individual. To test for information richness of the data and its effect on the capability of the ML model to reproduce random heterogeneity in tastes, a collection of datasets was generated varying systematically (a) the standard deviation (SD) of the distribution of travel time differences between the two alternatives, (b) the SD of the random parameter, (c) the number of choice tasks for each individual and (d) the number of individuals in relation to the number of choice tasks. Then, several ML models allowing for random travel time parameters were estimated using different numbers of draws, and results were compared in terms of model goodness of fit and of the capability of reproducing the true parameters used to generate each dataset. Our results suggest that identification problems depend only on the (low) variability of the associated data and disappear as the richness of the data associated with the random parameter increases. However, rich enough data only guarantees good statistics; the estimated parameters do not always reproduce the correct values, as the capability of the ML model to reproduce random heterogeneity depends on the random parameter distribution (degree of variability and symmetry). Moreover, the capability of the ML model to reproduce random heterogeneity increases when more than one choice is available for each individual, and the effect of sample size on empirical identification is considerably reduced.
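
The experimental design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the normal distributions assumed for both the travel time differences and the random coefficient, and all numerical values are hypothetical choices made for illustration only. The sketch generates a binary-choice panel in which each individual keeps one travel time coefficient across all choice tasks, then evaluates the simulated log-likelihood of a mixed logit with a single normally distributed coefficient.

```python
import math
import random


def simulate_panel(n_individuals, n_tasks, sd_time_diff, beta_mean, beta_sd, seed=0):
    """Generate binary choices with a random travel time coefficient.

    Assumption: the coefficient and the time differences are both normal.
    Returns one (time_diffs, choices) pair per individual.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_individuals):
        beta = rng.gauss(beta_mean, beta_sd)  # one taste draw per individual
        diffs, choices = [], []
        for _ in range(n_tasks):
            dt = rng.gauss(0.0, sd_time_diff)  # time of alt. 1 minus time of alt. 2
            p1 = 1.0 / (1.0 + math.exp(-beta * dt))  # binary logit probability of alt. 1
            diffs.append(dt)
            choices.append(1 if rng.random() < p1 else 0)
        panel.append((diffs, choices))
    return panel


def simulated_log_likelihood(panel, beta_mean, beta_sd, n_draws=200, seed=1):
    """Simulated log-likelihood of a panel mixed logit with one normal coefficient."""
    rng = random.Random(seed)
    # Assumption: the same standard normal draws are reused for every individual.
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    total = 0.0
    for diffs, choices in panel:
        avg = 0.0
        for z in draws:
            beta = beta_mean + beta_sd * z
            seq = 1.0  # probability of the individual's whole choice sequence
            for dt, y in zip(diffs, choices):
                p1 = 1.0 / (1.0 + math.exp(-beta * dt))
                seq *= p1 if y == 1 else 1.0 - p1
            avg += seq
        total += math.log(avg / n_draws)
    return total
```

Calling `simulate_panel` over a grid of `sd_time_diff`, `beta_sd`, `n_tasks` and `n_individuals` values mirrors factors (a) to (d) above, and changing `n_draws` mirrors the comparison across numbers of draws. Maximising `simulated_log_likelihood` over `beta_mean` and `beta_sd` with any numerical optimiser would then give the estimates to compare against the generating values.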

Original language: English (US)
Pages (from-to): 109-124
Number of pages: 16
Journal: Networks and Spatial Economics
Volume: 8
Issue number: 2-3
State: Published - Sep 2008

Keywords

  • Data richness
  • Empirical identification
  • Mixed Logit model
  • Repeated choice tasks

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
  • Artificial Intelligence
