Data collected using intelligent transportation systems (ITS) technologies typically do not include information about the type of associated activity. For instance, cell-phone data typically consist of the locations and time stamps of calls made and received, but the corresponding activity-based travel patterns, such as working or shopping, are not directly observable. Similarly, periodic variations in travel patterns and the distinct behavioral clusters of travelers are not explicitly provided in these data sets. Traditionally, these inexplicit (i.e., hidden) patterns are extracted by developing probabilistic models that attempt to explain the observed data. These models assume that the data were generated by an underlying latent process with parameters that can be inferred. In travel demand modeling, these underlying processes are the activities of travelers.
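The idea of a latent process generating the observed records can be illustrated with a minimal sketch. The activity names, prior probabilities, and time-of-day parameters below are hypothetical values chosen purely for illustration; they are not taken from any data set, and in practice such parameters would have to be estimated from the observations rather than assumed known. The sketch also ignores the circular nature of time of day for simplicity.

```python
import math
import random

# Hypothetical latent activities with made-up parameters:
# (prior probability, mean hour of day, standard deviation in hours).
ACTIVITIES = {
    "home":     (0.5, 21.0, 2.0),
    "work":     (0.3, 11.0, 2.0),
    "shopping": (0.2, 17.0, 1.5),
}

def normal_pdf(x, mean, std):
    """Density of a normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def sample_observation(rng):
    """Generative direction: draw a hidden activity, then an observed hour."""
    names = list(ACTIVITIES)
    weights = [ACTIVITIES[name][0] for name in names]
    activity = rng.choices(names, weights=weights, k=1)[0]
    _, mean, std = ACTIVITIES[activity]
    return activity, rng.gauss(mean, std)

def posterior(hour):
    """Inference direction: P(activity | observed hour) via Bayes' rule."""
    joint = {
        name: prior * normal_pdf(hour, mean, std)
        for name, (prior, mean, std) in ACTIVITIES.items()
    }
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}

rng = random.Random(0)
hidden, hour = sample_observation(rng)  # the data set would record only `hour`

probs = posterior(11.0)                 # a call observed at 11:00
most_likely = max(probs, key=probs.get)
```

Only the time stamp would appear in the recorded data; the activity label drawn in `sample_observation` is the hidden quantity that a probabilistic model tries to recover, here through the posterior probabilities returned by `posterior`.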
The existing literature has focused primarily on the use of parametric models to estimate travel activities. Such models require a priori bounds on model complexity and size (i.e., the number of latent activities). In particular, inference using the standard expectation-maximization (EM) algorithm involves pre-specifying the total number of unknown latent activities. Choosing too few yields overly simple models that underestimate the actual number of activities. On the other hand, choosing too many may result in overly complex models that overfit or are expensive to compute. Hence, parametric models are not necessarily the best choice for capturing unknown latent travel activities when using such ITS data sources as cell phone and GIS