Segmentation and Dimension Reduction: Exploratory and Model-Based Approaches

Defended on Thursday, 9 April 2009

Representing the information in a data set in a concise way is an important part of data analysis. A variety of multivariate statistical techniques have been developed for this purpose, such as k-means clustering and principal components analysis. These techniques are often based on the principles of segmentation (partitioning the observations into distinct groups) and dimension reduction (constructing a low-dimensional representation of a data set). However, such techniques typically make no statistical assumptions about the process that generates the data; as a result, the statistical significance of the results is often unknown.

In this thesis, we incorporate the modeling principles of segmentation and dimension reduction into statistical models. We thus develop new models that can summarize and explain the information in a data set in a simple way. The focus is on dimension reduction using bilinear parameter structures and techniques for clustering both modes of a two-mode data matrix. To illustrate the usefulness of the techniques, the thesis includes a variety of empirical applications in marketing, psychometrics, and political science. An important application is modeling the response behavior in surveys with rating scales, which provides novel insight into what kinds of response styles exist, and how substantive opinions vary among respondents. We find that our modeling approaches yield new techniques for data analysis that can be useful in a variety of applied fields.

Keywords

bilinear decomposition, conjoint analysis, dimension reduction, finite mixture modeling, graphical representation, optimal scaling, response styles, segmentation, two-mode partitioning

