Properties of Inverse Correspondence Analysis



Correspondence analysis (CA) is a dimension reduction technique for categorical data in a two-way contingency table. Its aim is to depict the relationship between the categories of both variables optimally in a low-dimensional representation. We study the inverse correspondence analysis (ICA) problem, which uses the low-dimensional CA solution to retrieve the original data matrix. Researchers are not allowed to share the original data set if it contains sensitive information. If the original data set can be retrieved using ICA, researchers should therefore be careful about disclosing CA results. In previous empirical research, the retrieved ICA solution always corresponded to the original data matrix. This is an unexpected result, because CA is a dimension reduction technique and would be expected to discard information. Our contribution is twofold. First, we show theoretically that ICA solutions are not always unique: we introduce matrices with a specific singular value structure for which the ICA problem has more than one solution. Second, we give an integer programming formulation to solve the ICA problem. We observe that not all constraints are needed to solve the model and propose a method that temporarily removes constraints. Based on computational results, we show that this method can solve larger instances than previous work while still retrieving the original data.
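As background for the abstract, the sketch below illustrates the textbook SVD formulation of CA and the reconstruction formula that motivates the ICA problem. It is not the speaker's method: the function names `correspondence_analysis` and `reconstruct`, the example table, and the assumption that the grand total and the row and column masses are known alongside the retained singular triplets are all illustrative choices. The integer programming step described above is not shown.

```python
import numpy as np

def correspondence_analysis(N, k):
    """Rank-k CA of a contingency table via SVD of the standardized residuals."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # independence model
    S = (P - expected) / np.sqrt(expected)
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    return r, c, U[:, :k], sv[:k], Vt[:k, :]

def reconstruct(n, r, c, U, sv, Vt):
    """Rebuild a (generally non-integer) table from a rank-k CA solution.

    Assumes the grand total n and the masses r, c are known (an assumption
    made for this illustration).
    """
    expected = np.outer(r, c)
    S_k = (U * sv) @ Vt                  # rank-k approximation of the residuals
    return n * (expected + np.sqrt(expected) * S_k)

# Hypothetical 4 x 3 contingency table, used only for illustration.
N = np.array([[20, 5, 3],
              [4, 18, 6],
              [2, 7, 25],
              [10, 10, 10]])
n = N.sum()

full = reconstruct(n, *correspondence_analysis(N, k=2))  # all dimensions kept
low = reconstruct(n, *correspondence_analysis(N, k=1))   # one dimension kept
```

With all min(rows, columns) - 1 dimensions retained, the reconstruction reproduces the table up to floating-point error. With fewer dimensions it keeps the margins but not the individual cells, which is why exact retrieval from a low-dimensional solution has to rely on additional structure such as integrality and non-negativity of the counts.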

Zoom link:

Meeting ID: 918 7611 6089