Confidence set for group membership



We develop new procedures to quantify the statistical uncertainty of data-driven clustering algorithms. In our panel setting, each unit belongs to one of a finite number of latent groups with group-specific regression curves. We propose methods for computing unit-wise and joint confidence sets for group membership. The unit-wise sets give the possible group memberships for a given unit, and the joint sets give the possible vectors of group memberships for all units. We also propose an algorithm that can improve the power of our procedures by detecting units that are easy to classify. The confidence sets invert a test for group membership that is based on a characterization of the true group memberships by a system of moment inequalities. To construct the joint confidence set, we solve a high-dimensional testing problem that tests group membership simultaneously for all units. We justify this procedure under N, T → ∞ asymptotics where we allow T to be much smaller than N. As part of our theoretical arguments, we develop new simultaneous anti-concentration inequalities for the MAX and the QLR statistics. Monte Carlo results indicate that our confidence sets have adequate coverage and are informative. We illustrate the practical relevance of our confidence sets in two applications.