Understanding Potential Outliers using the O3 Plot


Speaker


Abstract

Outliers may be important, in error, or irrelevant, but they are tricky to identify and deal with.  Whether a case is identified as an outlier depends on the other cases in the dataset, on the variables available, and on the criteria used.  A case can stand out as unusual on one or two variables, while appearing middling on the others.  If a case is identified as an outlier, it is useful to find out why. This talk discusses the O3 plot (Overview Of Outliers) for supporting outlier analyses.  O3 plots show which cases are often identified as outliers, which are identified in single dimensions, and which are only identified in higher dimensions.  They highlight which variables and combinations of variables may be affected by possible outliers.  Applications include a demographic dataset for the Bundestag constituencies in Germany and a university ranking dataset.
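
The idea that a case can be flagged in some variable combinations but not in others can be illustrated with a small sketch. This is not the method behind the O3 plot or any package's API. It is a minimal stand-in that assumes a crude rule: a case is flagged when its Mahalanobis distance from the sample mean, scaled by the number of variables, exceeds an arbitrary cutoff of 3. The function names (`solve`, `squared_distances`, `outlier_overview`), the cutoff, and the toy data are all assumptions made up for illustration.

```python
from itertools import combinations
from statistics import mean


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def squared_distances(rows):
    """Squared Mahalanobis distance of each row from the sample mean.

    Assumes the sample covariance matrix is non-singular.
    """
    n, k = len(rows), len(rows[0])
    centre = [mean(col) for col in zip(*rows)]
    dev = [[v - c for v, c in zip(row, centre)] for row in rows]
    cov = [[sum(d[i] * d[j] for d in dev) / (n - 1) for j in range(k)]
           for i in range(k)]
    return [sum(x * y for x, y in zip(d, solve(cov, d))) for d in dev]


def outlier_overview(data, names, cutoff=3.0):
    """Map every variable combination to the indices of the cases it flags."""
    overview = {}
    for k in range(1, len(names) + 1):
        for combo in combinations(range(len(names)), k):
            rows = [[case[j] for j in combo] for case in data]
            d2 = squared_distances(rows)
            # Hypothetical rule: scaled distance above the cutoff means "flagged".
            flagged = [i for i, v in enumerate(d2) if (v / k) ** 0.5 > cutoff]
            overview[tuple(names[j] for j in combo)] = flagged
    return overview


# Made-up toy data: x and y are strongly correlated, z is a small cyclic pattern.
base = [(i, i + 0.5 * (-1) ** i, i % 3 - 1) for i in range(-10, 11)]
data = base + [
    (0, 0, 12),   # case 21: extreme on z only
    (8, -8, 0),   # case 22: middling on x and on y, unusual as a pair
]

overview = outlier_overview(data, ["x", "y", "z"])
for combo, cases in overview.items():
    print("+".join(combo), cases)
```

In this toy example, case 21 is flagged on z alone, whereas case 22 is not flagged on x or on y separately and is flagged only when the two are considered together. A display in the spirit of the O3 plot would lay such a table out as variable combinations against cases, ideally comparing several identification methods rather than the single crude rule assumed here.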