Inference for EDA
Seminar reading suggestion
Andreas Buja, Di Cook and others have a recent paper in the Philosophical Transactions of the Royal Society A, discussing the idea of statistical inference for exploratory data analysis and model diagnostics.
They propose to "furnish visual statistical methods with an inferential framework and protocol, modeled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests."
One paradigm is the "lineup," in which a plot of (some aspect of) the real data is placed at a random position in an array of similar plots, each drawn from data generated randomly under a null hypothesis. The test is whether an observer can pick out the real data from the simulated data: if the null hypothesis holds, the real plot is just one more null plot, so in an array of m plots the chance of picking it by luck alone is 1/m.
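To make the lineup idea concrete, here is a minimal sketch in Python. It is not the authors' code: the function names `make_lineup` and `lineup_p_value` are hypothetical, the null data are generated by permuting y against x (one possible null, "no association between x and y"), and the binomial calculation assumes that several observers view the lineup independently.

```python
import random
from math import comb


def make_lineup(x, y, m=20, seed=None):
    """Return (panels, true_pos): m datasets, one real and m - 1 null.

    Assumption: the null hypothesis is "no association between x and y",
    simulated here by randomly permuting y against x.
    """
    rng = random.Random(seed)
    true_pos = rng.randrange(m)      # hidden position of the real data
    panels = []
    for i in range(m):
        if i == true_pos:
            panels.append((list(x), list(y)))
        else:
            y_null = list(y)
            rng.shuffle(y_null)      # break any x-y relationship
            panels.append((list(x), y_null))
    return panels, true_pos


def lineup_p_value(n_correct, n_observers, m=20):
    """Chance that at least n_correct of n_observers pick the real plot
    if every observer is merely guessing among m plots (binomial tail)."""
    p = 1.0 / m
    return sum(
        comb(n_observers, k) * p**k * (1 - p) ** (n_observers - k)
        for k in range(n_correct, n_observers + 1)
    )


# Illustrative data only: a noisy linear trend.
data_rng = random.Random(0)
x = [i / 10 for i in range(50)]
y = [2 * xi + data_rng.gauss(0, 1) for xi in x]

panels, true_pos = make_lineup(x, y, m=20, seed=1)
print(len(panels), "panels; real data hidden at index", true_pos)
print("one observer, correct pick: p =", lineup_p_value(1, 1, m=20))
```

Each entry of `panels` would then be drawn as one small plot in the array. With a single observer and 20 plots, a correct pick corresponds to a p-value of 1/20, which is why 20 is a convenient lineup size.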
- Supplementary materials - A collection of further examples. In each case you are asked to view a collection of plots, and decide which one "stands out" as different.
- Inference for Data Visualization - Slides from a talk at the 1999 Joint Statistical Meetings
- JCGS (2004) article by Andrew Gelman, "Exploratory Data Analysis for Complex Models," in which he proposes some similar ideas, using a Bayesian posterior predictive approach.
- Is this Sense or NonSense?
- What attributes of visual displays need to be considered in relation to their argument?