Paradoxes, Fallacies and Other Surprises
An attempt at a taxonomy of statistical paradoxes, fallacies and other surprises.
- Florida capital sentencing of convicted murderers: Suppression effect of a confounder: No marginal relationship between race of accused and rate of capital sentencing but the relationship becomes very strong when controlling for a confounding factor: the race of victim.
- Berkeley graduate admissions: Overall lower rate of acceptance for female candidates but little or no gender effect within departments. Women tend to apply to departments that are harder to get into (have low admission rates) for both men and women. By controlling for department the appearance of gender discrimination disappears. But is this the right analysis? That depends on the mechanism through which discrimination occurs. If university budgeting decisions systematically favour departments that teach topics that are more appealing to men, then department is a mediator and controlling for department masks the effect of discrimination. We need different models to identify 'micro-discrimination' at the departmental level and 'macro-discrimination' at the university level. This is analogous to asking whether department should be treated as a confounder (and included in the model) or as a mediator (and excluded) in order to identify a causal effect of gender. Conditioning is not automatically the right thing to do.
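The admissions pattern described above can be reproduced with a small made-up table. The sketch below uses hypothetical counts (not the actual Berkeley data) for two hypothetical departments, "A" (easy) and "B" (hard). Admission rates are identical for men and women within each department, yet the pooled rates differ because women apply mostly to the harder department.

```python
from fractions import Fraction

# Hypothetical counts, NOT the real Berkeley figures: (applicants, admitted)
applications = {
    "A": {"men": (80, 48), "women": (20, 12)},  # easy department
    "B": {"men": (20, 4), "women": (80, 16)},   # hard department
}


def rate(applied, admitted):
    """Admission rate as an exact fraction."""
    return Fraction(admitted, applied)


def within_department_rates(data):
    """Admission rate for each gender inside each department."""
    return {
        dept: {gender: rate(*counts) for gender, counts in groups.items()}
        for dept, groups in data.items()
    }


def overall_rates(data):
    """Admission rate for each gender after pooling all departments."""
    totals = {}
    for groups in data.values():
        for gender, (applied, admitted) in groups.items():
            prev_applied, prev_admitted = totals.get(gender, (0, 0))
            totals[gender] = (prev_applied + applied, prev_admitted + admitted)
    return {gender: rate(*counts) for gender, counts in totals.items()}


within = within_department_rates(applications)
pooled = overall_rates(applications)
print(within)  # equal rates for men and women in each department
print(pooled)  # men 13/25, women 7/25 after pooling
```

Whether the within-department or the pooled comparison answers the question of interest is exactly the confounder-versus-mediator issue raised above; the arithmetic alone does not settle it.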
Example of conditioning or selection on a collider variable.
Discrepancy between global and local relationships. Some classical examples:
- Globally the distribution of heights can remain the same from generation to generation although tall parents get the impression that their children are shorter than themselves and short parents get the impression that their children are taller than themselves on average. Also, tall children get the impression that their parents are shorter than themselves and short children get the impression that their parents are taller than themselves on average. Thus parents get the impression, perfectly legitimately from their point of view, that the distribution of heights is being compressed towards the mean and children get the impression, also perfectly legitimately, that their parents' heights were more compressed towards the mean.
- Kahneman's pilot instructors got the impression that criticism improved performance of student pilots while praise made it worse. Kahneman thought the causal effect should be in the opposite direction. Regression to the mean allows one to see how Kahneman's belief and the pilot instructors' impression are not inconsistent. The resolution of the paradox lies partly in realizing that the instructors are noticing an 'observational' relationship that is in the opposite direction to the possible causal relationship. Thus there's a connection with Simpson's Paradox.
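The symmetry in the height example can be made concrete with a minimal sketch. It assumes (this is not stated in the notes) that parent and child heights are bivariate normal with the same mean and standard deviation and a correlation strictly between 0 and 1. Under that assumption the expected height of a relative is the mean plus the correlation times the observed deviation, in both directions. The function name and the numbers (mean 170 cm, correlation 0.5) are hypothetical.

```python
def expected_relative_height(observed, mean, correlation):
    """Expected height of a relative given one observed height.

    Assumes a bivariate normal model with equal means and equal standard
    deviations in both generations, so the same formula applies whether we
    predict a child from a parent or a parent from a child.
    """
    return mean + correlation * (observed - mean)


MEAN_CM = 170.0      # hypothetical population mean in both generations
CORRELATION = 0.5    # hypothetical parent-child correlation

# A 190 cm parent expects a shorter child ...
child_of_tall_parent = expected_relative_height(190.0, MEAN_CM, CORRELATION)
# ... and a 190 cm child expects a shorter parent.
parent_of_tall_child = expected_relative_height(190.0, MEAN_CM, CORRELATION)
# A 150 cm parent expects a taller child.
child_of_short_parent = expected_relative_height(150.0, MEAN_CM, CORRELATION)

print(child_of_tall_parent, parent_of_tall_child, child_of_short_parent)
```

Both generations keep the same marginal distribution in this model; only the conditional expectations are pulled towards the mean.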
When comparing two groups using a pretest and a posttest, should we use gain scores or a regression using the pretest as a covariate?
Base rate paradoxes
Prosecutor's Fallacy: The p-value to test the hypothesis that Sally Clark was innocent of the murder of her two children could legitimately be interpreted as being in the vicinity of 1/100,000. However there's also a legitimate argument that the probability of her innocence is very close to 1. These two seemingly contradictory results are not inconsistent with each other.
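A short Bayes' rule calculation shows how a tiny p-value and a high probability of innocence can coexist. The sketch below takes the 1/100,000 figure from the text as the probability of the evidence given innocence. The prior probability of guilt (1 in 10,000,000) and the probability of the evidence given guilt (1) are purely illustrative assumptions, not figures from the case.

```python
from fractions import Fraction

# From the text: chance of the observed deaths if the accused is innocent.
P_EVIDENCE_GIVEN_INNOCENT = Fraction(1, 100_000)
# Illustrative assumptions, NOT figures from the actual case:
P_EVIDENCE_GIVEN_GUILTY = Fraction(1)
PRIOR_GUILT = Fraction(1, 10_000_000)


def posterior_innocence(prior_guilt, p_e_innocent, p_e_guilty):
    """P(innocent | evidence) by Bayes' rule."""
    innocent_term = p_e_innocent * (1 - prior_guilt)
    guilty_term = p_e_guilty * prior_guilt
    return innocent_term / (innocent_term + guilty_term)


posterior = posterior_innocence(
    PRIOR_GUILT, P_EVIDENCE_GIVEN_INNOCENT, P_EVIDENCE_GIVEN_GUILTY
)
print(float(posterior))  # about 0.99 under these assumed inputs
```

The p-value describes how rare the evidence is under innocence; the posterior also depends on how rare guilt is to begin with, which is the base rate the fallacy ignores.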
Representativeness heuristic: This is a concept formulated by Tversky and Kahneman. One could view the heuristic as amounting to forming judgements based on relative likelihood, thus ignoring the base rate or 'prior', the foundational fallacy of frequentism.
Also known as the Monty Hall problem or the Principle of Restricted Choice in bridge. This is a very revealing paradox that illustrates the importance of taking into account the probabilistic mechanism that generated information, in addition to the information itself, if the information does not induce a partition of the space of possibilities.
It illustrates the crucial role of statistical modelling, but perhaps not in a way that is supportive of frequentist inference.
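The dependence on the mechanism can be checked by exact enumeration. The sketch below compares two hypothetical hosts: one who knows where the car is and always opens a goat door, and one who opens one of the two other doors at random and merely happens to reveal a goat. The observed information (a goat behind the opened door) is the same in both cases, but the probability that switching wins is not. The function name and the convention that the contestant picks door 0 are illustrative choices.

```python
from fractions import Fraction

DOORS = (0, 1, 2)


def switch_win_probability(host_knows):
    """P(switching wins | the opened door shows a goat); contestant picks door 0.

    host_knows=True:  the host always opens a goat door among the other two,
                      choosing uniformly when both hide goats.
    host_knows=False: the host opens one of the other two doors uniformly at
                      random; cases where the car is revealed are excluded
                      by the conditioning.
    """
    pick = 0
    goat_shown = Fraction(0)   # total probability of seeing a goat
    switch_wins = Fraction(0)  # probability of goat shown AND switching wins
    for car in DOORS:
        p_car = Fraction(1, 3)
        others = [d for d in DOORS if d != pick]
        allowed = [d for d in others if d != car] if host_knows else others
        for opened in allowed:
            p = p_car * Fraction(1, len(allowed))
            if opened == car:
                continue  # car revealed: not the situation we condition on
            goat_shown += p
            remaining = next(d for d in DOORS if d not in (pick, opened))
            if remaining == car:
                switch_wins += p
    return switch_wins / goat_shown


print(switch_win_probability(host_knows=True))   # 2/3
print(switch_win_probability(host_knows=False))  # 1/2
```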
Students at a university report an average class size of 100. Professors report an average class size of 50. Are students likely to be exaggerating and professors underestimating the size of their classes?
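Both reports can be accurate. A minimal sketch with hypothetical class sizes (four classes of 25 and one of 150) shows how: professors average over classes, while students average over enrolments, so large classes are counted once per student in them. It assumes one professor per class and that every enrolled student reports the size of that class.

```python
# Hypothetical class sizes chosen to reproduce the 50 versus 100 figures.
class_sizes = [25, 25, 25, 25, 150]


def professor_average(sizes):
    """Average over classes: each class counts once."""
    return sum(sizes) / len(sizes)


def student_average(sizes):
    """Average over enrolments: a class of size n is reported by n students."""
    return sum(n * n for n in sizes) / sum(sizes)


print(professor_average(class_sizes))  # 50.0
print(student_average(class_sizes))    # 100.0
```

The student average is a size-weighted mean, so it can never be smaller than the per-class mean and exceeds it whenever class sizes vary.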
Paradoxes of measures of central tendency
A random sample of 100 taxpayers reveals an average income of $30,000 although the government knows that the average income is $60,000. Is the sample likely to be biased and/or respondents understating their income? Or is there another plausible explanation?
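One plausible explanation, offered here as an assumption about what the example intends, is skewness: if a small fraction of taxpayers have very large incomes, most random samples of 100 contain none of them and show a mean well below the population mean. The sketch below uses an invented two-point population (99.9% earn $30,000, 0.1% earn $30,030,000) and treats the 100 draws as independent.

```python
from fractions import Fraction

# Invented population, for illustration only.
LOW_INCOME = 30_000
HIGH_INCOME = 30_030_000
HIGH_SHARE = Fraction(1, 1000)  # fraction of taxpayers with the high income

# Population mean of the two-point distribution.
population_mean = (1 - HIGH_SHARE) * LOW_INCOME + HIGH_SHARE * HIGH_INCOME


def prob_sample_misses_high_earners(sample_size):
    """Probability that an independent random sample has no high earner."""
    return (1 - HIGH_SHARE) ** sample_size


p_miss = float(prob_sample_misses_high_earners(100))

print(population_mean)  # 60000
print(p_miss)           # roughly 0.90
```

In this invented population about nine samples in ten have a sample mean of exactly $30,000, with no bias and no misreporting; the sample mean is unbiased but its distribution is highly skewed.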
It is possible to build a model in which the parameter space and the support are both equal to the natural numbers and in which a confidence procedure that has at least 2/3 probability of coverage for all θ (i.e. 2/3 confidence) has only 1/3 posterior probability for all y under a uniform prior for θ. Thus confidence and credibility can be strongly inconsistent, in contrast with the intuition based on compact models in which mean credibility must equal mean confidence.