class: center, middle, inverse, title-slide # Simpson’s Paradox ### STAT 021 with Prof Suzy ### Swarthmore College --- <style type="text/css"> pre { background: #FFBB33; max-width: 100%; overflow-x: scroll; } .scroll-output { height: 70%; overflow-y: scroll; } .scroll-small { height: 30%; overflow-y: scroll; } .red{color: #ce151e;} .green{color: #26b421;} .blue{color: #426EF0;} </style> ## Simpson's Paradox When averages are taken across different groups, they can appear to contradict the overall averages. This phenomena is called Simpson's Paradox. We have to be careful when averaging across different levels of a second (categorical) variable. If we unthinkingly combine group data, we risk reaching the wrong conclusion. Simpson's paradox illustrates the importance of **domain knowledge** in the context of statistical analysis. These trend reversals can occur if there is another important variable that has been excluded from the analysis. Such variables are refered to as **latent variables**. As a general rule, if an overall trend is surprising or unexpected, it is worth considering how you might stratify the data according to another important (categorical) variable. The goal is to create groups with similar values. This is OK! This is not p-hacking! It is part of the data exploration process. --- ## Simpson's Paradox ### Illustrative example <img src="Figs/simpsons1.png" width="700" height="400" style="display: block; margin: auto;" /> .footnote[Source: https://qr.ae/pN4Ysk] --- ## Simpson's Paradox ### Illustrative example <img src="Figs/simpsons2.png" width="879" height="400" style="display: block; margin: auto;" /> .footnote[Source: https://qr.ae/pN4Ysk] --- ## Real World Example ### Death penalty in FL | | | Defendant's race | |---------|------|-------| | | White | Black | | Death | 53 | 15 | | No death | 430 | 176 | --- ## Real World Example ### Death penalty in FL **Victim's race: white** | | | Defendant's race | |---------|------|-------| | | White | Black | | Death | 53 | 11 | | No death | 414 | 37 | **Victim's race: Black** | | | Defendant's race | |---------|------|-------| | | White | Black | | Death | 0 | 4 | | No death | 16 | 139 |