1. Some Historical Context

A powerful language with a troubling origin

Modern statistical inference methods (confidence intervals and hypothesis tests) have proven to be powerful tools across many disciplines. Some of the most impressive real-world examples of the use of these methods (in my opinion) include the discovery of the Higgs Boson and the development of the first drug to treat HIV infection. These methods are useful in day-to-day social, scientific, and political work as well. For instance, sampling distribution theory can be used to target disaster relief efforts and provide quantitative evidence of discrimination or bias.

However, these methods of inference also grew out of the eugenics movement. As one recent paper puts it: “The history of statistics includes many regrettable elements related to systemic racism and eugenics. A recent article in Nautilus, entitled How Eugenics Shaped Statistics, is sobering, to put it mildly.”

Source: 2021 paper on Inclusivity in Statistics and Data Science Education

What are eugenics and scientific racism?

Eugenics is the scientifically erroneous and immoral theory of “racial improvement” and “planned breeding,” which gained popularity during the early 20th century. Eugenicists worldwide believed that they could perfect human beings and eliminate so-called social ills through genetics and heredity. They believed the use of methods such as involuntary sterilization, segregation and social exclusion would rid society of individuals deemed by them to be unfit.

Scientific racism is an ideology that appropriates the methods and legitimacy of science to argue for the superiority of white Europeans and the inferiority of non-white people whose social and economic status have been historically marginalized. Like eugenics, scientific racism grew out of:

the misappropriation of revolutionary advances in medicine, anatomy and statistics during the 18th and 19th centuries.
Charles Darwin’s theory of evolution through the mechanism of natural selection.
Gregor Mendel’s laws of inheritance.

Eugenic theories and scientific racism drew support from contemporary xenophobia, antisemitism, sexism, colonialism and imperialism, as well as justifications of slavery, particularly in the United States.

Source: https://www.genome.gov/about-genomics/fact-sheets/Eugenics-and-Scientific-Racism

Modern landscape: Statistics as a language of science

“[No] individual statistical analysis should be considered sufficient to establish scientific validity: research requires many sets of data along many lines of evidence, with a watchfulness for systematic error. Replicating and predicting findings in new data and new settings is a stronger way of validating claims than blessing results from an isolated study with statistical inferences.”

“Statistical thinking also involves a keen awareness of the pitfalls of data analysis and its interpretation”

“Statistical significance is not a measure of practical, clinical, or scientific significance”

Source: Statistical Inference Enables Bad Science; Statistical Thinking Enables Good Science

2. Principles and Guidelines

In 2016, the American Statistical Association developed (and has continued to revisit and update) the following Ethical Guidelines for Statistical Practice. These guidelines are special in that they are not meant only for professional statisticians. These guidelines are designed to assist any statistical practitioner who wishes to use the language of statistics in an ethical manner.

PRINCIPLE A: Professional Integrity and Accountability

Takes responsibility for evaluating potential tasks, assessing whether they have (or can attain) sufficient competence to execute each task and that the work and timeline are feasible. Does not solicit or deliver work for which they are not qualified or that they would not be willing to have peer reviewed.
Uses methodology and data that are valid, relevant, and appropriate, without favoritism or prejudice, and in a manner intended to produce valid, interpretable, and reproducible results.
Does not knowingly conduct statistical practices that exploit vulnerable populations or create or perpetuate unfair outcomes.
Opposes efforts to predetermine or influence the results of statistical practices and resists pressure to selectively interpret data.
Accepts full responsibility for their own work, does not take credit for the work of others, and gives credit to those who contribute. Respects and acknowledges the intellectual property of others.
Strives to follow, and encourages all collaborators to follow, an established protocol for authorship. Advocates for recognition commensurate with each person’s contribution to the work. Recognizes that inclusion as an author does imply, while acknowledgement may imply, endorsement of the work.
Discloses conflicts of interest, financial and otherwise, and manages or resolves them according to established policies, regulations, and laws.
Promotes the dignity and fair treatment of all people. Neither engages in nor condones discrimination based on personal characteristics. Respects personal boundaries in interactions and avoids harassment, including sexual harassment, bullying, and other abuses of power or authority.
Takes appropriate action when aware of deviations from these guidelines by others.
Acquires and maintains competence through upgrading of skills as needed to maintain a high standard of practice.
Follows applicable policies, regulations, and laws relating to their professional work, unless there is a compelling ethical justification to do otherwise.
Upholds, respects, and promotes these guidelines. Those who teach, train, or mentor in statistical practice have a special obligation to promote behavior that is consistent with these guidelines.

PRINCIPLE B: Integrity of Data and Methods

Communicates data sources and fitness for use, including data generation and collection processes and known biases. Discloses and manages any conflicts of interest relating to the data sources. Communicates data processing and transformation procedures, including missing data handling.
Is transparent about assumptions made in the execution and interpretation of statistical practices, including methods used, limitations, possible sources of error, and algorithmic biases. Conveys results or applications of statistical practices in ways that are honest and meaningful.
Communicates the stated purpose and the intended use of statistical practices. Is transparent regarding a priori versus post hoc objectives and planned versus unplanned statistical practices. Discloses when multiple comparisons are conducted and any relevant adjustments.
Meets obligations to share the data used in the statistical practices (e.g., for peer review and replication) as allowable. Respects expectations of data contributors when using or sharing data. Exercises due caution to protect proprietary and confidential data, including all data that might inappropriately harm data subjects.
Strives to promptly correct substantive errors discovered after publication or implementation. As appropriate, disseminates the correction publicly and/or to others relying on the results.
For models and algorithms designed to inform or implement decisions repeatedly, develops and/or implements plans to validate assumptions and assess performance over time, as needed. Considers criteria and mitigation plans for model or algorithm failure and retirement.
Explores and describes the effect of variation in human characteristics and groups on statistical practice when feasible and relevant.

PRINCIPLE C: Responsibilities to Stakeholders

Seeks to establish what stakeholders hope to obtain from any specific project. Strives to obtain sufficient subject-matter knowledge to conduct meaningful and relevant statistical practice.
Regardless of personal or institutional interests or external pressures, does not use statistical practices to mislead any stakeholder.
Uses practices appropriate to exploratory and confirmatory phases of a project, differentiating findings from each so the stakeholders can understand and apply the results.
Informs stakeholders of the potential limitations on use and re-use of statistical practices in different contexts and offers guidance and alternatives, where appropriate, about scope, cost, and precision considerations that affect the utility of the statistical practice.
Explains any expected adverse consequences from failing to follow through on an agreed-upon sampling or analytic plan.
Strives to make new methodological knowledge widely available to provide benefits to society at large. Presents relevant findings, when possible, to advance public knowledge.
Understands and conforms to confidentiality requirements for data collection, release, and dissemination and any restrictions on its use established by the data provider (to the extent legally required). Protects the use and disclosure of data accordingly. Safeguards privileged information of the employer, client, or funder.
Prioritizes both scientific integrity and the principles outlined in these guidelines when interests are in conflict.

PRINCIPLE D: Responsibilities to Research Subjects, Data Subjects, or Those Directly Affected by Statistical Practices

Keeps informed about and adheres to applicable rules, approvals, and guidelines for the protection and welfare of human and animal subjects. Knows when work requires ethical review and oversight.
Makes informed recommendations for sample size and statistical practice methodology to avoid the use of excessive or inadequate numbers of subjects and excessive risk to subjects.
For animal studies, seeks to leverage statistical practice to reduce the number of animals used, refine experiments to increase the humane treatment of animals, and replace animal use where possible.
Protects people’s privacy and the confidentiality of data concerning them, whether obtained from the individuals directly, other persons, or existing records. Knows and adheres to applicable rules, consents, and guidelines to protect private information.
Uses data only as permitted by data subjects’ consent, when applicable, or considers their interests and welfare when consent is not required. This includes primary and secondary uses, use of repurposed data, sharing data, and linking data with additional data sets.
Considers the impact of statistical practice on society, groups, and individuals. Recognizes that statistical practice could adversely affect groups or the public perception of groups, including marginalized groups. Considers approaches to minimize negative impacts in applications or in framing results in reporting.
Refrains from collecting or using more data than is necessary. Uses confidential information only when permitted and only to the extent necessary. Seeks to minimize the risk of re-identification when sharing de-identified data or results where there is an expectation of confidentiality. Explains any impact of de-identification on accuracy of results.
To maximize contributions of data subjects, considers how best to use available data sources for exploration, training, testing, validation, or replication as needed for the application. The ethical statistical practitioner appropriately discloses how the data is used for these purposes and any limitations.
Knows the legal limitations on privacy and confidentiality assurances and does not over-promise or assume legal privacy and confidentiality protections where they may not apply.
Understands the provenance of the data—including origins, revisions, and any restrictions on usage—and fitness for use prior to conducting statistical practices.
Does not conduct statistical practice that could reasonably be interpreted by subjects as sanctioning a violation of their rights. Seeks to use statistical practices to promote the just and impartial treatment of all individuals.

3. Examples from this semester

Visualizing Numeric Data

We’ve seen that histograms are useful descriptors of numeric data, but that their appearance can vary widely depending on the sample size and the bin size. We discussed why descriptive analyses should include both visual and mathematical summaries of numeric data. This came up again in Unit 3, when we needed to assess the “nearly Normal” condition for inference about an unknown mean with a small sample size.
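To make the bin-size point concrete, here is a minimal sketch in Python. The sample values and the helper name `histogram_counts` are invented for illustration (they are not course data or a course function), and the sketch assumes every value is at least `start`. The same 16 values are binned twice, once with wide bins and once with narrow bins.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0):
    """Count the values falling in each bin [start + k*w, start + (k+1)*w)."""
    counts = Counter(int((v - start) // bin_width) for v in values)
    return [counts.get(k, 0) for k in range(max(counts) + 1)]

# Hypothetical sample, invented for illustration (not course data).
sample = [1, 2, 2, 3, 3, 3, 8, 9, 9, 10, 10, 10, 11, 12, 18, 19]

wide = histogram_counts(sample, bin_width=10)   # two bins
narrow = histogram_counts(sample, bin_width=4)  # five bins

# Print a rough text histogram for each choice of bin width.
for label, width, counts in [("wide", 10, wide), ("narrow", 4, narrow)]:
    print(f"{label} bins (width {width}):")
    for k, count in enumerate(counts):
        print(f"  [{k * width:>2}, {(k + 1) * width:>2}) {'#' * count}")
```

With a bin width of 10 the counts are 9 and 7, which looks roughly flat. With a bin width of 4 the counts are 6, 0, 7, 1, 2, which reveals two clusters separated by an empty bin. Neither picture is wrong, which is one reason to report numeric summaries alongside any single plot.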

Here’s a Reddit page that’s dedicated to bad data visualizations. Take a look around and see what you think about some of the top examples on this page.

Organizing and Collecting Data

Considering Categorical Variables

How are the levels of a categorical variable determined?

Social and cultural context
Motivation of the main research question

Researchers must often consider a balance between transparency of their research methods and data privacy. (For example, researchers must be careful about data identifiability.) Another careful balance in statistical analyses involves respecting personal identities while still gathering useful information from data.

Race

Eugenics and scientific racism statement from the National Human Genome Research Institute

American Society of Human Genetics Statement Regarding Concepts of “Good Genes” and Human Genetics

Gender or sex

A call for better practices

A rebuttal

Sex and gender analysis improves science and engineering

Here’s an example of a chi-squared procedure that can yield very different results depending on how gender is categorized.
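As a separate, purely hypothetical illustration of the same idea (the counts below are invented and are not taken from the linked example), the following Python sketch runs a chi-squared test of independence on one made-up survey under two codings of gender: one that keeps three categories and one that drops nonbinary respondents. The function names are assumptions made for this sketch, no continuity correction is applied, and the p-value helper only covers the two degrees-of-freedom cases needed here.

```python
import math

def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi_square_p_value(stat, df):
    """Upper-tail p-value, using the closed forms that exist for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")

# Hypothetical counts of (yes, no) responses, invented for illustration.
women, men, nonbinary = [40, 60], [50, 50], [16, 4]

# Coding 1: keep all three gender categories.
stat_three, df_three = chi_square_independence([women, men, nonbinary])
p_three = chi_square_p_value(stat_three, df_three)

# Coding 2: drop nonbinary respondents and compare women and men only.
stat_two, df_two = chi_square_independence([women, men])
p_two = chi_square_p_value(stat_two, df_two)

print(f"three categories: chi2 = {stat_three:.2f}, df = {df_three}, p = {p_three:.4f}")
print(f"two categories:   chi2 = {stat_two:.2f}, df = {df_two}, p = {p_two:.4f}")
```

For these invented counts, the three-category coding gives a p-value below 0.05 while the two-category coding does not, so the conclusion about an association depends on a categorization decision made before any test is run.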

Assumptions and Conditions in Statistical Inference

All statistical inference methods rely on one or more assumptions. A fundamental assumption that is easily overlooked is that the sample being analyzed is actually representative of the population to which the results are generalized. By definition, an assumption is not something we can ever prove true, though we may be able to disprove one with a clever counterexample. In that sense, playing “devil’s advocate” is an important part of robust statistical inference. An analyst who earnestly looks for counterexamples to a required assumption (such as independence of the data) will be able to present an analysis within the context of its limitations. A statistical analysis conducted with this kind of rigor and shared transparently ultimately improves the quality of science. However, a lack of transparency can often lead to better individual outcomes, such as more publications and funding awards, at least in the short term. For example, here’s a case of a bad statistical analysis that involved finding a confidence interval for a difference in proportions (for those who are interested, there is more writing on this example here and here).
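For reference, a confidence interval for a difference in proportions can be sketched in a few lines of Python. The counts are hypothetical and the function names are assumptions made for this illustration. The sketch uses the usual large-sample interval, (p1 - p2) ± z* · sqrt(p1(1 - p1)/n1 + p2(1 - p2)/n2), together with a success-failure check that uses a threshold of 10.

```python
import math
from statistics import NormalDist

def success_failure_ok(x, n, minimum=10):
    """Check that a group has at least `minimum` successes and failures."""
    return x >= minimum and (n - x) >= minimum

def diff_in_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2 from two independent groups."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    return (p1 - p2) - margin, (p1 - p2) + margin

# Hypothetical counts, invented for illustration: 45 of 100 versus 30 of 100.
x1, n1, x2, n2 = 45, 100, 30, 100

conditions_met = success_failure_ok(x1, n1) and success_failure_ok(x2, n2)
lower, upper = diff_in_proportions_ci(x1, n1, x2, n2)

print(f"success-failure condition met: {conditions_met}")
print(f"95% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

The code can check the success-failure condition, but it cannot check whether the observations are independent or whether the samples are representative. Those assumptions have to be argued from how the data were collected, which is where the devil’s advocate role matters most.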

Q) What do you think is an effective way to hold statistical practitioners accountable for their work?

Stat 11 Week 15

Statistics in the Real World

Prof. Suzy Thornton