Ep 26: Deborah Mayo on Error, Replication, and Severe Testing

Deborah G. Mayo is professor emerita in the Department of Philosophy at Virginia Tech, a research associate at the London School of Economics, and a pioneer of the “Error Stats” method for testing scientific claims. We discuss the history of the problem of induction, the approach she has developed for appraising scientific claims, and ideas from her most recent book, “Statistical Inference as Severe Testing”.

Related links:

Error Statistics Blog


Deborah Mayo’s publications

My analysis of the global warming data

Statistical Inference As Severe Testing by Deborah G. Mayo (2018)

Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability, and the Objectivity and Rationality of Science by Deborah G. Mayo & Aris Spanos (2009)

One thought on “Ep 26: Deborah Mayo on Error, Replication, and Severe Testing”

  1. The problem with the data dredger’s inference is not that it uses a method with poor long-run error control. It should be clear from the replication crisis that what bothers us about P-hackers and data dredgers is that they have done a poor job in the case at hand. They have found data to agree with a hypothesized effect, but they did so by means of a method that very probably would have found some such effect (or other) even if spurious. As Popper would say, the inference has passed a weak, and not a […]

     One final remark: It’s important to see that in many contexts the “same” data can be used to erect a model or claim as well as to provide a warranted test of the claim. (I put quotes around “same” because the data are actually remodeled.) Examples include using data to test statistical model assumptions, DNA matching, and reliable estimation procedures. It may even be guaranteed that a method will output a claim or model in accordance with data. The problem is not guaranteeing agreement between data and a claim; the problem is doing so even though the claim is false or specifiably false. We shouldn’t confuse cases where we’re trying to determine if there even is a real effect that needs explaining (arguably, the key role for statistical significance tests) and cases where we have a […]
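
     The comment's remark about a method that “very probably would have found some such effect (or other) even if spurious” can be illustrated with a small simulation. This is only a sketch under stated assumptions, not anything taken from the episode or the book: it assumes a dredger who looks at 20 independent outcomes, none with a real effect, and reports whichever one reaches p < 0.05. It relies on the standard fact that a valid p-value is uniformly distributed when the null hypothesis is true. The function names and the numbers 20 and 0.05 are hypothetical choices made for illustration.

```python
import random


def dredge_hit_rate(n_outcomes=20, alpha=0.05, n_studies=20_000, seed=0):
    """Estimate how often a dredger finds at least one 'significant' result
    when every one of the n_outcomes hypotheses is actually null.

    Assumption: outcomes are independent, so each null p-value is an
    independent draw from Uniform(0, 1).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # One simulated study: a p-value for each outcome, all under the null.
        p_values = [rng.random() for _ in range(n_outcomes)]
        # The dredger reports success if any single outcome crosses alpha.
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


def analytic_hit_rate(n_outcomes=20, alpha=0.05):
    """Exact probability of at least one p < alpha among independent nulls."""
    return 1 - (1 - alpha) ** n_outcomes


if __name__ == "__main__":
    print("one prespecified test:", dredge_hit_rate(n_outcomes=1))
    print("best of 20 outcomes:  ", dredge_hit_rate(n_outcomes=20))
    print("analytic, 20 outcomes:", round(analytic_hit_rate(20), 4))
```

     With a single prespecified test the simulated rate stays near the nominal 0.05, while hunting across 20 null outcomes produces some “finding” in roughly 64% of studies (1 − 0.95^20 ≈ 0.64). In the comment's terms, agreement between data and claim was close to guaranteed whether or not the claim was true, so the reported effect has not passed a demanding test in the case at hand.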
