Moving beyond P values: Everyday data analysis with estimation plots

Joses Ho, Tayfun Tumkaya, Sameer Aryal, Hyungwon Choi, Adam Claridge-Chang

Posted on: 1 August 2018, updated on: 3 August 2018

Preprint posted on 26 July 2018

Article now published in Nature Methods at http://dx.doi.org/10.1038/s41592-019-0470-3

A visual, intuitive and widely accessible tool could finally help us move from asking “does it?” to “how much?”

Selected by Dey Lab

Categories: cancer biology, cell biology, clinical trials, developmental biology, molecular biology, neuroscience, scientific communication and education

Context

Statistical analysis in the biological sciences has long been dominated by null-hypothesis significance testing (NHST). Statisticians and quantitatively minded biologists alike have been crying themselves hoarse about the fallacies and intrinsic limitations of this approach for, believe it or not, approximately 75 years¹,². Unfortunately, there has been little consensus on the practical steps needed to achieve significant reform.

The authors illustrate the key limitations of NHST, as well as their proposed solution, using an experimental setup we are all too familiar with: one containing two groups of data points, representing a control and a test/intervention sample. Such an experiment would traditionally be visualized using bar graphs (Fig 1A), box plots (Fig 1B) or perhaps scatterplots (Fig 1C) and analyzed by a Student’s t-test or related NHST variant.

Figure 1: Reproduced from Figure 1 of Ho et al. 2018 under a CC-BY-NC-ND 4.0 international license. Two-group data represented by bar plots (A), box plots (B), and scatter plots with jitter (C). (D) Histogram-like scatter plots with jitter, with null-hypothesis distribution and p-value (red segment). (E) Estimation plot with difference of means distribution and 95% CI (red line).

What is wrong with the status quo?

NHST focuses purely on a binary decision³: whether or not to reject the null hypothesis (that the means of both groups are identical). This diverts attention away from the actual effect size, a problem that bar plots accentuate and that box and scatter plots only moderately mitigate.

Visualizing the null distribution and the p-value threshold (Fig 1D, red tail) helps drive home the issues with NHST. First, even an infinitesimally small intervention to any real system will produce at least some effect, making the zero-effect hypothesis intrinsically flawed⁴. Second, since the p-value threshold (usually 0.05) actually lies within the tail of the null distribution, we are concluding that control and test samples are different by demonstrating that they are sometimes the same!
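To make the null distribution concrete, here is a minimal sketch of how one could be simulated for a two-group experiment. It uses a permutation test, which is an assumption on my part and not necessarily how the distribution in Fig 1D was generated; the data values and the function name `permutation_p_value` are made up for illustration.

```python
import random
from statistics import mean

def permutation_p_value(control, test, n_resamples=5000, seed=0):
    """Two-sided permutation p-value for the difference of means.

    Group labels are shuffled to simulate the null hypothesis that both
    groups come from the same distribution.
    """
    rng = random.Random(seed)
    observed = mean(test) - mean(control)
    pooled = list(control) + list(test)
    n_control = len(control)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        # Difference of means after randomly reassigning group labels.
        null_diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(null_diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (extreme + 1) / (n_resamples + 1)

# Made-up measurements, for illustration only.
control = [4.1, 5.0, 4.7, 5.3, 4.4, 4.9, 5.1, 4.6]
test = [5.2, 5.9, 5.5, 6.1, 5.0, 5.8, 6.3, 5.4]

observed, p = permutation_p_value(control, test)
print(f"observed difference = {observed:.2f}, p = {p:.4f}")
```

Note that the p-value only says how often shuffled labels produce a difference at least as large as the observed one. It says nothing about how large the effect is, or how precisely it has been measured.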

How to fix it?

Estimation plots focus on the difference of means (Fig 1E). The visual representation helps focus attention on the effect size, which is what we (should) actually care about. The 95% confidence interval⁵ (red line in Fig 1E), which by definition encompasses the bulk of the ∆ sampling-error distribution, is more intuitively grasped and much better behaved than the p-value. In this case, we are concluding that control and test samples are different by demonstrating that they are almost always different.
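As a rough sketch of the quantities an estimation plot reports, the difference of means and an approximate 95% confidence interval can be computed as follows. This version uses a textbook normal approximation (∆ ± 1.96 standard errors), which is a swapped-in simplification and not the method used in the preprint; the function name `mean_difference_ci` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def mean_difference_ci(control, test, z=1.96):
    """Difference of means with an approximate 95% CI (normal approximation)."""
    delta = mean(test) - mean(control)
    # Standard error of the difference, allowing unequal group variances.
    se = sqrt(variance(control) / len(control) + variance(test) / len(test))
    return delta, delta - z * se, delta + z * se

# Made-up measurements, for illustration only.
control = [4.1, 5.0, 4.7, 5.3, 4.4, 4.9, 5.1, 4.6]
test = [5.2, 5.9, 5.5, 6.1, 5.0, 5.8, 6.3, 5.4]

delta, lo, hi = mean_difference_ci(control, test)
print(f"difference of means = {delta:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

With samples this small, a t-distribution critical value would be more appropriate than 1.96; the fixed value is used here only to keep the sketch short.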

Why I chose this preprint

I loved this preprint! The estimation plot provides a complete yet visually accessible description of the data, and working through the steps in Figure 1 has given me a visual framework to interpret what I thought I understood about hypothesis testing. More importantly, the authors go to great lengths to make estimation plotting broadly accessible by providing five different ways to create them, ranging from Python code to a handy web tool that requires no programming experience whatsoever. Go ahead, try it out!

References:

1. Berkson, J. Tests of Significance Considered as Evidence. J. Am. Stat. Assoc. 37, 325–335 (1942).
2. Halsey, L. G., Curran-Everett, D., Vowler, S. L. & Drummond, G. B. The fickle P value generates irreproducible results. Nat. Methods 12, 179–185 (2015).
3. McShane, B. B. & Gal, D. Statistical Significance and the Dichotomization of Evidence. J. Am. Stat. Assoc. 112, 885–895 (2017).
4. Cohen, J. The earth is round (p < .05). Am. Psychol. 49, 997–1003 (1994).
5. Cumming, G. Understanding The New Statistics. (Routledge, 2011). doi:10.4324/9780203807002

Tags: quantitative biology, significance testing, statistics for biology

doi: https://doi.org/10.1242/prelights.4025

A brief interview with the authors

Joses Ho and Adam Claridge-Chang shared

Could you tell us a little bit about how the project started? For example, was the tool a side effect of your ongoing work on estimation statistics, motivated by the needs of other research projects in the group, or a directed effort to address a general shortcoming in the field?

It started back when Adam and I overlapped at Oxford’s human genetics centre. It is a hub of activity around genome-wide association studies (GWAS), and uses a host of sophisticated statistical tools. As part of my PhD on language genetics, I became familiar with GWAS p-values and the odds ratio, a number GWAS uses to express relative disease risk. So that experience was my first contact with effect sizes.

Around the same time, Adam, who does experimental neurogenetics, was frustrated by the p-value rollercoaster that so many experience: one day a phenotype is significant, the next day it isn’t. He had also heard about effect sizes at Oxford, and when he moved to Singapore he took the time to read some textbooks on the topic, including Statistics with Confidence by Douglas Altman and others, and Geoff Cumming’s Understanding the New Statistics. The concepts and tools in those books are pretty eye-opening.

So when I graduated and returned to Singapore to start in Adam’s lab as the resident data scientist, he handed me a pile of these textbooks to read so I could retrain in estimation statistics. Since then, we have used meta-analysis (which is used widely in clinical settings) to synthesise thirty years of research on short-term memory in flies, and to systematically review over 300 preclinical studies in rodent anxiety. Our paper on fly anxiety-like behaviours used meta-analytic data to compare our results to rodent studies, and also used estimation statistics to analyse and present our results.

Adam also loans new lab members his well-worn copy of Edward Tufte’s The Visual Display of Quantitative Information, and our group makes an effort to apply Tuftian principles when working on figures for manuscripts. In early 2016, Adam remarked to me that the confidence intervals for the effect sizes could be depicted the way Gardner and Altman did in their textbook (see Figures 1 and 2 in this PDF), and also that we could use bootstrap methods to obtain the full effect-size distribution (‘∆ curve’).

The benefits of using the bootstrap were immediately obvious: we did not have to make assumptions about the underlying population (which Gardner, Altman, and Cumming do), and I could depict the confidence interval as a graded distribution, and so indicate a likelihood of values for the effect size rather than just a point estimate and hard error-bar boundaries.

I started writing a version in Python for internal lab use, and along the way we gave it the name Data Analysis with Bootstrap-coupled ESTimation (DABEST). The first version of DABEST and the webapp estimationstats.com were released in late 2017.

So it really grew out of our own frustration with significance testing, and a desire for better tools for ourselves. Then, once we were happy with it internally, it made sense to share it with everyone else.
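The bootstrap idea described in this answer can be sketched in a few lines of standard-library Python. This is only an illustration under simple assumptions (a plain percentile interval, made-up data, and a hypothetical function name `bootstrap_mean_diff`); it is not the DABEST implementation, which may compute its intervals differently.

```python
import random
from statistics import mean

def bootstrap_mean_diff(control, test, n_resamples=5000, seed=0):
    """Bootstrap distribution of mean(test) - mean(control), with a percentile 95% CI."""
    rng = random.Random(seed)
    deltas = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(test, k=len(test))
        deltas.append(mean(t) - mean(c))
    deltas.sort()
    # Percentile interval: cut off 2.5% of the resampled differences at each end.
    lower = deltas[round(0.025 * n_resamples)]
    upper = deltas[round(0.975 * n_resamples) - 1]
    return mean(test) - mean(control), lower, upper, deltas

# Made-up measurements, for illustration only.
control = [4.1, 5.0, 4.7, 5.3, 4.4, 4.9, 5.1, 4.6]
test = [5.2, 5.9, 5.5, 6.1, 5.0, 5.8, 6.3, 5.4]

delta, lower, upper, deltas = bootstrap_mean_diff(control, test)
print(f"difference of means = {delta:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

The sorted list `deltas` is the resampled effect-size distribution itself, so it can be drawn as a histogram or density curve alongside the interval, rather than reducing the result to a point estimate with hard error bars.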

As you discuss in your paper, statisticians and biologists alike have been working on alternatives to NHST statistics for years without any sort of consensus in the community. Do you think the easy accessibility and visual nature of your tool could help shift the balance a bit? Your preprint has already triggered significant discussion on social media platforms. Do you think this could be leveraged into lasting impact?

Student’s t-test has incredible brand recognition among scientists, so a key motivation for the creation of the webapp was indeed an attempt to improve the branding of estimation methods. We’re not exactly marketing experts, but we hope that improving awareness and accessibility will encourage some to make the switch. Adam has given several talks in the past where he has tried to get scientists to use estimation methods as an alternative to NHST. In doing this he realised he needed a simple handle people could easily grasp and remember, so he decided on ‘estimation statistics’. I also attempted to get other laboratories to use my Python code, but the need to learn programming was a major barrier to adoption, so it became clear I needed to be able to say: “There’s an app for that.”

While we were targeting basic biomedical researchers, one surprise is that our tool has gotten a fair amount of interest from other areas: ecologists, sports scientists, psychologists and others. We do hope that estimation plots have the potential to change the data-analysis culture. Still, p-values have been under fire for over 75 years and are still going strong, so maybe we’ll be doomed to use them forever?

Anything else you’d like to tell us about the paper, estimationstats.com, or what’s next for you and your research group?

We’ve submitted the paper, and hope to see it in print, but are encouraged and pleased with the reception the preprint’s gotten. v0.1.4 of DABEST, which features aesthetic tweaks, will be released very shortly as well.
