A thirty-year trend of increasing clinical orientation at the National Institutes of Health

Brad L. Busse, James M. Tucker, Summer E. Allen, George M. Santangelo, Kristine A. Willis

Posted on: 11 March 2026

Preprint posted on 19 December 2025

and

Prediction of transformative breakthroughs in biomedical research

Matthew T. Davis, Brad L. Busse, Salsabil Arabi, Payam Meyer, Travis A. Hoppe, Rebecca A. Meseroll, B. Ian Hutchins, Kristine A. Willis, George M. Santangelo

Preprint posted on 17 December 2025

Mirror mirror on the wall, what is the fairest funding call?

Selected by Jonathan Townson

What drew me to these preprints

As an early-career researcher, I am currently applying for postdoc positions and grants in the hope of securing a contract or funding to pursue research that interests me. Whilst the analyses here do not offer the crystal ball I desperately yearn for to predict what will make my next application a success, they do show some interesting patterns, and I was excited to highlight these preprints and other similar analyses.

Background

These preprints from the groups of Kristine Willis and George Santangelo bring data and social science together to analyse how significant biological discoveries are made. Busse et al. took a data analysis approach to grants awarded by the NIH between 1986 and 2017 to describe trends in where funding is going, whilst Davis et al. analysed all publications in the PubMed database through to the end of 2017 to see if they show evidence of impending breakthroughs in biomedical research. Together, these preprints examine what happens before and after a research project takes place: trends in the allocation of research funding, and the signs in the literature of an impending breakthrough.

Key findings

Grant calls are increasingly orientated to clinical and translational research

Busse et al. focused their analysis of NIH funding on the so-called R01 grants, which support individual scientists’ research programmes, and compared these with non-R01 grants. They also compared trends in unsolicited announcements, where the subject is investigator-initiated, with those in solicited announcements, where funding is offered for work in a specified area.

What Busse and team found first is that the number of solicited R01 grants and applications plateaued in the mid-90s, whilst the number of non-R01 grants continued to rise (fig. 1A and 1B). They attributed this to an increase in the raw number of non-R01 grants that were offered (fig. 1C). This highlights a shift in how NIH funding has been awarded, given that inflation-adjusted budgets stayed approximately constant.

Three graphs showing changes in the number of non-R01 calls and applications over time. The third is a graphic which includes a timeline of the introduction of non-R01 calls

Figure 1: The number of non-R01 grants has increased. A) The number of solicited funding opportunities for R01 and non-R01 grants announced by the NIH. B) The number of applications to solicited R01 and non-R01 grants. C) A timeline of the introduction of non-R01 grant mechanisms and the resulting change from 1985 to 2017 in the ratio of R01 to non-R01 grants awarded. Adapted from figures 1 and 2 of (Busse et al. 2025), made available under a CC-BY 4.0 International license.

The team next used various measures to look at the clinical orientation of funding announcements. They applied the Medical Subject Headings (MeSH) vocabulary maintained by the National Library of Medicine (NLM) to the text of announcements and then separated the output into three descriptors (animal, human, or molecular & cellular). They saw a large increase in human MeSH terms in non-R01 grant calls, whilst those in R01 calls remained constant (fig. 2A). This was accompanied by an increase in the proportion of solicited non-R01 grants mentioning human subjects, whilst the proportion of unsolicited R01 grants doing so fell (fig. 2B).

Two graphs showing changes in human terms and human subjects of grant calls between R01 and non-R01 calls

Figure 2: The NIH has issued more clinically oriented non-R01 grants. A) The human MeSH terms mentioned in R01 and non-R01 grants. B) Grant applications that mention human subjects broken down by solicited/unsolicited and R01/non-R01 grant mechanisms. Adapted from figure 4 of (Busse et al. 2025), made available under a CC-BY 4.0 International license.

Finally, the preprint authors assessed the output of these grants by analysing papers published between 1986 and 2017 that cite NIH funding, measuring their human MeSH scores and the proportion that mention clinical trials. They showed that the average human MeSH scores and the proportion of clinical trials for NIH-funded publications from this period increased, whilst for non-NIH-funded publications these metrics remained constant or decreased (fig. 3A and 3B). By examining the funding announcements, they showed that this is almost entirely due to an increase in the number of solicited non-R01 calls funding these publications (fig. 3C and 3D).

Four graphs that compare the human MeSH terms and percentage of clinical related articles for the NIH vs non-NIH and within NIH for R01 and non-R01 calls

Figure 3: Publications from NIH grants have been increasingly clinically oriented. A and B) The average human MeSH scores and proportion of clinical trial publications for NIH funded vs non-NIH funded publications. C and D) The average human MeSH scores and proportion of clinical trial publications for NIH funded publications broken down by solicited/unsolicited and R01/non-R01 grant calls. Adapted from figures 5 and 6 of (Busse et al. 2025), made available under a CC-BY 4.0 International license.

Co-citation networks predict breakthroughs five years in advance

In the next preprint from Davis et al., the author team continued to explore what can be learned from data analysis of publications. More specifically, they used a machine learning approach to create a co-citation network of the 17.2 million papers in the PubMed database up to the year 2017. In this network, publications on the same topic cluster together and have greater density and shorter connections to clusters of a related topic (fig. 4A-C). Interestingly, they compared the similarity of papers in the same journal or the same cluster and showed that papers in disciplinary journals are more like each other than those in inter-disciplinary journals, but that papers in the same cluster are more similar still (fig. 4D).

Depiction of the co-citation network showing how cluster neighbours are assigned by high betweenness edges, and the resulting cluster neighbours for super-resolution fluorescence microscopy. Finally, how the co-citation network clusters compare to disciplinary or inter-disciplinary journals for grouping similar publications in the literature.

Figure 4: The clustering and similarity of topics in the co-citation network. A) Topics in neighbouring clusters have a high density of connecting lines. B and C) The super-resolution fluorescence microscopy cluster (1) in the centre surrounded by related clusters (2-6) represented by colouring of the complete sub networks, or word clouds of the titles/abstracts from the publications. D) Pairs of papers are more semantically similar within clusters than within disciplinary journals (Genetics, Neuroscience, Blood) which are more similar than inter-disciplinary journals (Science, Nature, PNAS). Adapted from figures 1 and 2 of (Davis et al. 2025), made available under a CC-BY 4.0 International license.
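
To make the idea of co-citation concrete: two papers are co-cited whenever a third paper lists both in its references, and the more often that happens, the stronger the link between them. The following is only a toy sketch of that counting step, not the pipeline used by Davis et al.; the function name `cocitation_counts` and the example data are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of papers appears together in a reference list.

    reference_lists maps a citing paper to the papers it cites.
    Returns a Counter keyed by (paper_a, paper_b) with paper_a < paper_b.
    """
    counts = Counter()
    for refs in reference_lists.values():
        # sorted(set(...)) drops duplicate references and gives each pair
        # a single canonical ordering
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: citing paper -> papers it cites
toy_refs = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}

edges = cocitation_counts(toy_refs)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

In this toy example, papers A and B are cited together twice, so they would be joined by a heavier edge than A and C, which are cited together only once. Clustering would then operate on a weighted network of this kind.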

The author team next applied their model to generate new networks that would be “current” for each year from 1981 to 2017, so that each network didn’t include “future” publications, e.g. the network for 2008 would not include papers from 2009 onwards. They then tracked clusters over time, enabling them to produce cluster trajectories that split and converge (fig. 5). For each trajectory, they then tested which diagnostic clues could indicate that a major breakthrough (as inferred from an award such as a Nobel prize) was coming. They found that growth and cohesion of the literature, as well as the presence of highly influential papers (those cited more than expected for their age or field), are all important factors. However, the greatest indicator was the percentage of new papers in the cluster compared to the previous year’s cluster. The team also noted that the breakthrough trajectories featured more merging events than other trajectories, suggesting that topics coming together is a factor that could lead to new breakthroughs.

Trajectories of four major breakthrough clusters, showing when a signal and breakthrough occur, coloured by the % of new publications, with each year sized according to the number of publications. Split and merge events can be seen in some cases.

Figure 5: Trajectories of four breakthrough clusters. The breakthrough is labelled with a blue asterisk and the highest % of new publications compared to the preceding clusters (the best signal indicator of a breakthrough) is labelled with a black asterisk. Adapted from figure 3 of (Davis et al. 2025), made available under a CC-BY 4.0 International license.
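
As a rough illustration of the “% new” signal, the sketch below computes the share of papers in a cluster that were not in the preceding year’s cluster. This is an assumption about how such a metric could be calculated, not the definition used in the preprint (which may, for example, treat merged or split clusters differently); the function name `percent_new` and the paper identifiers are hypothetical.

```python
def percent_new(current_cluster, previous_cluster):
    """Percentage of papers in the current cluster absent from the previous one."""
    current = set(current_cluster)
    if not current:
        # An empty cluster has no papers, so no new ones either
        return 0.0
    new_papers = current - set(previous_cluster)
    return 100.0 * len(new_papers) / len(current)


# Hypothetical clusters from two consecutive years
previous = ["p1", "p2", "p3"]
current = ["p2", "p3", "p4", "p5"]

print(percent_new(current, previous))  # 50.0
```

Here two of the four papers in the later cluster are new, giving 50%. A cluster that is rapidly absorbing new literature would score high on a measure of this kind.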

The co-citation network presented in the preprint used article-level metrics, yet many organisations use journal-level metrics to aid decision making. To test if this is a problem, Davis et al. took journal citation rates (JCR) as an approximation for impact factor and found that clusters which signal a breakthrough had an average JCR in the 79th to 97th percentile, meaning 135 control clusters had a better average JCR than the breakthroughs. In the authors’ words: “journal level metrics would generate enough noise to make the signal of a future breakthrough difficult or impossible to detect”.

Finally, they tested whether their co-citation network and logistic regression models could predict future biomedical breakthroughs. To do this, they took two four-year windows and analysed which clusters matched the diagnostic signal of an impending breakthrough. Of the 18 clusters identified, only two have not been recognised with a major award and only one of those two is known to be a false positive (the other may yet receive recognition). The authors also plotted these clusters against their human MeSH scores (fig. 6) and the broad distribution of the clusters indicates that biomedical breakthroughs are not exclusively being made in more clinically oriented work.

Distribution of breakthrough clusters from two time periods according to their Human MeSH scores and % of new publications. Each cluster is coloured by the number of publications and sized by the fraction of NIH support.

Figure 6: Breakthrough signalling clusters from 1994-1997 and 2014-2017. Clusters are plotted according to their human MeSH score and %New, coloured according to the number of publications and sized based on how many received NIH support. Adapted from figure 4 of (Davis et al. 2025), made available under a CC-BY 4.0 International license.

Discussion

The two preprints discussed here tell a part of the story of how research has been funded over the past three decades and the published results that have come from it. They show that research has become increasingly clinical, perhaps reflecting a shift towards the translation of fundamental research into something useful for society. Yet breakthroughs remain broadly distributed and include fundamental research, although it is notable that there may have been a shift towards more clinical breakthroughs over the 20 years from 1994-1997 to 2014-2017 (fig. 6).

Analysis of publications as in Davis et al. (2025) offers fascinating insights into the academic community, who primarily communicate (show off) their research in academic journals to receive more funding or a promotion. Two other analyses of publication data caught my eye when writing this preLight. One shows that for early-career researchers, their postdoc publications may be more important to their future career than their PhD ones (Duan et al. 2025). Yet, this can be considered alongside other research showing that the influence of the PhD supervisor continues long after the student graduates and leaves the lab (Shibayama et al. 2024). The second analysis that caught my eye showed that the reproducibility of claims made by researchers using Drosophila as a model organism to study the immune system is higher than in other areas of research (Lemaitre et al. 2026). It could be interesting to combine this kind of analysis with the co-citation network to see if reproducibility within a cluster and cohesion correlate.

Whilst publications are an important output of a research project and lab group, it might also be wise to examine other outputs, which may be less tangible. What is the quality of scientific training in a group? Is research communicated to the wider community in other ways such as social media, outreach projects, participating in podcasts, interviews and traditional media? Does the group produce patents or spin out start-ups? How environmentally sustainable is the group? These are just some ideas of how research may be valuable beyond academic publications and could also perhaps inform future funding policies.

References

  • Busse, Brad L., James M. Tucker, Summer E. Allen, George M. Santangelo, and Kristine A. Willis. 2025. ‘A Thirty-Year Trend of Increasing Clinical Orientation at the National Institutes of Health’. Preprint, bioRxiv. https://doi.org/10.64898/2025.12.16.694423.
  • Davis, Matthew T., Brad L. Busse, Salsabil Arabi, et al. 2025. ‘Prediction of Transformative Breakthroughs in Biomedical Research’. Preprint, bioRxiv. https://doi.org/10.64898/2025.12.16.694385.
  • Duan, Yueran, Shahan Ali Memon, Bedoor AlShebli, Qing Guan, Petter Holme, and Talal Rahwan. 2025. ‘Postdoc Publications and Citations Link to Academic Retention and Faculty Success’. Proceedings of the National Academy of Sciences 122 (4): e2402053122. https://doi.org/10.1073/pnas.2402053122.
  • Lemaitre, Joseph, Désirée Popelka, Blandine Ribotta, et al. 2026. ‘A Retrospective Analysis of 400 Publications Reveals Patterns of Irreproducibility across an Entire Life Sciences Research Field’. Preprint, eLife, January 19. https://doi.org/10.7554/eLife.108403.1.
  • Shibayama, Sotaro, Pauline Mattsson, and Anders Broström. 2024. ‘Risk in Science: The Socialization of Risk-Taking in Early-Career Training’. Preprint. https://doi.org/10.2139/ssrn.4706997.

Tags: clinical, data science, funding, machine learning, nih, publication
