Science should be machine-readable

A. Sina Booeshaghi, Laura Luebbert, Lior Pachter

Posted on: 3 May 2026

Preprint posted on 2 February 2026

Science in the Age of Robots: the authors of this study use an AI-based tool to analyse scientific manuscripts.

Selected by Theodora Stougiannou

Peer review & Standardization of scientific reporting

The process of peer review, established in the 20th century, generally involves the evaluation of research findings presented in a scientific manuscript by expert peers in the field. Though crucial for improving scientific papers and their results, and for preserving the rigor of the scientific method, it is not without faults. Human peer review is prone to mistakes, and although it can improve the quality of published research, the public may be losing faith in the scientific enterprise as a whole [1].

Is standardizing the evaluation of science using AI the answer to these issues? The authors of this preprint built and used an LLM-based tool for analysing scientific information to find out.

Benchmarking peer review & New models

Evaluating peer review is not straightforward, for several reasons: peer review reports are often not readily available; the reports reflect the state of the manuscript at submission rather than at publication; and restrictive licences (CC BY-NC-ND, for example) are in use. Such licences prohibit the use of LLMs on both the paper and its associated peer reviews [1].

To better facilitate peer review evaluation, eLife adjusted its publication system in early 2023; it uses the CC BY licence, which allows peer review reports to be published alongside the respective final manuscript versions. This method of publication has been described as the Publish, Review, Curate (PRC) model, which treats submitted manuscripts as an ‘open conversation without the threat of rejection’ [2]. In this model, the number of review rounds is designated ‘N’ and the number of reviewers is designated ‘k’. Each PRC cycle thus generally entails N rounds of peer review, with each manuscript receiving reviews from k reviewers. During each cycle the authors are given an opportunity to respond and make revisions, while the journal publishes updates to the draft. After a certain number of review rounds, the preprint is considered a reviewed preprint, available via a GitHub repository in a machine-readable format (JATS XML) [1].

But what is JATS (Journal Article Tag Suite)?

JATS is an XML format developed by the National Information Standards Organization (NISO), described as a standard way to annotate scientific articles. Like HTML, it uses tags for the structural elements of a scientific manuscript, including its title, abstract, author list, sections and figures. This allows for a consistent, standard vocabulary across different journals and preprint servers. Even JATS files, however, can contain more information than an LLM can parse. The authors of this preprint therefore used a tool (jats) that parses these XML files and reduces their size substantially (by ~65%). The papers processed in this way were used as part of the study dataset (OpenEval Dataset) [1].
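To make the idea of tagged structural elements concrete, here is a minimal sketch in Python that reads a small, hand-written JATS-like document and keeps only the title, authors, abstract and section titles. It is not the authors' jats tool and makes no claim about how that tool works; the sample document and the function names (text_of, summarize_jats) are assumptions made for illustration. Only standard JATS element names (article-title, contrib, abstract, sec) and the Python standard library are used.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written JATS-like document (illustrative, not a real eLife file).
JATS_SAMPLE = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>An example study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Doe</surname><given-names>Jane</given-names></name>
        </contrib>
      </contrib-group>
      <abstract><p>We report an <italic>example</italic> result.</p></abstract>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>Background text.</p>
    </sec>
    <sec>
      <title>Results</title>
      <p>Main finding.</p>
    </sec>
  </body>
</article>"""


def text_of(element):
    """Return all text inside an element, with whitespace collapsed."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def summarize_jats(xml_string):
    """Extract title, authors, abstract and section titles from JATS XML."""
    root = ET.fromstring(xml_string)
    authors = []
    for contrib in root.iter("contrib"):
        # Keep only contributors explicitly tagged as authors.
        if contrib.get("contrib-type") != "author":
            continue
        name = contrib.find("name")
        if name is not None:
            given = text_of(name.find("given-names"))
            surname = text_of(name.find("surname"))
            authors.append(f"{given} {surname}".strip())
    return {
        "title": text_of(root.find(".//article-title")),
        "authors": authors,
        "abstract": text_of(root.find(".//abstract")),
        "sections": [text_of(t) for t in root.findall("./body/sec/title")],
    }


summary = summarize_jats(JATS_SAMPLE)
print(summary)
```

Because every structural element carries its own tag, the same few lookups work on any article that follows the standard, which is what makes the format convenient for automated analysis.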

What are the authors of this preprint looking at?

This study looked at 2,487 eLife papers published via the PRC model, along with their associated peer reviews, and 13,600 eLife papers that appeared before 2023 (and were therefore part of the ‘traditional’ journal model). The authors developed the OpenEval Benchmark to evaluate, categorize and group the information in these papers using an AI-based tool. The process begins with the identification of a relevant paper and its peer review reports; the paper is then taken through the steps of the OpenEval Benchmark. These start with claim extraction, followed by grouping of the identified claims according to the scientific result they support; claims are denoted with the variable ‘C’, whilst results are denoted with the variable ‘r’.

This process thus generates Cr sets (the claims supporting each result), and each result is further categorized as major or minor. In addition, each result is evaluated as supported, unsupported, or uncertain. To evaluate the benchmark, the evaluations generated by OpenEval Benchmark are compared against those produced by human (non-LLM) peer review of the same papers, extracted from a peer review file [1].
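The grouping and comparison steps can be sketched as a small data structure plus an agreement score. This is a hypothetical illustration only: the Result class, the agreement function and all example labels are invented here and are not taken from the preprint, which may measure agreement quite differently.

```python
from collections import Counter
from dataclasses import dataclass, field

# The three evaluation labels described in the text.
STATUSES = ("supported", "unsupported", "uncertain")


@dataclass
class Result:
    """One scientific result r together with its supporting claims (Cr)."""
    result_id: str
    importance: str  # "major" or "minor"
    claims: list = field(default_factory=list)


def agreement(llm_labels, human_labels):
    """Return the fraction of shared results labelled identically by both
    evaluators, plus a count of each (LLM label, human label) pair."""
    for labels in (llm_labels, human_labels):
        for status in labels.values():
            if status not in STATUSES:
                raise ValueError(f"unknown status: {status}")
    shared = sorted(set(llm_labels) & set(human_labels))
    if not shared:
        return 0.0, Counter()
    pairs = Counter((llm_labels[r], human_labels[r]) for r in shared)
    matches = sum(n for (a, b), n in pairs.items() if a == b)
    return matches / len(shared), pairs


# Made-up example: three results, each with its own set of claims.
results = [
    Result("r1", "major", ["c1", "c2"]),
    Result("r2", "minor", ["c3"]),
    Result("r3", "major", ["c4", "c5"]),
]
llm = {"r1": "supported", "r2": "uncertain", "r3": "unsupported"}
human = {"r1": "supported", "r2": "supported", "r3": "unsupported"}

rate, confusion = agreement(llm, human)
print(f"agreement: {rate:.2f}")
```

Keeping the pair counts alongside the single agreement rate shows where the two evaluators diverge, for example a result the LLM marks uncertain that human reviewers consider supported.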

Key aspects of the study: The OpenEval Benchmark

A comparison between LLM and human review of scientific claims and results across the 2,487 manuscripts sourced from eLife and published via the PRC model [1].

Why this work is interesting

Standardization of scientific output, whether of experimental stages and protocols or of the format of the scientific text itself, is critical for evaluating the scientific information acquired. This is especially crucial today, when new scientific results are constantly generated and the scientific endeavor branches into many different areas, ranging from physics, astronomy and astrobiology, through engineering and materials science, to biology and biomedicine. The LLM-based tools discussed in this work need not replace human input; rather, they can augment and organize it, helping to structure and understand the breadth of existing scientific knowledge and protocols.

Glossary

References

[1] Booeshaghi AS, Luebbert L, Pachter L. Science should be machine-readable 2026:2026.01.30.702911. https://doi.org/10.64898/2026.01.30.702911.

[2] Currie G. Open Science: What is publish, review, curate? eLife 2024. https://elifesciences.org/inside-elife/dc24a9cd/open-science-what-is-publish-review-curate (accessed April 1, 2026).

[3] Zhang Y, Chen Q. A Neural Span-Based Continual Named Entity Recognition Model. Proceedings of the AAAI Conference on Artificial Intelligence 2023;37:13993–4001. https://doi.org/10.1609/aaai.v37i11.26638.

[4] Wu S, He Y. Enriching Pre-trained Language Model with Entity Information for Relation Classification. Proceedings of the 28th ACM International Conference on Information and Knowledge Management 2019. https://dl.acm.org/doi/10.1145/3357384.3358119 (accessed April 18, 2026).

[5] GeeksforGeeks. Natural Language Processing (NLP) Tutorial. GeeksforGeeks 2026. https://www.geeksforgeeks.org/nlp/natural-language-processing-nlp-tutorial/ (accessed April 18, 2026).

[6] Metropolitansky D, Larson J. Towards Effective Extraction and Evaluation of Factual Claims. In: Che W, Nabende J, Shutova E, Pilehvar MT, editors. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vienna, Austria: Association for Computational Linguistics; 2025, p. 6996–7045. https://doi.org/10.18653/v1/2025.acl-long.348.

[7] Michelakis E, Krishnamurthy R, Haas PJ, Vaithyanathan S. Uncertainty management in rule-based information extraction systems. Proceedings of the 2009 ACM SIGMOD International Conference on Management of data, New York, NY, USA: Association for Computing Machinery; 2009, p. 101–14. https://doi.org/10.1145/1559845.1559858.

 
