Aging is associated with a systemic length-driven transcriptome imbalance

Thomas Stoeger, Rogan A. Grant, Alexandra C. McQuattie-Pimentel, Kishore Anekalla, Sophia S. Liu, Heliodoro Tejedor-Navarro, Benjamin D. Singer, Hiam Abdala-Valencia, Michael Schwake, Marie-Pier Tetreault, Harris Perlman, William E Balch, Navdeep Chandel, Karen Ridge, Jacob I. Sznajder, Richard I. Morimoto, Alexander V. Misharin, G.R. Scott Budinger, Luis A. Nunes Amaral

Preprint posted on July 03, 2019

Machine learning predicts markers of ageing: A systems biology approach points to an imbalance in RNA transcript length as the best single factor implicated in ageing, more significant than any gene alteration alone.

Selected by Monika S. Magon


Ageing can be described as a complex, progressive change in the homeostasis of molecular, cellular and physiological systems. Given the heterogeneity of its phenotypic and cellular manifestations, a wide range of studies have tried to identify the molecular causes of ageing-associated defects, in the hope that these will be druggable targets. Ageing has several hallmarks, including loss of proteostasis, genomic instability, stem cell exhaustion and metabolic problems [1]. However, high inter-species, inter-individual and even inter-organ or inter-cell-type variability, together with little consistency between study designs, has so far prevented major breakthroughs in finding molecular mechanisms of ageing. Although the search for genetic markers of ageing has found many dysregulated genes, these changes in gene expression are subtle and therefore do not provide informative markers of ageing [2].

In this study, the authors aimed to identify an objective marker by going beyond the expression of individual genes and looking at global transcript architecture and regulatory changes. To do so, they employed unsupervised machine learning methods to analyse the transcriptomes of mice, and performed a meta-analysis of existing data from other species, including humans. This approach allowed the authors to pinpoint a single best predictor of ageing: the balance between long and short transcripts.


Key findings

The results of the machine-learning-based investigation of mouse transcriptomes demonstrate that, in ageing tissues, the abundance of longer transcripts decreases while that of shorter transcripts increases (e.g. Figure 1 in the preprint [3]).

This is confirmed by a meta-analysis of other studies in mice, as well as of killifish, rat and human transcriptomes. The source of this imbalance in transcript length is unknown, but it might be related to transcriptional regulation, post-transcriptional processing or other aspects of RNA biology. The authors point towards SFPQ, a gene encoding a protein involved in transcriptional elongation (amongst its other functions) [4]. Interestingly, a meta-analysis shows that environmental factors known to contribute to the ageing phenotype, such as exposure to pollution, heat, pathogens and sleep deprivation, as well as the occurrence of neurodegenerative diseases, correlate with transcriptome imbalance. Moreover, infecting mice of different ages with influenza A, a common risk factor in ageing, increased the imbalance in transcript length abundance, especially in the oldest mice.

Thanks to a rigorous evaluation of the sensitivity and accuracy of the machine learning predictions, and to the use of large datasets, the authors could confirm that the imbalance in the abundance of the shortest and longest transcripts correlates with biological processes involved in ageing. They show that the longest and shortest transcripts (the 5% most extreme lengths) arise from genes relevant to ageing, in processes such as proteostasis, chromatin organisation, mitochondrial function and neuronal activity.
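To make the idea of a length-driven imbalance more concrete, here is a minimal sketch of how one might quantify it. It is not the authors' pipeline: the data are synthetic, the function names (`ranks`, `spearman`, `extreme_medians`) are hypothetical, and reading the "5%" as the shortest and longest 5% of transcripts is an assumption. The sketch computes a Spearman rank correlation between transcript length and the log2 fold change of old versus young samples, then compares the median fold change at the two length extremes.

```python
import math
import random
from statistics import median


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation, i.e. the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def extreme_medians(lengths, log2_fc, fraction=0.05):
    """Median log2 fold change of the shortest and longest `fraction` of transcripts."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    n = max(1, int(len(order) * fraction))
    shortest = median(log2_fc[i] for i in order[:n])
    longest = median(log2_fc[i] for i in order[-n:])
    return shortest, longest


# Synthetic, illustrative data only (NOT taken from the preprint):
# longer transcripts are given slightly lower fold changes, plus noise.
rng = random.Random(0)
lengths = [10 ** rng.uniform(2.5, 5.0) for _ in range(2000)]
log2_fc = [-0.2 * (math.log10(n) - 3.75) + rng.gauss(0, 0.3) for n in lengths]

rho = spearman(lengths, log2_fc)
short_med, long_med = extreme_medians(lengths, log2_fc)
print(f"Spearman rho = {rho:.2f}")
print(f"median log2FC: shortest 5% = {short_med:.2f}, longest 5% = {long_med:.2f}")
```

In this toy setting, a negative correlation together with a positive median fold change for the shortest transcripts and a negative one for the longest would correspond to the pattern described above. A rank-based measure is used here because transcript lengths span several orders of magnitude.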


Opinion and questions

In modern times, the population is ageing at an unprecedented rate. According to the World Health Organization, by 2020 the number of people aged 60 years and older will outnumber children younger than 5 years [5]. I believe that this manuscript will open new research avenues in ageing and, in the future, have a broader impact on our society.

In this sense, this work presents an integrative and interdisciplinary approach, which could bring more systems biology approaches to ageing research. I also chose to highlight this preprint because of its relevance to a wider area of science. One could imagine that transcript length imbalance might similarly be involved in the control of development. In addition, transcriptome architecture might prove important in neurodegenerative diseases, which are often seen as accelerated ageing processes [6].

Moreover, as data science, machine learning and artificial intelligence advance, we will see more and more of these approaches in basic and applied science. This study sets an excellent example of approaches that have not yet been widely adopted in modern biology.

What comes next? While transcript length imbalance has been studied quite extensively in this preprint, it would be great to see the other identified factors associated with ageing (e.g. the number of transcription factors) investigated, perhaps in a separate study.



  1. López-Otín, C., et al. The Hallmarks of Aging. Cell 153, 1194–1217 (2013).
  2. Cellerino, A. & Ori, A. What have we learned on aging from omics studies? Semin. Cell Dev. Biol. 70, 177–189 (2017).
  3. Stoeger, T., et al. Aging is associated with a systemic length-driven transcriptome imbalance. bioRxiv 691154 (2019).
  4. Takeuchi, A., et al. Loss of SFPQ Causes Long-Gene Transcriptopathy in the Brain. Cell Reports 23, 1326–1341 (2018).
  6. Liu, Y., Cali, C. P. & Lee, E. B. RNA metabolism in neurodegenerative disease. Dis. Model. Mech. 10, 509–518 (2017).

Tags: aging, machine learning, systems biology, transcript length

Posted on: 4th September 2019, updated on: 5th September 2019
