Accurate detection of m6A RNA modifications in native RNA sequences
Posted on: 2 April 2019
Preprint posted on 21 January 2019
Article now published in Nature Communications at http://dx.doi.org/10.1038/s41467-019-11713-9
Hidden in plain sight: A machine learning approach uses sequencing errors to identify native RNA modifications in nanopore sequencing.
Selected by Christian Bates
Categories: bioinformatics, genomics, molecular biology
Background
The RNA content of the cell encodes a vast amount of information. Aside from simply encoding the amino-acid sequence required to build a protein, portions of RNA are capable of regulating cellular processes such as splicing, mRNA translation and chromosome inactivation. In recent years, it has emerged that RNA can be chemically modified in numerous different ways. Collectively, these RNA modifications have been termed the epitranscriptome (‘above transcriptome’): they do not alter the nucleotide sequence of the biomolecule, but they can affect its ability to function in many of the processes outlined above, providing an additional layer of regulatory information [1].
Whilst modifications to RNA species such as rRNA and tRNA have long been known (since as early as the 1960s [2,3]), their impact on mRNA has only recently been appreciated. One such modification is the methylation of a nitrogen atom in adenine, generating m6A. This modification is widespread in higher eukaryotes and can have a profound impact on mRNA stability and translation. However, despite its ubiquity, studying m6A at the transcriptome-wide level has been particularly challenging because m6A does not affect Watson-Crick base pairing. As such, m6A modifications cannot be identified via reverse-transcription methods, which are routinely used to assess other RNA modifications. Therefore, current epitranscriptomic studies interrogating m6A infer its presence through the use of antibodies. These enrich for RNA carrying a specific modification, and the enriched RNA is then identified via high-throughput sequencing platforms.
This preprint from Liu et al. aims to identify m6A modifications directly using an emerging sequencing platform provided by Oxford Nanopore Technologies (ONT). This platform sequences nucleic acids directly, unlike existing next-gen technologies, such as those provided by Illumina, which sequence DNA and RNA by synthesis. Specifically, the ONT platform consists of thousands of individual polymer membranes, each with a single nanopore embedded within it. Nucleic acid is captured by these nanopores and ratcheted through the membrane. Due to the electrochemical potential of each base, this ratcheting perturbs the current between the two sides of the membrane. As the bases are chemically different, each generates a characteristic perturbation in the current, which can be deconvoluted by a recurrent neural network, converting this signal into a sequence of bases (Figure 1).
Key Findings
RNA modifications exert their effects within the cell through their ability to provide unique physico-chemical properties to the modified nucleotide. This unique ‘signature’ can then be read by specific enzymes, subsequently dictating the fate of the modified RNA [4]. With this in mind, Liu et al. begin with the hypothesis that, due to their unique chemical signature, modified nucleotides will also generate a unique change in current intensity as the RNA is ratcheted through the pore. As a consequence, the authors suggest that if a base is modified, it may be less likely to be assigned correctly, particularly if the program used to decipher the nucleotide sequence has not been trained to look for modifications.
To test this hypothesis, Liu et al. sequenced two versions of the same oligo: one containing unmodified adenine, and another in which every adenine was replaced with m6A. This showed that, as they had predicted, m6A-modified reads contained more errors, and that these errors were primarily found at adenine nucleotides. Importantly, these errors were reproduced across several repeats, suggesting that they are not random, but rather the result of some underlying feature of the RNA at that position.
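To make the idea of a position-specific error signal concrete, here is a minimal sketch of how a per-position error frequency could be compared between the two oligo versions. It is an illustration only, not the authors' pipeline: the function name `mismatch_frequency`, the assumption that reads are already aligned to the reference (with `-` marking a deletion), and the toy reads are all hypothetical.

```python
from collections import Counter


def mismatch_frequency(reference, reads):
    """Fraction of reads that disagree with the reference at each position.

    Assumes every read is already aligned to the reference and has the same
    length; '-' marks a deletion and is counted as an error.
    """
    if not reads:
        raise ValueError("need at least one read")
    errors = Counter()
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be aligned to the reference length")
        for pos, (ref_base, read_base) in enumerate(zip(reference, read)):
            if read_base != ref_base:
                errors[pos] += 1
    return [errors[pos] / len(reads) for pos in range(len(reference))]


# Hypothetical toy data, not taken from the preprint.
reference = "GGACU"
unmodified_reads = ["GGACU", "GGACU", "GGACU", "GGGCU"]
modified_reads = ["GGGCU", "GG-CU", "GGACU", "GGUCU"]

unmodified_freq = mismatch_frequency(reference, unmodified_reads)
modified_freq = mismatch_frequency(reference, modified_reads)
```

In this toy example the error frequency at the central adenine is higher for the modified reads than for the unmodified ones, which is the kind of reproducible, position-specific difference described above.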
Next, they went on to show that these errors are sufficient to determine whether a nucleotide is modified, by testing whether modified RRACH (R = G/A; H = A/C/U) oligos can be distinguished from non-modified RRACH oligos. This is important as RRACH is the most common m6A motif; in some cell types, more than 85% of m6A sites were found to occur at this motif [5]. Not only were modified RRACH motifs different from unmodified motifs, but by combining error features of the modified adenine, such as the confidence of the base call and the frequency of errors at that base, a machine learning approach was able to predict whether a motif was modified with 91% accuracy.
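For readers unfamiliar with the degenerate motif notation, the RRACH definition above translates directly into a regular expression. The sketch below is my own illustration rather than code from the preprint; the function name `find_rrach_sites` and the example sequence are hypothetical, and I have assumed that DNA-style input (T instead of U) should be accepted.

```python
import re

# R = G/A, H = A/C/U. The lookahead lets overlapping motifs be reported.
RRACH = re.compile(r"(?=([GA][GA]AC[ACU]))")


def find_rrach_sites(rna):
    """Return (index of the central A, matched 5-mer) for each RRACH motif.

    Indices are 0-based. Input is upper-cased and T is converted to U.
    """
    rna = rna.upper().replace("T", "U")
    return [(match.start() + 2, match.group(1)) for match in RRACH.finditer(rna)]


# Hypothetical example sequence.
sites = find_rrach_sites("CCGGACUAAACAUU")
```

Each reported index points at the adenine that would carry the methyl group if the site were modified, which is the position whose error features a classifier would need to examine.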
Together, these experiments provide proof of principle that ONT sequencing platforms could be used to identify nucleotide modifications on RNA directly. The capacity to do so would greatly improve on the accuracy and resolution of existing technologies.
Why I chose this preprint
The epitranscriptome represents an important facet of the regulation of gene expression. Yet, whilst it has been possible in the past to study m6A modifications at the transcriptome-wide level, previous approaches rely upon the use of antibodies, making them expensive and limited in resolution. This preprint demonstrates the capability of calling m6A modifications directly, with single-nucleotide resolution, using an emerging sequencing platform provided by ONT.
I particularly like the fact that the authors have made their data and code available on public repositories. This allows quick dissemination of their work and also enables other groups to test whether alternative machine learning approaches may call m6A sites with higher accuracy.
Future Directions and Questions
This work highlights the ability to predict modified nucleotides on a synthetic RNA sequence in which either all or none of the adenines are modified. As such, it would be interesting to see whether this approach could call differences on non-uniformly modified RNA extracted directly from cells. I also think it would be interesting to test whether the characteristic differences in base quality and error frequency could be used to differentiate specifically between m6A and other adenine modifications such as m1A, or whether these errors can only be used to infer that some modification has occurred at a specific base.
References
1 Roundtree IA, Evans ME, Pan T, He C. (2017) Dynamic RNA Modifications in Gene Expression Regulation. Cell; 169: 1187–1200. doi:10.1016/j.cell.2017.05.045.
2 Cohn WE. (1960) Pseudouridine, a carbon-carbon linked ribonucleoside in ribonucleic acids: isolation, structure, and chemical characteristics. J Biol Chem; 235: 1488–1498.
3 Holley RW, Everett GA, Madison JT, Zamir A. (1965) Nucleotide Sequences in the Yeast Alanine Transfer Ribonucleic Acid. J Biol Chem; 240: 2122–2128.
4 Sloan KE, Warda AS, Sharma S, Entian KD, Lafontaine DLJ, Bohnsack MT. (2017) Tuning the ribosome: The influence of rRNA modification on eukaryotic ribosome biogenesis and function. RNA Biol; 14: 1138–1152. doi:10.1080/15476286.2016.1259781.
5 Chen T, Hao Y-J, Zhang Y, Li M-M, Wang M, Han W et al. (2015) m6A RNA Methylation Is Regulated by MicroRNAs and Promotes Reprogramming to Pluripotency. Cell Stem Cell; 16: 289–301. doi:10.1016/j.stem.2015.01.016.
doi: https://doi.org/10.1242/prelights.9716