Developmental regulation of Canonical and small ORF translation from mRNA

Pedro Patraquim, Jose I. Pueyo, M. Ali Mumtaz, Julie Aspden, J.P. Couso

Preprint posted on August 06, 2019 https://www.biorxiv.org/content/10.1101/727339v1

To be, or not to be translated. Patraquim and colleagues clarify the translational status of canonical and short ORFs during Drosophila embryogenesis.

Selected by Lorenzo Lafranchi

Background

Proteins, the building blocks of life, are produced when ribosomes decode open reading frames (ORFs). ORFs were historically identified as stretches of at least 100 in-frame codons beginning with the canonical start codon (AUG) and ending with one of the stop codons. However, recent refinements of transcriptome-wide sequencing technologies, in particular ribosome profiling, have revealed that large fractions of prokaryotic and eukaryotic transcriptomes undergo non-canonical translation. These regions include short ORFs (sORFs) and upstream ORFs (uORFs), whose translation has been shown to produce functional and biologically relevant micropeptides smaller than 100 amino acids. Despite a growing number of experimental validations, the extent and role of non-canonical translation as assessed by sequencing approaches are highly debated, because the binding of ribosomes to mRNA does not always result in productive translation. Usually, ribosomal binding above a certain level and binding that shows tri-nucleotide periodicity (framing) are used as indicators of productive translation. Ribosome profiling of polysomes is also accepted as evidence of productive translation. To reveal the translational control of both short and canonical ORFs during Drosophila melanogaster embryogenesis, the authors of this study refined their previously published polysome sequencing pipeline to discriminate with high confidence between unproductive and productive translation.
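
To make the idea of framing concrete, here is a minimal Python sketch of how tri-nucleotide periodicity could be scored for a single ORF. It is an illustration only, not the authors' pipeline: the input format (footprint P-site positions counted in nucleotides from the first base of the start codon), the function names and the two thresholds are all assumptions made for this example.

```python
from collections import Counter


def frame_fractions(psite_offsets):
    """Return the fraction of footprints falling in reading frames 0, 1 and 2.

    psite_offsets: P-site positions in nucleotides, relative to the first
    base of the start codon (hypothetical input format for this sketch).
    """
    counts = Counter(offset % 3 for offset in psite_offsets)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts.get(frame, 0) / total for frame in range(3)]


def looks_framed(psite_offsets, min_reads=10, min_frame0=0.5):
    """Crude indicator of productive translation.

    Requires enough footprints (min_reads) and a majority of them in the
    ORF's own frame (min_frame0). Both thresholds are illustrative
    assumptions, not values taken from the preprint.
    """
    fractions = frame_fractions(psite_offsets)
    return len(psite_offsets) >= min_reads and fractions[0] >= min_frame0
```

For example, an ORF whose footprints sit mostly at positions divisible by three would pass this check, whereas an ORF with footprints spread evenly across the three frames would count as ribosome-bound but not framed.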

 

Key findings

In this study, Patraquim and colleagues divided Drosophila embryogenesis into three 8-hour windows and collected biological replicates for RNA sequencing and ribosome profiling. To ensure high accuracy in assessing translational state, the authors restricted their analysis to ORFs starting with the canonical start codon and discarded overlapping uORFs. In this way, development-specific transcription and translation of around 40,000 ORFs was evaluated. Overall, 98% of the canonical ORFs, 90% of the sORFs and 72% of the uORFs were bound to ribosomes at some point during development. Interestingly, ribosome occupancy was predicted to result in productive translation for 92% of the canonical ORFs and 73% of the sORFs, whereas only 13% of the uORFs seemed to carry ribosomes engaged in productive translation. When comparing their data to a proteomics dataset, the authors observed a modest but highly significant correlation between the two datasets. This indicates that ribosome binding, as analysed by ribosome footprinting coupled to rigorous bioinformatic analysis, is a good proxy for protein-producing translation. One reason for the modest correlation could be that proteomics approaches, unlike Ribo-Seq, are not able to detect lowly expressed proteins and micropeptides.

During development, 20% of canonical ORFs showed stage-specific transcription and translation, whereas 81% of uORFs and 43% of sORFs seemed to be expressed at specific stages. Calculating the translational efficiency (the ratio between Ribo-Seq and RNA-Seq signal) of all ORFs across developmental stages revealed that the large stage-specific changes observed for uORFs and sORFs are mainly due to transcription. Nevertheless, uORFs and sORFs still seem to be more prone to stage-specific translational control than canonical ORFs.
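
The translational efficiency calculation mentioned above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than the authors' method: the inputs are taken to be already normalised Ribo-Seq and RNA-Seq signals for one ORF, and the function names and the pseudocount (added to avoid dividing by zero) are hypothetical choices for this example.

```python
import math


def translational_efficiency(ribo, rna, pseudocount=1.0):
    """Ratio of Ribo-Seq to RNA-Seq signal for one ORF at one stage.

    ribo, rna: normalised signals (assumed input for this sketch).
    pseudocount: hypothetical smoothing term that avoids division by zero.
    """
    return (ribo + pseudocount) / (rna + pseudocount)


def log2_te_change(ribo_a, rna_a, ribo_b, rna_b, pseudocount=1.0):
    """log2 fold change in translational efficiency from stage A to stage B."""
    te_a = translational_efficiency(ribo_a, rna_a, pseudocount)
    te_b = translational_efficiency(ribo_b, rna_b, pseudocount)
    return math.log2(te_b / te_a)
```

If the Ribo-Seq and RNA-Seq signals of an ORF both double between two stages, the log2 change in TE is zero (with the pseudocount set to 0), which corresponds to a stage-specific change driven by transcription. A non-zero value points to a change in translation that is not explained by mRNA abundance alone.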

Previous studies highlighted that uORFs can act as repressors of the canonical main ORF present in the same mRNA. This cis-regulatory role of uORFs was proposed to be based purely on ribosome binding, without leading to productive translation. The role of uORFs in regulating downstream translation is corroborated by the data presented in this manuscript, which show that, despite being abundantly bound by ribosomes, only a small fraction of uORFs seems to produce micropeptides. Interestingly, extensive non-productive ribosome binding also occurs at canonical ORFs, and this seems to be a specific feature of the maternal-to-zygotic transition.

Supporting the idea that uORFs appear randomly in 5’ leader sequences, the authors found a positive correlation between uORF number and the length of the 5’ leader sequence. Various previous studies suggested that the number of uORFs present in the leader sequence of a transcript negatively correlates with expression of the main ORF. However, the authors could not see such a trend in their dataset. Instead, by studying the effect of uORFs on the translational efficiency (TE) of the main ORF, the authors observed that the majority of uORFs showed changes in TE coordinated with those of their main ORFs, for both down- and upregulation. Based on this observation, uORFs can be seen as positive or negative regulators of translation, but also as simple bystanders without an active role in translation of the main ORF. Finally, the extent of TE variation of the main ORF diminishes as the number of ribosome-bound uORFs in its 5’ UTR increases, suggesting a role for uORFs in stabilizing the translational efficiency of the main ORF.

Due to the large number of uORFs considered, the small fraction of uORFs undergoing productive translation could still yield a considerable number of novel micropeptides (1,489). 81% of these uORFs are translated in a stage-specific manner during Drosophila embryogenesis, indicating that the encoded micropeptides could be biologically relevant for specific cellular events. Finally, the authors calculated sequence conservation and amino acid usage of ribosome-bound-only uORFs, translated uORFs and canonical ORFs. Interestingly, translated uORFs are positioned between ribosome-bound-only uORFs and canonical ORFs in this regard, suggesting that translated uORFs could be in an evolutionary transition from non-coding towards protein-coding sequences.

 

What I like about this work and future directions

Extensive non-canonical translation has been suggested by transcriptome-wide sequencing approaches. Nevertheless, it remains difficult to determine whether binding of a ribosome to a specific sequence results in productive protein synthesis. Patraquim and colleagues developed a robust bioinformatic pipeline that enabled them to discriminate between productive translation and ribosome binding without protein production. In addition, the data presented in this study support a model of protein evolution in which uORFs and sORFs encode a pool of evolutionarily flexible sequences that cells can use to develop novel functionalities. However, as the authors nicely discuss, high-throughput techniques reveal only protein expression, not protein function. Therefore, experimental validation and functional characterization of putative micropeptides are required.

 

Questions

How many of the translated uORFs and sORFs described in this study are novel?

Are there common aspects between a subclass of sORFs and translated uORFs? Or are these two “families” completely unrelated?

Can you speculate on how many of the putative uORFs and sORFs could lead to the production of functional micropeptides?

Are the putative uORFs and sORFs described here conserved in other organisms? Do you think cross-species conservation is a stronger proof for a micropeptide to be functional?

 

Tags: microprotein, sep, smorf, sorf

Posted on: 9th September 2019
