Conserved phosphorylation hotspots in eukaryotic protein domain families

Marta J Strumillo, Michaela Oplova, Cristina Vieitez, David Ochoa, Mohammed Shahraz, Bede P Busby, Richelle Sopko, Romain A Studer, Norbert Perrimon, Vikram G Panse, Pedro Beltrao

Preprint posted on August 13, 2018

Rules of the game: extensive comparative study reveals phosphorylation hotspots in key eukaryotic protein domains

Selected by Gautam Dey


Post-translational modifications (PTMs), and protein phosphorylation in particular, serve as the cell’s precision rewiring tools. The addition (or removal) of a single charged phosphate group can alter a protein’s enzymatic activity, drive a binding partner switch, affect folding and stability, or force a change in subcellular location, all in a matter of seconds!

Advances in mass spectrometry have made it possible to map sites of protein phosphorylation on a massive scale. In recent years, this has produced an overwhelming wealth of genome-wide data across a range of cell types and species, along with new analytical challenges. Take the human genome: of approximately 160,000 non-redundant phosphosites [1], only a tiny fraction is annotated, and many might not actually be functional [2], with little or no contribution to fitness. How, then, to assess the functional relevance of all this data?

Evolutionary comparisons can help, based on the argument that highly conserved phosphosites are likely to be functional. However, these comparisons are not without their own challenges. Many functional phosphosites lie within unstructured regions of proteins, presumably relaxing selective constraints on their positions. Further complicating such analyses, individual phosphosites have the capacity to flip to acidic residues (and back) on relatively short evolutionary timescales [3], rewiring signaling cascades in the process [4].

Building on work initiated when he was a postdoc at UCSF [2], Pedro Beltrao and colleagues circumvent these challenges to generate what is, to my knowledge, the most comprehensive comparative analysis of phosphosites, encompassing more than 500,000 phosphosites across 40 eukaryotic genomes.


Major findings 

The authors selected a subset of 344 well-represented Pfam domain families with a high density of phosphorylation sites. Using a rolling window to account for alignment and assignment errors, and a background expectation generated by randomly permuting phosphosites to equivalent residues within the same sequence, they identified significant “hotspots” within 162 of the 344 families. Encouragingly, the hotspots were enriched for known functional phosphosites and, once mapped onto structural models, recovered well-characterized regulatory motifs (Figure 1).

Figure 1: Reproduced from Figure 2 of Strumillo et al. 2018 under a CC-BY-NC-ND 4.0 international license. Enrichment over random of protein phosphorylation along the domain sequence, shown here for 4 domains. The average number of phosphosites observed per rolling window is plotted as a solid black line (observed). The background level of expected phosphorylation, calculated from random sampling, is shown as a gray line, with standard deviations as a gray band. The blue line represents the negative logarithm of the p-value at each position (right y axis). A horizontal red line indicates a cut-off at a Bonferroni-corrected p-value of 0.01. Positions with a -log(p-value) above this cut-off and an average number of phosphosites per window higher than 2 are considered putative regulatory regions and are highlighted with a vertical yellow bar. Red circles indicate human phosphosite positions with known regulatory function. In the structural representations, the predicted hotspot regions are highlighted in yellow.
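To make the hotspot logic concrete, here is a minimal Python sketch of a rolling-window enrichment test against a permuted background. It is an illustration under stated assumptions, not the authors' pipeline: the input layout (per-sequence lists of alignment columns), the function names, the window half-width, the number of permutations and the use of a normal approximation to turn the background mean and standard deviation into a p-value are all choices made here for the example. The cut-offs (Bonferroni-corrected p-value of 0.01, more than 2 phosphosites per window) follow the figure legend, with the window total standing in for the legend's per-window average.

```python
import math
import random
from statistics import mean, pstdev


def window_counts(site_columns, length, half_width):
    """Count phosphosites within +/- half_width of each alignment column."""
    per_column = [0] * length
    for col in site_columns:
        per_column[col] += 1
    counts = []
    for i in range(length):
        lo, hi = max(0, i - half_width), min(length, i + half_width + 1)
        counts.append(sum(per_column[lo:hi]))
    return counts


def find_hotspots(sequences, length, half_width=2, n_perm=200,
                  alpha=0.01, min_count=2, seed=0):
    """Return alignment columns whose window is enriched for phosphosites.

    sequences: one (candidate_columns, phospho_columns) pair per domain member
    (hypothetical layout). candidate_columns lists the alignment columns of
    residues that could carry a phosphate in that sequence; phospho_columns
    lists the columns of the observed phosphosites.
    """
    rng = random.Random(seed)
    observed = window_counts(
        [col for _, phos in sequences for col in phos], length, half_width)

    # Background: move each sequence's phosphosites to randomly chosen
    # candidate residues of the same sequence, then recount the windows.
    background = []
    for _ in range(n_perm):
        shuffled = []
        for candidates, phos in sequences:
            shuffled.extend(rng.sample(candidates, len(phos)))
        background.append(window_counts(shuffled, length, half_width))

    threshold = alpha / length  # Bonferroni correction over alignment columns
    hotspots = []
    for i in range(length):
        column = [counts[i] for counts in background]
        mu, sd = mean(column), pstdev(column)
        if sd == 0:
            continue  # background never varies here, so no z-score
        z = (observed[i] - mu) / sd
        p = 0.5 * math.erfc(z / math.sqrt(2))  # one-sided normal tail (assumed)
        if p < threshold and observed[i] > min_count:
            hotspots.append(i)
    return hotspots
```

Calling `find_hotspots(sequences, length)` on a domain family returns the alignment columns that pass both cut-offs. Permuting within each sequence, rather than across the whole alignment, keeps the per-protein phosphosite count and residue composition fixed, so a window only stands out when sites from many family members pile up at the same place.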


The authors then looked to generalize their analyses, and found that the hotspots tend to be located proximal to catalytic residues or binding interfaces across a broad range of domain families. Zooming in, they make experimentally tractable functional predictions for uncharacterised hotspots in two enzymes, IMP dehydrogenase and transaldolase. Finally, they selected two phosphorylation sites within a budding yeast ribosomal S11 domain hotspot for an experimental case study of their own, demonstrating a functional role for one of them.


What’s next?

I think it’s quite clear that this study has generated a fantastic resource for the larger community, although, as the authors are quick to point out in their discussion, follow-up structural biology analyses will not necessarily be straightforward. To end with a few open-ended questions:

  • Could the hotspot database be flipped around to help answer evolutionary questions? For example, are there any systematic differences between the organisation of hotspots in apicomplexan parasites and their free-living relatives? Or between multicellular and unicellular organisms?
  • How far are we from a bottom-up, synthetic biology approach to designing protein switches or circuits controlled by phosphorylation?
  • On an even deeper evolutionary timescale: how many of the domain families are present in bacteria or archaea? It would be fascinating to extend the hotspot analysis beyond eukaryotes if feasible!



  1. Hornbeck, P. V. et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 43, D512–D520 (2015).
  2. Beltrao, P. et al. Systematic Functional Prioritization of Protein Posttranslational Modifications. Cell 150, 413–425 (2012).
  3. Pearlman, S. M., Serber, Z. & Ferrell, J. E. A mechanism for the evolution of phosphorylation sites. Cell 147, 934–946 (2011).
  4. Dey, G. & Meyer, T. Phylogenetic Profiling for Probing the Modular Architecture of the Human Genome. Cell Syst. 1, 106–115 (2015).


Tags: comparative genomics, phosphoproteomics

Posted on: 16th August 2018
