Generative modeling of single-cell population time series for inferring cell differentiation landscapes

Grace H.T. Yeo, Sachit D. Saksena, David K. Gifford

Preprint posted on August 27, 2020

Learn Waddington’s landscape and predict perturbed cell fate by computationally rolling cells on the landscape

Selected by Yen-Chung Chen
The figure is adapted from figure 1c and 1f and licensed under a CC-BY 4.0 International license

Background and context

How a zygote gives rise to countless types of cells forming different tissues and organs remains the greatest mystery of development. With recent technical advances, we are now able to observe the process of progenitor cells becoming mature cells with unprecedented resolution. In particular, single-cell RNA-seq allows the capture of gene expression of thousands of developing cells, so we can depict development as a gradual change of cellular states defined by all the genes expressed in the cell.

To become a mature cell, a progenitor cell is first specified: it becomes biased to differentiate toward a specific cell type, or in other words, biased to accept a certain cell fate. The cell fate consolidates during development, meaning cells lose the ability to become other cell types even when perturbed. This model predicts that if transcriptomic profiling faithfully captures development, there must be an early state of gene expression that corresponds to specification, in which the gene expression is sufficient to guide the cell to differentiate autonomously into a mature cell type. Similarly, there would be a later state following specification in which the change in gene expression restricts the developmental potential, so the cell can no longer accept another fate even when genes are mis-expressed.

Transcriptomic profiling studies often aim to match single-cell transcriptomes to the stages of development and to link fate specification and commitment to gene expression. Many analysis methods summarize single-cell transcriptomes during development as a branching process and seek gene expression differences between the branches [reviewed in 1]. Other analyses couple earlier and later transcriptomes: by comparing the early and late distributions of gene expression, they estimate the probability that a given early state gives rise to each cell type [2, 3].

Although the aforementioned techniques have shown their power in describing and identifying gene expression changes in development that are critical for the specification and commitment of cell fate, there are a few unresolved challenges. First, when samples are collected from different developmental stages, it is often difficult to integrate the “real time” that sample collection time represents into the analysis. Additionally, while the developmental process can be reconstructed, it is not straightforward to find genes that play larger roles in committing to a cell fate.

This preprint features a generative neural network that learns the developmental landscape from single-cell transcriptomes collected at different developmental stages. The model represents the landscape by its slope at each point. Development of a cell is modeled as its movement on the landscape, with the direction and velocity determined by the slope and random noise. With this model, it is possible to pick an arbitrary transcriptome on the landscape and simulate its future developmental trajectory.
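To make the idea of rolling a cell on a landscape concrete, here is a minimal sketch in Python. It is not PRESCIENT's implementation: the one-dimensional double-well potential, the function names (`potential_gradient`, `simulate_cell`), and the step size and noise values are all assumptions chosen for illustration, whereas PRESCIENT learns its landscape with a neural network from data. Each step moves the cell down the local slope and adds Gaussian noise (an Euler-Maruyama update).

```python
import math
import random


def potential_gradient(x):
    """Slope dU/dx of a toy double-well potential U(x) = (x**2 - 1)**2.

    The two valleys at x = -1 and x = +1 stand in for two cell fates.
    This potential is a hypothetical example, not a learned landscape.
    """
    return 4.0 * x * (x * x - 1.0)


def simulate_cell(x0, n_steps=500, dt=0.01, noise=0.5, rng=None):
    """Roll one cell on the landscape, returning its full trajectory."""
    rng = rng or random.Random()
    x = x0
    path = [x]
    for _ in range(n_steps):
        drift = -potential_gradient(x) * dt            # move downhill
        diffusion = noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # random kick
        x += drift + diffusion
        path.append(x)
    return path


# A cell that starts on the ridge between the two valleys.
trajectory = simulate_cell(x0=0.0, rng=random.Random(0))
print(f"start={trajectory[0]:+.2f}  end={trajectory[-1]:+.2f}")
```

With the noise set to zero, the cell simply slides into the nearest valley. With noise, a cell starting near the ridge at x = 0 can end up in either valley, which is what makes repeated simulation informative.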

Key findings

PRESCIENT predicts unseen intermediate cell states

The model (Potential eneRgy undErlying Single Cell gradIENTs, or PRESCIENT) was tested on a hematopoiesis dataset [4] in which cells carry genetically encoded “barcodes”, so that cell fate can be resolved by tracing cell lineages. The barcoded cell fate information provided a gold standard to evaluate the performance of the model. When trained only on day 2 and day 6 of differentiation, PRESCIENT predicted the unseen transcriptomes at day 4 with a smaller prediction error than the cell fate prediction method Waddington-OT [5].

PRESCIENT predicts cell fate from progenitor transcriptomes with greater power when proliferation information is included

PRESCIENT was then trained on data from all time points and used to predict cell fate from progenitor transcriptomes. The prediction is presented as fate bias, the probability that a given progenitor becomes each cell type. Fate bias is estimated by simulating 2,000 trajectories with PRESCIENT and computing the proportion of simulations that end up as a neutrophil or as a monocyte. When cell proliferation information was not provided, PRESCIENT predicted fate with power similar to that of existing methods, including Waddington-OT and FateID; when cell proliferation information was provided, PRESCIENT outperformed both.
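The fate bias estimate described above can be sketched as a simple Monte Carlo calculation. The code below is an illustration under stated assumptions, not the authors' code: it uses a hypothetical one-dimensional double-well potential in place of the learned landscape, treats the right-hand valley as a stand-in for one fate and the left-hand valley for the other, and the names `final_state` and `fate_bias` are invented for this example. Only the number of simulated trajectories (2,000) comes from the text.

```python
import math
import random


def potential_gradient(x):
    """Slope of a toy double-well potential U(x) = (x**2 - 1)**2 (assumed)."""
    return 4.0 * x * (x * x - 1.0)


def final_state(x0, rng, n_steps=300, dt=0.01, noise=0.5):
    """Simulate one noisy downhill trajectory and return where it ends."""
    x = x0
    for _ in range(n_steps):
        x += -potential_gradient(x) * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x


def fate_bias(x0, n_sims=2000, seed=0):
    """Fraction of simulated trajectories from x0 that end in the right valley."""
    rng = random.Random(seed)
    n_right = sum(final_state(x0, rng) > 0.0 for _ in range(n_sims))
    return n_right / n_sims


# Progenitors starting left of, on, and right of the ridge at x = 0.
for start in (-0.8, 0.0, 0.8):
    print(f"start={start:+.1f}  bias toward right valley={fate_bias(start):.3f}")
```

In this toy setting, a progenitor that starts on the ridge is split roughly evenly between the two fates, while one that starts well inside a valley almost always stays there. The same counting logic applies regardless of how the landscape itself was obtained.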

PRESCIENT provides a generalizable model that predicts cell fate on unseen data and identifies potential fate regulators

To test whether PRESCIENT can be applied to unseen data, the authors compared its fate prediction performance when trained on only a subset of the data versus all of the data. Performance was similar in both cases.

The authors went on to ask whether PRESCIENT can predict the fate of a transcriptome in which the expression of known cell fate regulators was perturbed in silico. The authors computationally altered the expression of genes known to promote neutrophil generation and provided the perturbed transcriptome to PRESCIENT. With over-expression of neutrophil-promoting genes, PRESCIENT predicted an increase in neutrophil production, while knockdown was predicted to decrease neutrophil production.

A similar simulation was performed for known monocyte-promoting genes and on another dataset of pancreas development, and the predictions were consistent with prior knowledge. Notably, in the pancreas dataset, PRESCIENT successfully predicted not only the fate bias resulting from perturbation but also the competent stages (stages before commitment when the perturbation can change the cell fate).

Finally, the authors performed a computational perturbation of 200+ transcription factors in pancreas development to screen for those capable of inducing fate biases. The screen resulted in 28 candidates, including known regulators of the specification of alpha and beta cells.

Brief summary

Yeo and Saksena et al. developed PRESCIENT, which depicts development as a landscape on which the cell moves according to the slope and random noise. The slope of the developmental landscape is learned with a neural network trained on gene expression at different developmental stages after dimensionality reduction with principal component analysis. This approach uses the sample collection time to represent real developmental time and does not depend on inferring developmental “pseudotime” from the profiled cells. The generative nature of PRESCIENT makes it capable not only of identifying the progenitor-progeny relationship within the cells profiled, but also of predicting the terminal fate of a virtual progenitor that could be perturbed computationally.

Open questions

  1. It appears that the prediction power varies with the structure of the neural network and the regularization strength. Would you suggest optimizing these parameters per experiment (for example, for different tissue types or species)?
  2. PRESCIENT is implemented in the PCA space to learn the potential function (representing the force acting on the cell on a developmental landscape). Is the efficiency of PRESCIENT sensitive to the choice of dimensionality reduction technique (like PCA vs ICA or manifold learning)? If so, is there a reason why PCA is preferred?
  3. PRESCIENT takes real time into consideration in an efficient way. Although this gives PRESCIENT more power, the information might be very hard to obtain in some contexts. For example, neurogenesis is asynchronous in many species, and progenitors and neurons can be collected together during a wide time window (around 4 days in mouse spinal cord, for example [6]). Asynchronous differentiation might make real time deviate from the developmental time of individual cells. Would PRESCIENT still be applicable in this scenario?


References

  1. Saelens, W., Cannoodt, R., Todorov, H., Saeys, Y. (2019). A comparison of single-cell trajectory inference methods. Nature Biotechnology 37(5), 547-554.
  2. Schiebinger, G., Shu, J., Tabaka, M., Cleary, B., Subramanian, V., Solomon, A., Gould, J., Liu, S., Lin, S., Berube, P., Lee, L., Chen, J., Brumbaugh, J., Rigollet, P., Hochedlinger, K., Jaenisch, R., Regev, A., Lander, E. (2019). Optimal-Transport Analysis of Single-Cell Gene Expression Identifies Developmental Trajectories in Reprogramming. Cell 176(4), 928-943.e22.
  3. Weinreb, C., Wolock, S., Tusi, B., Socolovsky, M., Klein, A. (2018). Fundamental limits on dynamic inference from single-cell snapshots. Proceedings of the National Academy of Sciences 115(10), 201714723.
  4. Weinreb, C., Rodriguez-Fraticelli, A., Camargo, F., Klein, A. (2020). Lineage tracing on transcriptional landscapes links state to fate during differentiation. Science 367(6479), eaaw3381.
  5. Schiebinger, G., Shu, J., Tabaka, M., Cleary, B., Subramanian, V., Solomon, A., Gould, J., Liu, S., Lin, S., Berube, P., Lee, L., Chen, J., Brumbaugh, J., Rigollet, P., Hochedlinger, K., Jaenisch, R., Regev, A., Lander, E. (2019). Optimal-Transport Analysis of Single-Cell Gene Expression Identifies Developmental Trajectories in Reprogramming. Cell 176(4), 928-943.e22.
  6. Delile, J., Rayon, T., Melchionda, M., Edwards, A., Briscoe, J., Sagner, A. (2019). Single cell transcriptomics reveals spatial and temporal dynamics of gene expression in the developing mouse spinal cord. Development 146(12), dev173807.

Tags: neural network, single cell rna seq

Posted on: 6th November 2020

