Self-reporting transposons enable simultaneous readout of gene expression and transcription factor binding in single cells
Preprint posted on February 01, 2019 https://www.biorxiv.org/content/10.1101/538553v1
Making a mark on gene regulatory networks: A method using transposon insertions directed by a transcription factor allows simultaneous mapping of transcription factor binding sites and gene expression in single cells.
James Briscoe
One aspect of preprints that I’ve found particularly useful is the rapid communication of innovative new methods. This is especially true in fast-moving fields such as single-cell assays. I think the work by Moudgil et al. is a good example of this.
At the heart of developmental mechanisms are gene regulatory networks – collections of transcriptional regulators that interact with each other, through the cis-regulatory elements they bind, to control gene expression and hence cell identity and function. Methods that simultaneously assay, in individual cells, the transcriptome and the genomic binding pattern of specific transcription factors (TFs) would offer new insight into gene regulatory networks. In this preprint, Moudgil et al. develop a method to do just this.
To identify TF binding sites, the authors previously described a technique based on a fusion between a TF of interest and a transposase (Wang et al., 2012). The TF-transposase chimera is introduced into cells along with a reporter-harbouring transposon. As a result, the TF-transposase directs insertion of the reporter-transposon into DNA near the TF binding sites. The authors refer to these insertions as “calling cards”, which can be amplified and their locations determined by high-throughput sequencing of genomic DNA.
To make the technique compatible with transcriptome assays, the authors extended the technique by developing “self-reporting transposons (SRTs)”. To do this they removed the polyadenylation signal from the reporter-transposon and added a ribozyme after the terminal repeat to minimize reads from the non-integrated reporter. These clever tricks allow transcription of the reporter gene through the transposon into the flanking genomic sequence. The location of an insertion event can then be identified from mRNA by the sequence of the 3’ untranslated regions (UTRs) of reporter gene transcripts.
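The read-through idea can be illustrated with a toy example. This is only a minimal sketch, not the authors' pipeline: it assumes each reporter transcript read contains the end of the transposon terminal repeat followed directly by genomic sequence, and the function name `extract_genomic_flank` and the repeat sequence `TR_END` are hypothetical placeholders, not the real piggyBac sequence.

```python
def extract_genomic_flank(read, terminal_repeat_end):
    """Return the sequence downstream of the terminal repeat, or None.

    The returned flank is the part of the transcript that read through
    the transposon into the genome; aligning it to a reference genome
    (not done here) would give the insertion site.
    """
    idx = read.find(terminal_repeat_end)
    if idx == -1:
        # No transposon junction in this read.
        return None
    flank = read[idx + len(terminal_repeat_end):]
    # An empty flank carries no positional information.
    return flank or None


# Placeholder sequences for illustration only (assumptions, not real data).
TR_END = "CATGCG"
read = "AACATGCGTTAAGCATCG"

flank = extract_genomic_flank(read, TR_END)
no_junction = extract_genomic_flank("AAAA", TR_END)
```

In practice the flank would be passed to a read aligner; the sketch only shows why a transcript that continues past the terminal repeat is enough to locate the insertion.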
The authors first validate the technique in populations of cells transfected with the transcription factor SP1 fused to the transposase by demonstrating that calling cards sequenced from mRNA UTRs overlap with positions in the genome that are known to bind SP1. The transposase used, piggyBac, naturally interacts with the bromodomain protein BRD4, which itself associates with acetylated histones and active enhancers. The authors turn this potential bug into a feature by demonstrating that, if not fused to a specific TF, piggyBac can be used to map BRD4 bound regions in the genome using the SRT technique.
Finally, the authors modify their protocol to apply it to single cells – scCC (single cell Calling Cards). mRNA from single cells transfected with the SRT system was used to recover both the calling cards, indicating the locations of transposon insertions, and the transcriptome. The cell barcodes from the single-cell library preparation allow transcriptome data to be paired with calling cards, so that transposon insertions can be assigned to specific cell types. Proof-of-principle experiments in cell lines were followed by mapping of Brd4 binding and gene expression in individual cells from the mouse cortex. The authors demonstrate differential Brd4 binding in excitatory neurons located in different layers of the cortex, providing evidence that SRTs can be used to map transcriptional regulators in vivo.
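The barcode pairing step can be sketched in a few lines. This is an illustrative toy, not the authors' code: it assumes cell types have already been assigned to barcodes from the transcriptome data, and the names `assign_insertions_to_cell_types`, `barcode_to_cell_type` and `insertions`, along with all values, are hypothetical.

```python
from collections import defaultdict


def assign_insertions_to_cell_types(insertions, barcode_to_cell_type):
    """Group insertion sites by the cell type of the cell they came from.

    insertions: iterable of (cell_barcode, chromosome, position) tuples.
    barcode_to_cell_type: dict mapping cell barcode -> cell type label.
    Insertions whose barcode has no cell type are skipped.
    """
    by_type = defaultdict(list)
    for barcode, chrom, pos in insertions:
        cell_type = barcode_to_cell_type.get(barcode)
        if cell_type is None:
            continue
        by_type[cell_type].append((chrom, pos))
    return dict(by_type)


# Toy inputs (assumed for illustration).
barcode_to_cell_type = {
    "AAAC": "upper-layer neuron",
    "GGTT": "lower-layer neuron",
}
insertions = [
    ("AAAC", "chr1", 1000),
    ("GGTT", "chr2", 500),
    ("AAAC", "chr1", 1200),
    ("TTTT", "chr3", 42),  # barcode absent from the transcriptome data
]

grouped = assign_insertions_to_cell_types(insertions, barcode_to_cell_type)
```

Because any single cell yields only a few insertions, pooling them by cell type in this way is what makes cell-type-specific binding profiles recoverable.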
Why I like the preprint
I think the approach is elegant and original, with the potential for further refinement. It has similarities with some other recently developed techniques, such as targeted DamID (e.g. Cheetham et al., 2018), which identifies TF binding sites using a TF fused to a DNA methyltransferase that methylates GATC sites in the neighbourhood of binding. But scCC allows the simultaneous recovery of mRNA and TF binding locations in individual cells, which makes it possible to infer the link between TF binding and gene regulation. Moreover, the approach is of broad interest as it is sufficiently flexible to apply to almost any TF, in any cell type or tissue, in any species.
Many modifications to the system can be imagined. The authors mention the idea of using calling card insertions as molecular records of cell lineage or specific cellular events. Another possibility would be to use two or more distinguishable reporter-transposons, introduced at different times, to examine temporal changes in binding. It’s also possible to imagine using orthogonal transposase-transposon pairs to simultaneously monitor the binding of two TFs in the same cell.
Questions and open issues
As the authors point out, potential limitations are the sparsity of the data from single cells and the inherent bias in the insertion preferences of the transposase. In this context, it would be interesting to know if enhancers that produce eRNAs (~25% of enhancers) might be captured more frequently. Tweaks to the system and scaling up the datasets could address some of the shortcomings. Ultimately, analyzing the correlation between TF binding and the activity of individual genes in populations of many single cells would offer fantastically rich datasets from which to make gene regulatory predictions.
The current system relies on the ectopic expression of the TF-transposase fusion protein. Whether this results in aberrant binding of the TF to sites not normally occupied or whether the expression of the TF-transposase fusion protein has dominant effects that alter the state of the cells will depend on the details of the TF and cell types. Developing the system to allow the regulated expression from an endogenous gene would be one way around this limitation.
In the study, the authors use the interaction between piggyBac and BRD4 to their advantage. However, this could also be a limitation, as it might result in high background or confounding results that complicate the identification of binding events specific to a TF of interest. It was unclear how much the SP1-piggyBac fusion protein is recruited to BRD4-bound sites. Modifications that reduce or abrogate the BRD4 interaction, perhaps by using alternative transposases, would eliminate this concern. In addition, I’d be interested in seeing a comparison between BRD4 binding sites identified with the scCC system and those from other techniques, such as scATAC-seq, that mark accessible chromatin, to see how much overlap there is.
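One simple way to quantify such an overlap is the fraction of calling-card peaks that intersect at least one accessible region. The sketch below is only an illustration under stated assumptions: intervals are half-open `(chromosome, start, end)` tuples, the function names and all coordinates are made up, and the quadratic scan would need replacing with an interval index for real datasets.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(query, reference):
    """Fraction of query intervals overlapping any reference interval."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)


# Toy intervals (assumed values, not taken from the preprint).
calling_card_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
accessible_regions = [("chr1", 150, 300), ("chr2", 10, 40)]

fraction = fraction_overlapping(calling_card_peaks, accessible_regions)
```

Here only the first peak intersects an accessible region, so one of the three peaks counts as overlapping.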
- Wang H, Mayhew D, Chen X, Johnston M, Mitra RD. (2012) “Calling cards” for DNA-binding proteins in mammalian cells. Genetics. 190:941-9
- Cheetham SW, Gruhn WH, van den Ameele J, Krautz R, Southall TD, Kobayashi T, Surani MA, Brand AH. (2018) Targeted DamID reveals differential binding of mammalian pluripotency factors. Development 145:dev170209
Posted on: 20th February 2019