Single cell sequencing of the small and AT-skewed genome of malaria parasites

Shiwei Liu, Adam C. Huckaby, Audrey C. Brown, Christopher C. Moore, Ian Burbulis, Michael J. McConnell, Jennifer L. Güler

Preprint posted on June 21, 2020

Tools for studying the AT-skewed genome of malaria parasites.

Selected by Mariana De Niz


Anti-malarial drug efficacy is threatened by the frequent emergence of resistant populations. One of the major sources of genomic variation in P. falciparum that contributes to antimalarial resistance is copy number variation (CNV), the amplification or deletion of a genomic region. A high rate of CNVs may initiate genomic changes that contribute to the rapid adaptation of an organism. Despite the importance of CNVs, relatively little is known about them. The majority of CNVs in P. falciparum have been identified by analysing bulk DNA, where the CNVs are present in a large fraction of individual parasites in the population. However, it is likely that many CNVs remain undetected because they are present at low frequency. Although various methods have improved the detection of low frequency CNVs, they still do not offer complete CNV detection. Recently, single cell analysis of heterogeneous populations has proven advantageous for detecting rare genetic variants that might be overlooked in population averages.

Nevertheless, short read sequencing requires much more genomic material for library construction than the genomic content of an individual Plasmodium cell. This means whole genome amplification (WGA) is required to generate sufficient DNA quantities. However, most WGA techniques are designed for mammalian cells. To date, the detection of CNVs in single P. falciparum parasites using whole genome sequencing has not been achieved, and the application of existing WGA methods has been hindered by the parasite’s small genome size and imbalanced base composition. Recognizing that an effective P. falciparum WGA method must be highly sensitive and able to handle imbalanced base composition, Liu et al. (1) present here a single cell sequencing pipeline which includes efficient isolation of single infected erythrocytes, an optimized WGA step inspired by a technique called “multiple annealing and looping-based amplification cycles” (MALBAC), and a sensitive method of assessing sample quality prior to sequencing.

Figure 1. Experimental workflow for single P. falciparum-infected erythrocyte isolation, amplification, and sequencing. Visualization in CellRaft AIR system using microscopy. (From Ref. 1.)

Key findings and developments

The authors began by determining whether Plasmodium falciparum genomes from single infected erythrocytes could be amplified by MALBAC. The sequencing pipeline included stage-specific parasite enrichment, isolation of single infected erythrocytes, cell lysis, whole genome amplification, pre-sequencing quality control, whole genome sequencing, and analysis steps. For this pipeline, the authors used both a laboratory line (Dd2) and a clinical sample. For single cell isolation, they used the microscopy-based CellRaft AIR system, which has the benefit of low capture volume and visual confirmation of parasite stages. On isolated samples they then applied either the standard MALBAC protocol or an alternative version optimized for the small, AT-rich P. falciparum genome. This latter method allowed amplification of 43% of the early and 90% of the late stage laboratory samples, and 100% of the clinical samples. The authors then used droplet digital PCR (ddPCR) to assess the quality of WGA products from single cells and, together with a ‘uniformity score’, selected the genomes that had been more evenly amplified. Samples amplified with the optimized MALBAC protocol were more evenly covered than those amplified with the standard protocol. Altogether, the authors confirmed the validity of using ddPCR detection as a quality control step prior to sequencing.

The authors showed that the optimized MALBAC method limits contamination of single cell samples. They first showed that in the clinical bulk DNA, human contamination was higher than in the laboratory Dd2 bulk DNA, as expected. The second most common source of contamination was of bacterial origin. The optimized MALBAC protocol reduced the amplification bias towards contaminating human and bacterial genomes.

Because GC bias during the amplification step can limit which areas of the genome are sequenced, the authors next evaluated whether the optimized MALBAC improved genome coverage. They evaluated GC bias at various steps of the pipeline, and found that the optimized MALBAC reduces amplification bias in single cell samples. The authors note that although the optimized MALBAC showed less bias towards GC-rich sequences, highly AT-rich and repetitive intergenic regions remained problematic.
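
To make the idea of GC bias concrete, here is a minimal Python sketch. It is not the authors' analysis: it simply groups genome windows by GC content and averages the read depth in each group. The function names, the 10% bin width, and the toy windows are all assumptions made for illustration.

```python
from collections import defaultdict


def gc_fraction(seq):
    """Fraction of G or C among the called bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def mean_depth_by_gc(windows, bin_pct=10):
    """Average read depth of (sequence, depth) windows grouped by GC percentage.

    Keys of the returned dict are the lower edge of each GC bin, in percent.
    """
    groups = defaultdict(list)
    for seq, depth in windows:
        gc_pct = round(gc_fraction(seq) * 100)
        groups[(gc_pct // bin_pct) * bin_pct].append(depth)
    return {gc_bin: sum(d) / len(d) for gc_bin, d in sorted(groups.items())}


# Hypothetical toy data: AT-rich windows with low depth, GC-richer ones with high depth
windows = [
    ("ATATATATAT", 5),
    ("ATATATTTAA", 7),
    ("ATGCATGCAT", 20),
    ("GCGCATATGC", 30),
]
profile = mean_depth_by_gc(windows)
```

A profile in which depth rises with GC content, as in this toy data, is the pattern one would describe as amplification bias towards GC-rich sequences.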

Next, the authors went on to explore the uniformity of read abundance distributed over the P. falciparum genome, and found that there was improvement in levels of read uniformity across the genome when using optimized MALBAC over the standard protocol. Furthermore, the optimized MALBAC protocol was found to exhibit reproducible coverage of single cell genomes.
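
One generic way to put a number on read uniformity is the coefficient of variation of read counts across equally sized genome bins. The sketch below assumes that measure purely for illustration. It is not the ‘uniformity score’ used in the preprint, and the bin counts are invented.

```python
import statistics


def coverage_cv(bin_counts):
    """Coefficient of variation (population std / mean) of per-bin read counts.

    Lower values indicate more even coverage across the genome.
    """
    mean = statistics.fmean(bin_counts)
    if mean == 0:
        raise ValueError("no reads in any bin")
    return statistics.pstdev(bin_counts) / mean


# Hypothetical per-bin read counts for two amplified genomes
even_sample = [100, 98, 102, 101, 99]
uneven_sample = [10, 250, 5, 200, 35]

cv_even = coverage_cv(even_sample)
cv_uneven = coverage_cv(uneven_sample)
```

With these invented counts, the evenly covered sample gives a much smaller value than the uneven one, which is the direction of improvement described for the optimized protocol.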

The authors then compared the results obtained with the optimized MALBAC protocol to those of a study based on multiple displacement amplification (MDA), the only other method that has been used to amplify single Plasmodium genomes. While MALBAC-amplified genomes exhibited a consistent amplification pattern, the MDA-amplified genomes showed substantially more variation across cells. Moreover, the correlation between MDA-amplified cells was much lower than that between the optimized MALBAC-amplified cells. However, MDA-amplified samples displayed a higher coverage breadth across the genome, especially in the intergenic regions. Altogether, the authors conclude that the main benefit of MALBAC over MDA-based amplification of single cell genomes is reproducible coverage with low variation.
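
Cell-to-cell reproducibility of this kind can be summarized as the average pairwise correlation of binned coverage between cells. The following sketch assumes Pearson correlation and invented coverage vectors. The preprint may use a different correlation measure or normalization.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation undefined for constant coverage")
    return cov / (sd_x * sd_y)


def mean_pairwise_correlation(cells):
    """Average Pearson correlation over all pairs of per-cell coverage vectors."""
    pairs = list(combinations(cells, 2))
    if not pairs:
        raise ValueError("need at least two cells")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical binned coverage for three single cells
cells = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 2, 3, 5]]
reproducibility = mean_pairwise_correlation(cells)
```

A value close to 1 means the cells share the same amplification pattern. A lower average would correspond to the greater cell-to-cell variation reported for the MDA-amplified genomes.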

Finally, the authors were able to perform CNV analysis in MALBAC-amplified single cell genomes through the combination of discordant/split reads and read depth analysis methods. They also further explored parameters that impacted their detection. They found that different methods exhibited differences in the ability to identify true CNVs, and they attribute this to factors including CNV size, genomic neighbourhood, and sequencing depth. In general, most of the CNVs detected by both discordant/split read and read depth analyses were spread across all but one chromosome.
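
For readers unfamiliar with read depth analysis, here is a deliberately simplified sketch of the idea: normalize per-bin read counts to the median and report runs of consecutive bins whose copy ratio is unusually high or low. The thresholds, the minimum run length, and the data are hypothetical. The preprint's actual tools and parameters are not reproduced here, and the discordant/split read component is not shown.

```python
import statistics


def call_cnv_segments(bin_counts, gain=1.5, loss=0.5, min_bins=2):
    """Return (start_bin, end_bin, state) for runs of abnormal copy ratio.

    Copy ratio is the bin count divided by the median bin count; end_bin is
    inclusive and state is "gain" or "loss". Thresholds are illustrative.
    """
    median = statistics.median(bin_counts)
    if median == 0:
        raise ValueError("median coverage is zero")

    # Classify every bin by its copy ratio relative to the median
    states = []
    for count in bin_counts:
        ratio = count / median
        if ratio >= gain:
            states.append("gain")
        elif ratio <= loss:
            states.append("loss")
        else:
            states.append(None)

    # Merge consecutive bins with the same state into segments
    segments = []
    start = 0
    for i in range(1, len(states) + 1):
        if i == len(states) or states[i] != states[start]:
            if states[start] is not None and i - start >= min_bins:
                segments.append((start, i - 1, states[start]))
            start = i
    return segments


# Hypothetical read counts: an amplification in bins 3-5, a deletion in bins 8-9
counts = [100, 102, 98, 210, 205, 198, 101, 99, 40, 45, 100]
segments = call_cnv_segments(counts)
```

Requiring a minimum run of bins is one simple guard against isolated noisy bins, which are common after whole genome amplification. It also illustrates why CNV size and sequencing depth affect what can be detected.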

The authors conclude that building on these methodological improvements will enable detection of parasite-to-parasite heterogeneity to clarify the role of genetic variations in the adaptation of P. falciparum.


What I like about this preprint

This work was presented at the Woods Hole MPM virtual meeting in 2020, and was well received by the parasitology community. I like the preprint because it aims to bridge existing methodological gaps in understanding single cell heterogeneity among parasites, a topic that currently has great momentum in the parasitology community. It specifically addresses complications posed by the AT-rich P. falciparum genome, and could open further opportunities for single cell analysis in this and other parasites.



  1. Liu et al., Single cell sequencing of the small and AT-skewed genome of malaria parasites, bioRxiv, 2020.


Posted on: 10th December 2020, updated on: 23rd December 2020


Author's response

Jennifer Guler shared

Open questions

1. You discuss here the applicability, advantages and limitations of the optimized protocol you have investigated for single cell sequencing of blood-stage Plasmodium falciparum parasites. What further limitations and attributes do you envisage for the application of this method to other Plasmodium species?

We were targeting Plasmodium falciparum parasites, which have a high AT-content genome (~80% overall); we reasoned that higher AT-content primers of the optimized protocol would amplify this genome better. While these primers will likely also work on other Plasmodium species that do not have such a high AT-content (i.e. P. vivax and P. knowlesi, ~60% AT), it may be necessary to modify the primers to maximize amplification efficiency for single cell genomics. Additionally, isolation procedures should avoid parasites that are infecting nucleated host cells (see Q2).

2. Following on from the question above, what considerations should be taken into account when using this method for single cell analysis of other parasite stages, such as the liver stages or the mosquito stages?

One main benefit that we have with single cell blood stage analysis is that the human host red blood cell does not contain a genome of its own. Therefore, barring any environmental contamination, you are only dealing with DNA from the parasite. For liver stages, where the parasite develops within a nucleated liver cell, the genome amplification step will also amplify the host’s genome. Because of its large size (100x larger than the parasite’s single genome), the host genome will likely be preferred over the parasite’s genome. This may be a good system to test if our improved MALBAC method specifically amplifies the AT-rich parasite genome over the host genome; however, it should be performed at an early stage of development, before parasite number expands (mature liver schizonts can contain many thousands of parasites).

Single cell genomics on mosquito stages are likely to be more feasible, especially with forms that live extracellularly. For example, assuming aseptic collection, amplification of the sporozoite genome should be equivalent to our experiments with the ring stage parasites (1n). Other stages (i.e. those in the gut) may be complicated by the presence of host mosquito cells or the microbiome.

3. While it was very clear to me how you used the method to study very discrete stages (i.e. early, middle and late stages), how would you use the method to specifically study changes in transition stages, i.e. between rings and trophozoites, trophozoites and schizonts, etc.?

Single cell genomics is looking at the DNA sequence of the parasite, which for the most part only changes in copy number between different stages; rings have one copy (1n) and during trophozoite stage, the parasite begins to replicate its genome over multiple rounds to yield >10 copies in the late trophozoite/schizont stage. To analyse truly single parasite genomes, we need to isolate ring stages prior to the initiation of replication. Additionally, amplification and analysis of late stage parasites using these methods will assist in the detection of minor alleles that may be hidden during standard bulk sequencing approaches.

4. What advantages and limitations do you envisage in the use of this method for studying gametocytogenesis?

Gametocytes develop from blood stage parasites and most stages are within the anucleate red blood cell. Additionally, whether female or male, they contain only one copy of the genome. For these reasons, they should be able to be isolated and amplified in a similar way as blood stage parasites. New technology to introduce sex-specific fluorescent tags, combined with single cell genomics approaches, could rapidly improve knowledge about genomic differences between blood and gametocyte forms (i.e. changes in mitochondrial genome copies).

5. Finally, do you see this optimized method as being applicable to other parasites, both intracellular and extracellular?

We think our work is encouraging for single cell genomics on microbes that have particularly small genomes or those with different base compositions (either GC or AT-rich). By changing the volume of the reaction and the base-content of the primers, other researchers can optimize amplification methods for their needs. Additionally, due to reduced amplification bias over other methods, genome amplification using MALBAC can facilitate the identification of structural variations in microbial genomes; this is something that cannot currently be appreciated. Finally, although we did not explore this, our results indicate that it may be possible to preferentially amplify an intracellular parasite genome over the host genome.
