From Structure to Sequence: Identification of polyclonal antibody families using cryoEM

Aleksandar Antanasijevic, Charles A. Bowman, Robert N. Kirchdoerfer, Christopher A. Cottrell, Gabriel Ozorowski, Amit A. Upadhyay, Kimberly M. Cirelli, Diane G. Carnathan, Chiamaka A. Enemuo, Leigh M. Sewall, Bartek Nogal, Fangzhu Zhao, Bettina Groschel, William R. Schief, Devin Sok, Guido Silvestri, Shane Crotty, Steven E. Bosinger, Andrew B. Ward

Preprint posted on 14 April 2021

A combination of next-generation sequencing (NGS) and structure-based analysis to identify possible heavy and light chain sequences based on electron cryo-microscopy maps without the requirement for single cell B cell or individual sequence isolation

Selected by Matthew Burke

The isolation of monoclonal antibodies (mAbs) is a major bottleneck in assessing antibody responses to natural infection or vaccination. The subcloning, identification and characterisation of mAbs from single cells is a labour-intensive task that initially offers little insight into the specifics of the paratope-epitope interaction. Here, Antanasijevic et al. have used a combination of next-generation sequencing (NGS) and structure-based analysis to identify possible heavy and light chain sequences from electron cryo-microscopy (cryoEM) maps, without the need for single B cell sorting or isolation of individual sequences. The present study builds on the group’s recent work (Bianchi et al., 2018; Nogal et al., 2020; Antanasijevic et al., 2021), in which they first introduced the cryoEM-based method for characterisation of polyclonal antibody responses (cryoEMPEM).

Key Findings

Proof of principle: Antibody amino acid sequences can be modelled on cryoEMPEM maps to identify candidate clonal antibody members.

Previously, rhesus macaques were immunized with the stabilised HIV-1 Env immunogen BG505 SOSIP. Serum was harvested and used to generate three cryoEMPEM maps, each at better than 4 Å resolution, of polyclonal Fabs bound to separate sites on the antigen: Rh.4O9 pAbC-1, Rh.33104 pAbC-1 and Rh.33172 pAbC-2. As a proof of principle, two antibodies were isolated from the same rhesus macaque that yielded the Rh.4O9 pAbC-1 cryoEMPEM map, and their amino acid sequences were analysed. Both antibodies, Rh.4O9.7 and Rh.4O9.8, targeted the V1 loop of BG505 SOSIP. When analysed by negative stain EM (nsEM), Rh.4O9.8 superimposed on the polyclonal Fab from the cryoEMPEM map, suggesting that it shares a comparable binding mode with the polyclonal response identified by cryoEMPEM. The amino acid sequence of the Rh.4O9.8 mAb was then used to build an atomic model into the Rh.4O9 pAbC-1 cryoEM map. This mAb model showed excellent agreement with the experimental cryoEM map of the polyclonal V1-targeting Fab, suggesting that Rh.4O9.8 is likely a clonal member of this lineage. Overall, this proof of principle suggests that structural information from the underlying monoclonal antibodies is preserved in the polyclonal antibody maps obtained by cryoEMPEM, and raises the possibility that this information can be used to identify the sequences of unknown monoclonal antibodies from cryoEMPEM maps.

Generation of a structure-based sequence prediction algorithm

The polyclonal Fab cryoEMPEM maps are of sufficiently high resolution (<4 Å) to constrain which amino acids are likely to occupy any given position in the protein structure, largely based on the volume of density observed at that position in the map. Here, Antanasijevic et al. categorised amino acids by their properties and devised an assignment system that determines, for each position, the subset of amino acids that best matches the density in the cryoEMPEM map. Homology modelling against published monoclonal antibody structures and their sequences was used to determine the lengths of the framework regions (FRs) and complementarity-determining regions (CDRs). The resulting query sequence, composed of amino acid category identifiers, could then be used to search the antibody amino acid sequence database (obtained by NGS of B cells isolated at a corresponding time point) for the best matching heavy and light chain candidates, based on matching CDR length and overall alignment score, in order to identify clonal antibodies of the desired lineage.
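To make the idea of a category-based query more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the three size-based categories, the single-letter identifiers (`s`, `m`, `l`, `x`), the function names and the toy sequences are all assumptions made for illustration, and the score is a simple fraction of matching positions rather than a true alignment score.

```python
# Hypothetical categories grouping residues by approximate side-chain size.
# The preprint's actual categorisation scheme is not reproduced here.
CATEGORIES = {
    "s": set("GAS"),                   # small side chains
    "m": set("CDEILMNPQTV"),           # medium side chains
    "l": set("FHKRWY"),                # large side chains
    "x": set("ACDEFGHIKLMNPQRSTVWY"),  # density too weak to constrain
}


def match_score(query, candidate):
    """Fraction of positions where the candidate residue fits the category.

    `query` is a string of category identifiers derived from the map;
    `candidate` is an amino acid sequence of the same length.
    """
    if not query or len(query) != len(candidate):
        return 0.0  # length mismatch: treated as a non-match in this sketch
    hits = sum(residue in CATEGORIES[cat] for cat, residue in zip(query, candidate))
    return hits / len(query)


def rank_candidates(query, database):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, match_score(query, seq)) for name, seq in database.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy example with made-up sequences.
toy_query = "slmx"
toy_database = {"cand_A": "GWVK", "cand_B": "WGVK", "cand_C": "GWV"}
for name, score in rank_candidates(toy_query, toy_database):
    print(name, score)
```

A real implementation would work region by region (FRs and CDRs) and would need to tolerate residues whose density is ambiguous, which is why the sketch includes an unconstrained `x` category.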

Combining NGS and cryoEMPEM can successfully identify clonal members of an antibody lineage from antigen-specific B cells.

This sequence prediction approach was then applied to the two other immunized rhesus macaques from which cryoEMPEM maps (Rh.33104 pAbC-1 and Rh.33172 pAbC-2) had previously been constructed (Nogal et al., 2020; Antanasijevic et al., 2021). A library of novel antibody amino acid sequences was generated from BG505-specific germinal centre B cells of these macaques, and the sequences were ranked according to how well they matched the respective cryoEMPEM maps. Emphasis was placed on the complementarity-determining regions (CDRs), as these differ the most between antibody clonotypes. Although between 4 and 18% of heavy and light chain residues did not match the potential amino acids assigned from the maps, two mAbs generated from the best matching candidate heavy/light chains were expressed as IgG and Fab fragments and bound BG505 SOSIP with nanomolar affinity. Further cryoEM validation of these novel clonal antibodies bound to antigen confirmed that they recognise the same epitopes as the respective polyclonal antibodies in the cryoEMPEM maps. Overall, this strongly suggests that the newly identified mAbs are members of the polyclonal lineages detected by cryoEMPEM.
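One way to picture how a mismatch percentage with extra emphasis on the CDRs might be computed is sketched below in Python. This is an illustrative assumption rather than the scoring used in the preprint: the function name `weighted_mismatch`, the `cdr_weight` parameter and the toy inputs are all hypothetical.

```python
def weighted_mismatch(allowed, candidate, is_cdr, cdr_weight=2.0):
    """Percentage of (weighted) positions where the candidate residue is not allowed.

    allowed    -- list of sets: residues compatible with the map at each position
    candidate  -- amino acid sequence of the same length
    is_cdr     -- list of booleans marking positions that lie in a CDR
    cdr_weight -- hypothetical factor by which CDR positions count more
    """
    if not allowed or not (len(allowed) == len(candidate) == len(is_cdr)):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    missed = 0.0
    for permitted, residue, in_cdr in zip(allowed, candidate, is_cdr):
        weight = cdr_weight if in_cdr else 1.0
        total += weight
        if residue not in permitted:
            missed += weight
    return 100.0 * missed / total


# Toy example: four positions, the middle two lying in a CDR.
toy_allowed = [set("GA"), set("WY"), set("KR"), set("DE")]
toy_is_cdr = [False, True, True, False]
print(weighted_mismatch(toy_allowed, "GFKD", toy_is_cdr))
```

With `cdr_weight=1.0` the function reduces to a plain percentage of mismatched residues, which is the kind of figure quoted above.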


This exciting development in mAb discovery will be a useful tool in studies of natural infection and vaccination. Rapid screening of the polyclonal sera elicited by different immunogens with this method could allow assessment of their propensity to induce antibodies against specific, desirable epitopes, followed by rapid isolation of those antibodies. The tool would be an excellent complement to traditional assessments of the B cell immune response, such as measurements of antigen-specific B cell levels or serum neutralization titres.




Posted on: 20 April 2021, updated on: 21 April 2021



Questions for Authors

Andrew Ward and Aleksandar Antanasijevic shared

Several notable antibody lineages, such as those directed towards certain epitopes on HIV-1 Env, have unusual features. These include extended CDR H3 loops of over 30 amino acids and CDR H3-only binding. Could this approach be used to identify and model these antibodies?

Yes! In fact, one such antibody is Rh.33172 mAb.1 identified in this study (HCDR3 length of 22 aa). Unusual features, such as long CDR loops, are actually favourable for this analysis as they help limit the search space within the sequence database. One of the steps prior to sequence alignment is database filtration based on the length of CDR/FR regions determined from structural data (pAbC maps). For antibodies with unusual features, this filtration excludes the great majority of sequences from the NGS database, leaving only a small number of possible sequences and thereby facilitating the subsequent search.
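The length-based filtration described in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `HeavyChain` record, the function name and the toy database are hypothetical, and apart from the HCDR3 length of 22 aa mentioned above, the loop lengths are invented.

```python
from typing import NamedTuple, Tuple


class HeavyChain(NamedTuple):
    """Hypothetical NGS database record: a name plus CDR loop lengths."""
    name: str
    cdr_lengths: Tuple[int, int, int]  # (HCDR1, HCDR2, HCDR3)


def filter_by_cdr_length(database, target_lengths):
    """Keep only the entries whose CDR lengths match those read from the map."""
    return [entry for entry in database if entry.cdr_lengths == target_lengths]


# Toy database; only the HCDR3 length of 22 comes from the text above.
toy_database = [
    HeavyChain("seq_1", (8, 8, 22)),
    HeavyChain("seq_2", (8, 8, 14)),
    HeavyChain("seq_3", (8, 7, 22)),
    HeavyChain("seq_4", (8, 8, 12)),
]
survivors = filter_by_cdr_length(toy_database, (8, 8, 22))
print([entry.name for entry in survivors])
```

Because unusually long loops are rare in a typical repertoire, an exact-length filter of this kind would be expected to discard most entries before any alignment is attempted, which is the point made in the answer.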

Do you think this method could be useful in mapping or assessing how antigenic variants escape from neutralizing antibodies?

Yes! The cryoEMPEM method with sequence determination of underlying polyclonal antibody families can be readily applied to establish correlations with emerging mutations in antigenic variants and possible immune-escape routes. We are very excited about applying this technology to study how antibody responses elicited by various COVID vaccines react with SARS-CoV-2 variants of concern. We can also apply our approach to study antibody-antigen co-evolution over the course of an infection.




