State-of-the-Art Estimation of Protein Model Accuracy using AlphaFold

James P. Roney, Sergey Ovchinnikov

Posted on: 13 April 2022, updated on: 9 August 2023

Preprint posted on 24 March 2022

Article now published in Physical Review Letters

AlphaFold may not be “just” a pattern recognition algorithm, but may actually have learnt about the energetics of protein folding.

Selected by Kieran Didi

Categories: bioinformatics, biophysics


The field of protein structure prediction was revolutionized last year when the DeepMind team, which won the 14th Critical Assessment of Structure Prediction (CASP14) competition, published the paper (2) and the code for their AlphaFold (AF2) model. This major advance on the protein folding problem holds promise for many areas of biology and medicine: protein structures are essential for research in, for example, drug discovery and protein engineering, but are often accessible only via expensive and laborious experimental methods such as X-ray crystallography, cryo-electron microscopy and NMR spectroscopy. These methods have enabled scientists to elucidate more than 100,000 protein structures (available through the Protein Data Bank), but they are costly and involve a lot of trial and error. Computational methods try to simplify this procedure by predicting the 3D structure of a protein from its linear amino acid sequence, without determining it experimentally.

Early efforts at simplifying protein structure prediction aimed to capture the physics that govern protein folding and simulate the folding process to get an accurate structure; a prime example of this is the Rosetta software suite developed by David Baker and co-workers at the University of Washington in Seattle. In the 90s, coevolution information was recognized as a valuable input for protein structure prediction. For this, multiple sequence alignments (MSAs) between evolutionarily related proteins are constructed and spatial contacts are inferred based on coevolution of amino acids. Finally, the progress in the field of machine learning and especially deep learning also had an impact on the structural biology community, with huge models such as AlphaFold producing state-of-the-art protein structure predictions.

One open question regarding AlphaFold is whether the model learned something about the underlying physics of the protein folding problem or is “just” a pattern matching algorithm inherently dependent on the provided MSAs. Since proteins in nature fold astonishingly fast by themselves (the observation behind Levinthal’s paradox) and some of them can refold after denaturation, as Anfinsen observed (3), the 3D structure must be encoded in the protein sequence alone. Anfinsen’s dogma therefore states that proteins fold as a result of free energy minimization. This free energy depends on the protein structure, and it is the quantity that earlier physics-based prediction tools tried to approximate (and that is still approximated in techniques such as molecular dynamics simulations).

Figure 1 of the preprint: The hypothesized role of coevolutionary information in AlphaFold’s prediction procedure. According to this, AlphaFold implicitly learns an energy function of the protein conformational landscape.


In this preprint, Roney and Ovchinnikov address this question by testing the hypothesis that AlphaFold learned this energy function and uses coevolution information to find a good initial guess for an energy minimum in the conformational landscape, which would mean that the model has captured something about the underlying physics of the protein folding problem. They use this hypothesis to rewire AlphaFold so that it can rank decoy protein structures, performing better than state-of-the-art (SOTA) models for this task.

Key findings

Use of AF2 for ranking candidate protein structures

During the structure prediction process, AlphaFold uses an MSA of the amino acid sequence of the target protein with related sequences as input. As an additional option, known protein structures close to the target protein sequence (known as templates) can be provided to improve prediction results. The model then outputs a predicted protein structure and two confidence metrics for this prediction: the predicted LDDT-Cα Score (pLDDT) and the predicted TM Score (pTM).

To change the objective of AlphaFold from predicting protein structures to ranking candidate structures, the authors made three adjustments. First, instead of providing known protein structures as templates, they provide a “decoy structure” that is a candidate structure for the target protein, e.g. one predicted by another model. Second, they do not provide an MSA as input, but only the amino acid sequence of the target protein, thereby stripping the model of the ability to use coevolutionary information. Lastly, they compute a new output metric called a “composite confidence score” based on the existing metrics: they multiply the output pLDDT, the output pTM and the TM Score between the structure predicted by AlphaFold and the decoy. The last term is needed because the main objective is not to assess the quality of the predicted structure, but the quality of the decoy structure that was given as a template.
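The composite confidence score described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and the example numbers are hypothetical, and it assumes all three terms have been rescaled to the range 0 to 1 (AlphaFold reports pLDDT on a 0 to 100 scale, so it would need to be divided by 100 first).

```python
def composite_confidence(plddt: float, ptm: float, tm_to_decoy: float) -> float:
    """Multiply the three terms of the composite score.

    Assumption of this sketch: every input is already scaled to [0, 1].
    tm_to_decoy is the TM Score between AlphaFold's output and the decoy.
    """
    for name, value in (("plddt", plddt), ("ptm", ptm), ("tm_to_decoy", tm_to_decoy)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return plddt * ptm * tm_to_decoy


# Hypothetical values: the same confident prediction compared with two decoys.
close_decoy = composite_confidence(0.9, 0.8, 0.95)  # output stays near the decoy
far_decoy = composite_confidence(0.9, 0.8, 0.30)    # output moved away from the decoy
```

Because the terms are multiplied, a decoy only scores highly if AlphaFold is confident and its output also stays close to the decoy, which is why the third term shifts the assessment from the prediction to the template.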

The authors use this approach to rank decoys from the Rosetta decoy dataset, which contains 133 native protein structures along with thousands of decoy structure variants, and compare the performance with common decoy ranking tools such as Rosetta (4) and the SOTA machine learning model DeepAccNet (5). Their approach based on AlphaFold strongly outperforms Rosetta and DeepAccNet, both in terms of Spearman correlation of the confidence metric with decoy quality and in terms of top-1 accuracies of decoy structures.
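To make the two evaluation metrics concrete, here is a small self-contained sketch. The scores and decoy qualities are made-up numbers, and "top-1" is taken here to mean the true quality of the decoy ranked first by the method, which is an assumption of this sketch rather than a definition quoted from the preprint.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list:
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation, computed as the Pearson correlation of the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def top1_quality(scores: Sequence[float], qualities: Sequence[float]) -> float:
    """True quality of the decoy that receives the highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return qualities[best]


# Hypothetical composite scores and true decoy qualities for one target.
scores = [0.68, 0.22, 0.51, 0.40]
qualities = [0.91, 0.35, 0.60, 0.72]
rho = spearman(scores, qualities)
top1 = top1_quality(scores, qualities)
```

The two metrics reward different things: the correlation measures how well the whole ranking tracks decoy quality, while top-1 only asks whether the single best-scoring decoy is a good one.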

Ranking quality independent of decoy’s amino acid sequence

The authors mask out the side chains of the decoy structures they provide to the model, which helps to increase accuracy. Since the decoy structure then consists essentially of the backbone and the Cβ atoms only, any sequence of the correct length could be fed into the model in place of the true sequence of the target protein. The authors investigated the influence of this choice by running their experiments with two different one-hot-encoded sequence inputs: the true target sequence and an all-alanine sequence. They found that both choices deliver robust results on the Rosetta decoy dataset, with the all-alanine sequence performing better on the correlation metric and the true target sequence performing better on top-1 accuracy. The authors then used this result to extend their hypothesis about the inner workings of their decoy ranking: with the target sequence as input, this sequence and the masked-out sequence of the template are identical, so the predicted structure is probably very similar to the decoy. Since the global geometry is thus quite similar, the confidence metrics used to compute the composite score depend more on local fold features, delivering better results on top-1 accuracy.

For the all-alanine sequence, the opposite is the case: due to very low sequence similarity, the global geometry of decoy and prediction will differ considerably, so the confidence metrics are strongly influenced by the global fold and the model performs better on the general correlation metric. Using a weighted hybrid approach, the authors were able to combine the strengths of both methods and outperform the results obtained with either input alone.
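The two sequence inputs and a hybrid score can be illustrated as follows. Everything here is a sketch under stated assumptions: the residue ordering is arbitrary and not AlphaFold's actual feature encoding, the example sequence is invented, and the hybrid is written as a simple linear combination with a hypothetical weight, since this highlight does not specify how the authors weighted the two scores.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues; ordering is arbitrary here


def one_hot(sequence: str) -> list:
    """Encode each residue as a 20-dimensional vector with a single 1."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    return [
        [1 if index[aa] == i else 0 for i in range(len(AMINO_ACIDS))]
        for aa in sequence
    ]


def all_alanine(length: int) -> str:
    """Sequence of the given length consisting only of alanine."""
    return "A" * length


def hybrid_score(score_target: float, score_alanine: float, weight: float = 0.5) -> float:
    """Linear mix of the two composite scores (assumed form, hypothetical weight)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * score_target + (1.0 - weight) * score_alanine


target = "MKTAYIAK"  # hypothetical target sequence
target_input = one_hot(target)
alanine_input = one_hot(all_alanine(len(target)))
mixed = hybrid_score(0.6, 0.4, weight=0.25)
```

Both inputs have the same shape, which is what allows either one to be paired with the same side-chain-masked decoy.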

Evaluation from CASP14: MSAs needed for accurate structure prediction, not for decoy ranking

To test their hypothesis on an independent data set, the authors used the CASP14 EMA (Estimation of Model Accuracy) tasks. Here, they show that AlphaFold is indeed able to rank decoys better than the top models from CASP14 without coevolution information, but still needs the MSAs to perform structure prediction itself. Without MSAs, it can rank predicted decoys reliably but performs poorly in producing structure predictions, further supporting the authors’ hypothesis that coevolution information is used to provide a good initial guess on the learned energy landscape, from which the structure module performs local gradient descent to an energetic minimum.
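The role of the initial guess in this hypothesis can be pictured with a toy one-dimensional analogy. The energy function below is an invented double well and has nothing to do with AlphaFold's internals; it only shows why local descent depends on where it starts.

```python
def energy(x: float) -> float:
    """Toy double-well landscape: deeper minimum near x = -1, shallower near x = +1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x: float) -> float:
    """Analytical derivative of the toy energy."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def descend(x0: float, step: float = 0.01, n_steps: int = 2000) -> float:
    """Plain gradient descent from the starting point x0."""
    x = x0
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x


# A start on the correct side of the barrier (the analogue of a good
# coevolution-derived guess) reaches the deeper minimum; a start on the
# other side stays trapped in the shallower one.
from_good_guess = descend(-0.5)
from_poor_guess = descend(0.5)
```

In this picture, ranking decoys only requires evaluating the landscape at given points, whereas predicting a structure from scratch also requires starting the descent in the right basin.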

Why I selected this preprint

The publication of AlphaFold had a major influence on the structural biology community and the life sciences in general by improving experimental structure predictions, as well as providing thousands of predicted structures to researchers around the world. However, more difficult problems such as protein design still pose a challenge. The main hypothesis of this preprint (i.e. that AlphaFold has learnt some kind of underlying energy function) presents a novel idea that suggests new angles from which challenging problems in structural biology can be tackled.

Questions for the authors

1. The preprint provides evidence for your hypothesis that AF2 learns an energy function for protein folding, but what other experiments could be used to support/falsify your hypothesis?

2. For protein structure prediction, the MSAs still seem indispensable. If your hypothesis is true, in what ways could this new insight be used for problems such as protein design/structure prediction for single sequences?


(1) Roney, J. P.; Ovchinnikov, S. State-of-the-Art Estimation of Protein Model Accuracy Using AlphaFold. bioRxiv March 12, 2022, p 2022.03.11.484043.

(2) Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; Bridgland, A.; Meyer, C.; Kohl, S. A. A.; Ballard, A. J.; Cowie, A.; Romera-Paredes, B.; Nikolov, S.; Jain, R.; Adler, J.; Back, T.; Petersen, S.; Reiman, D.; Clancy, E.; Zielinski, M.; Steinegger, M.; Pacholska, M.; Berghammer, T.; Bodenstein, S.; Silver, D.; Vinyals, O.; Senior, A. W.; Kavukcuoglu, K.; Kohli, P.; Hassabis, D. Highly Accurate Protein Structure Prediction with AlphaFold. Nature 2021, 596 (7873), 583–589.

(3) Anfinsen, C. B.; Scheraga, H. A. Experimental and Theoretical Aspects of Protein Folding. In Advances in Protein Chemistry; Anfinsen, C. B., Edsall, J. T., Richards, F. M., Eds.; Academic Press, 1975; Vol. 29, pp 205–300.

(4) Rubenstein, A. B.; Blacklock, K.; Nguyen, H.; Case, D. A.; Khare, S. D. Systematic Comparison of Amber and Rosetta Energy Functions for Protein Structure Evaluation. J. Chem. Theory Comput. 2018, 14 (11), 6015–6025.

(5) Hiranuma, N.; Park, H.; Baek, M.; Anishchenko, I.; Dauparas, J.; Baker, D. Improved Protein Structure Refinement Guided by Deep Learning Based Accuracy Estimation. Nat. Commun. 2021, 12 (1), 1340.

Tags: alphafold, casp, protein structure prediction, rosetta



Author's response

James Roney shared

Thanks for sharing this with us! I think your writeup does a really good job summarizing our preprint, and I’m glad you found it interesting! The two questions you had at the end are very important, and we hope to address them robustly in the future. Here are some preliminary perspectives on those questions:

1. If AlphaFold has learned an energy function, we might expect it to be useful for other applications like predicting the effects of single mutations on protein stability, or for improving the accuracy of protein design. This suggests some new experiments that could be used to test the hypothesis we’ve proposed in the preprint, and future versions of our paper may contain some of these experiments.
2. Using the energy function learned by AlphaFold to predict protein structures from single sequences is a very exciting possibility opened up by the hypothesis we’ve proposed in our preprint. In theory, it should be possible to search over the space of possible decoy conformations to find structures that produce high-confidence outputs from AlphaFold. However, it is unclear whether this is computationally feasible in general, or if such a search might uncover adversarial structures that “trick” AlphaFold into being highly confident. In the latest version of our preprint, we’ve explored a simple approach to optimizing decoy structures in Appendix E. Essentially, we showed that a simple greedy optimization procedure can be used to improve the accuracy of AlphaFold’s MSA-free predictions for many protein targets. There’s still plenty of work to do to see if this approach can be improved upon and generalized, but we think it’s a very interesting proof-of-concept.
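The idea of searching over decoy conformations for high-confidence outputs can be sketched as a generic greedy hill climb. This is not the procedure from Appendix E of the preprint: the coordinates, the perturbation scheme and especially `toy_score` are hypothetical stand-ins, and in a real setting the score would come from running AlphaFold on each candidate decoy.

```python
import random
from typing import Callable, List, Tuple


def greedy_optimize(
    start: List[float],
    score_fn: Callable[[List[float]], float],
    n_steps: int = 500,
    step_size: float = 0.1,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Propose random perturbations and keep one only if the score improves."""
    rng = random.Random(seed)
    best = list(start)
    best_score = score_fn(best)
    for _ in range(n_steps):
        candidate = [v + rng.gauss(0.0, step_size) for v in best]
        candidate_score = score_fn(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def toy_score(coords: List[float]) -> float:
    """Stand-in for a confidence score: highest at an arbitrary target point."""
    target = [1.0, -2.0, 0.5]
    return -sum((c - t) ** 2 for c, t in zip(coords, target))


start = [0.0, 0.0, 0.0]
final, final_score = greedy_optimize(start, toy_score)
```

A greedy search of this kind can only climb to a nearby optimum of whatever score it is given, which is why the concern about adversarial structures that merely "trick" the scoring model matters.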
