Protein sequence design by explicit energy landscape optimization

Published: July 24, 2020, 9:08 p.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.07.23.218917v1?rss=1

Authors: Norn, C., Wicky, B. I. M., Juergens, D., Liu, S., Kim, D., Koepnick, B., Anishchenko, I., Foldit Players, Baker, D., Ovchinnikov, S.

Abstract: The protein design problem is to identify an amino acid sequence which folds to a desired structure. Given Anfinsen's thermodynamic hypothesis of folding, this can be recast as finding an amino acid sequence for which the lowest energy conformation is that structure. As this calculation involves not only all possible amino acid sequences but also all possible structures, most current approaches focus instead on the more tractable problem of finding the lowest energy amino acid sequence for the desired structure, often checking by protein structure prediction in a second step that the desired structure is indeed the lowest energy conformation for the designed sequence, and discarding the in many cases large fraction of designed sequences for which this is not the case. Here we show that by backpropagating gradients through the trRosetta structure prediction network from the desired structure to the input amino acid sequence, we can directly optimize over all possible amino acid sequences and all possible structures, and in one calculation explicitly design amino acid sequences predicted to fold into the desired structure and not any other. We find that trRosetta calculations, which consider the full conformational landscape, can be more effective than Rosetta single point energy estimations in predicting folding and stability of de novo designed proteins. We compare sequence design by landscape optimization to the standard fixed backbone sequence design methodology in Rosetta, and show that the results of the former, but not the latter, are sensitive to the presence of competing low-lying states.
We show further that more funneled energy landscapes can be designed by combining the strengths of the two approaches: the low resolution trRosetta model serves to disfavor alternative states, and the high resolution Rosetta model, to create a deep energy minimum at the design target structure.

Copyright belongs to the original authors. Visit the link for more info.
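The distinction the abstract draws between fixed backbone design (lowest energy sequence for the target structure) and landscape optimization (a sequence whose lowest energy state is the target) can be illustrated with a toy calculation. The sketch below is not trRosetta or Rosetta: the energy table, the sequence and structure names, and the helper functions are all hypothetical, and the "landscape" is just a Boltzmann distribution over three hand-picked states.

```python
import math

# Hypothetical toy energies (arbitrary units, lower is more favorable).
# ENERGY[sequence][structure]; none of these numbers come from the paper.
ENERGY = {
    "seq_A": {"target": -10.0, "decoy_1": -10.5, "decoy_2": -4.0},
    "seq_B": {"target": -8.0, "decoy_1": -3.0, "decoy_2": -2.5},
}
KT = 1.0  # assumed thermal energy scale, same arbitrary units


def target_probability(energies, target="target", kt=KT):
    """Boltzmann probability of the target among all listed states."""
    e_min = min(energies.values())  # shift energies for numerical stability
    weights = {s: math.exp(-(e - e_min) / kt) for s, e in energies.items()}
    return weights[target] / sum(weights.values())


def fixed_backbone_choice(table, target="target"):
    """Pick the sequence with the lowest energy on the target alone."""
    return min(table, key=lambda seq: table[seq][target])


def landscape_choice(table, target="target"):
    """Pick the sequence that most favors the target over competing states."""
    return max(table, key=lambda seq: target_probability(table[seq], target))


if __name__ == "__main__":
    for seq, energies in ENERGY.items():
        print(seq, round(target_probability(energies), 3))
    print("fixed backbone picks:", fixed_backbone_choice(ENERGY))
    print("landscape picks:", landscape_choice(ENERGY))
```

In this toy table, seq_A has the lowest energy on the target but an even lower-lying decoy, so the single-structure criterion selects seq_A while the landscape criterion selects seq_B. This mirrors the abstract's point that only landscape optimization is sensitive to competing low-lying states.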