Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks

Published: Oct. 7, 2020, 1:02 a.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.10.05.326140v1?rss=1

Authors: Li, Y., Zhang, C., Bell, E. W., Zheng, W., Zhou, X., Yu, D.-J., Zhang, Y.

Abstract: The topology of protein folds can be specified by inter-residue contact-maps, and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural networks. Compared to previous approaches, the major advantage of TripletRes is its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from whole-genome and metagenome databases, thereby minimizing information loss during contact model training. TripletRes was tested on a large set of 245 non-homologous proteins from the CASP and CAMEO experiments, and outperformed other state-of-the-art methods by at least 58.4% for the CASP 11&12 targets and 44.4% for the CAMEO targets in top-L long-range contact precision. On the 31 FM targets from the latest CASP13 challenge, TripletRes achieved the highest precision (71.6%) for the top-L/5 long-range contact predictions. These results demonstrate a novel and efficient approach to extending the power of deep convolutional networks for high-accuracy medium- and long-range protein contact-map prediction starting from primary sequences, which is critical for constructing the 3D structures of proteins that lack homologous templates in the PDB library.

Availability: The training and testing data, standalone package, and online server for TripletRes are available at https://zhanglab.ccmb.med.umich.edu/TripletRes/.

Copyright belongs to the original authors. Visit the link for more information.
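The abstract reports results as top-L and top-L/5 long-range contact precision. The sketch below illustrates how such a metric is commonly computed; it is not taken from the TripletRes package. The function name, the 24-residue sequence-separation cutoff for "long-range", and the choice to divide by the number of pairs actually selected are assumptions based on common CASP-style practice, not details stated in the abstract.

```python
def top_long_range_precision(pred, native, k_divisor=1, min_sep=24):
    """Precision of the top L/k_divisor long-range predictions.

    pred   -- L x L nested list of predicted contact scores
    native -- L x L nested list of booleans (True = native contact)
    min_sep is an ASSUMED long-range cutoff (|i - j| >= 24).
    """
    L = len(pred)
    # Keep only residue pairs (i < j) separated by at least min_sep positions.
    pairs = [(pred[i][j], i, j)
             for i in range(L)
             for j in range(i + min_sep, L)]
    if not pairs:
        return 0.0
    # Rank pairs from highest to lowest predicted score.
    pairs.sort(key=lambda t: t[0], reverse=True)
    n_top = max(1, L // k_divisor)
    top = pairs[:n_top]
    hits = sum(1 for _, i, j in top if native[i][j])
    # ASSUMPTION: divide by the number of pairs actually selected.
    return hits / len(top)


# Toy example for a hypothetical 30-residue protein (made-up scores).
L = 30
pred = [[0.0] * L for _ in range(L)]
native = [[False] * L for _ in range(L)]
pred[0][1] = 0.99  # short-range pair; ignored by the long-range filter
scored = [(0, 24, 0.9), (0, 25, 0.8), (1, 25, 0.7),
          (1, 26, 0.6), (2, 27, 0.5), (3, 28, 0.4)]
for i, j, s in scored:
    pred[i][j] = s
for i, j in [(0, 24), (1, 25), (2, 27)]:
    native[i][j] = True

# L/5 = 6 pairs are selected; 3 of them are native contacts.
precision_l5 = top_long_range_precision(pred, native, k_divisor=5)
```

In this toy case, `precision_l5` is 0.5, since half of the six top-ranked long-range pairs are marked as native contacts. Evaluations differ in how they handle proteins with fewer than L/k eligible pairs, so the denominator convention should be checked against the specific benchmark being reproduced.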