Phylogeny of the COVID-19 Virus SARS-CoV-2 by Compression

Published: July 23, 2020, 9:01 p.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.07.22.216242v1?rss=1

Authors: Vitanyi, P. M. B., Cilibrasi, R. L.

Abstract: We analyze the phylogeny and taxonomy of the SARS-CoV-2 virus using compression. This is a new alignment-free method called the "normalized compression distance" (NCD) method. It discovers all effective similarities based on Kolmogorov complexity. Since the latter is incomputable, we approximate it with a good compressor such as the modern zpaq. The results show that the SARS-CoV-2 virus is closest to the RaTG13 virus and similar to two bat SARS-like coronaviruses, bat-SL-CoVZXC21 and bat-SL-CoVZC4. The similarity is quantified and compared with the same quantified similarities among the mtDNA of certain species. We also treat the question of whether pangolins are involved in the SARS-CoV-2 virus.

Copyright belongs to the original authors. Visit the link for more info.
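To make the NCD idea in the abstract concrete, here is a minimal sketch in Python. It uses the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed size of s. Several things here are assumptions and not taken from the paper: the standard-library `lzma` compressor stands in for zpaq, the function names `compressed_size` and `ncd` are hypothetical, and the sequences are synthetic random strings, not real viral genomes.

```python
import lzma
import random


def compressed_size(data: bytes) -> int:
    """Length in bytes of the compressed input.

    Assumption: lzma is used here only as a stand-in for zpaq,
    the compressor named in the abstract.
    """
    return len(lzma.compress(data, preset=9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic illustration data (NOT real genomes): a reference sequence,
# a copy with roughly 2% point substitutions, and an unrelated sequence.
rng = random.Random(0)
ref = "".join(rng.choice("ACGT") for _ in range(5000))
near = "".join(rng.choice("ACGT") if rng.random() < 0.02 else base
               for base in ref)
far = "".join(rng.choice("ACGT") for _ in range(5000))

d_near = ncd(ref.encode(), near.encode())
d_far = ncd(ref.encode(), far.encode())
print(f"NCD(ref, near) = {d_near:.3f}")
print(f"NCD(ref, far)  = {d_far:.3f}")
```

A smaller value means the two sequences share more compressible structure, so the lightly mutated copy should score closer to the reference than the unrelated sequence does. A pairwise matrix of such distances is the kind of input a tree-building step could work from, though the exact procedure used in the paper is not described in this summary.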