Relative matching using low coverage sequencing

Published: Sept. 9, 2020, 4:01 a.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.09.09.289322v1?rss=1

Authors: Petter, E., Schweiger, R., Shahino, B., Shor, T., Aker, M., Almog, L., Weissglas-Volkov, D., Naveh, Y., Navon, O., Carmi, S., Li, J. H., Berisa, T., Pickrell, J. K., Erlich, Y.

Abstract: Finding familial relatives using DNA has multiple applications in genetic genealogy, population genetics, and forensics. So far, most relative matching algorithms rely on detecting identity-by-descent (IBD) segments with high-quality genotype data. Recently, low coverage sequencing (LCS) has received growing attention as a promising cost-effective method to ascertain genomic information. However, with higher error rates, it is unclear whether existing IBD detection can work on LCS datasets. Here, we developed and tested a framework for relative matching using sequencing with 1x coverage (1xLCS). We started by exploring the error characteristics of this method compared to array data. Our results show that, after some optimization, 1xLCS can exhibit the same genotyping discordance rates as the discordance between two array platforms. Using this observation, we developed a hybrid framework for relative matching and tuned this framework with >2,700 pairs of confirmed genealogical relatives that were genotyped using heterogeneous datasets. We then obtained array and 1xLCS data on 19 samples and used our framework to find relatives in a database of over 3 million individuals. The total length of shared segments obtained by 1xLCS was virtually indistinguishable from that obtained by genotyping arrays for matches with a total sharing >200 cM (second cousins or closer). For more distant relatives, as long as those were detected by both technologies, the total length obtained by LCS and by genotyping arrays was highly correlated, with no evidence of over- or underestimation. Taken together, our results show that 1xLCS can be a valid alternative to arrays for relative matching, opening the possibility for further democratization of genomic data.

Copyrights belong to the original authors. Visit the link for more info.