nPhase: An accurate and contiguous phasing method for polyploids

Published: July 24, 2020, 7:52 p.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.07.24.219105v1?rss=1

Authors: Abou Saada, O., Tsouris, A., Friedrich, A., Schacherer, J.

Abstract: While genome sequencing and assembly are now routine, we still do not have a full and precise picture of polyploid genomes. Phasing these genomes, i.e. deducing haplotypes from genomic data, remains a challenge. Despite numerous attempts, no existing polyploid phasing method provides accurate and contiguous haplotype predictions. To address this need, we developed nPhase, a ploidy-agnostic pipeline and algorithm that leverage the accuracy of short reads and the length of long reads to solve reference alignment-based phasing for samples of unspecified ploidy (https://github.com/nPhasePipeline/nPhase). nPhase was validated on virtually constructed polyploid genomes of the model species Saccharomyces cerevisiae, generated by combining sequencing data of homozygous isolates. nPhase obtained on average >95% accuracy and a contiguous 1.25 haplotigs per haplotype to cover >90% of each chromosome (heterozygosity rate ≥0.5%). This new phasing method opens the door to exploring polyploid genomes through applications such as population genomics and hybrid studies.

Copyright belongs to the original authors. Visit the link for more information.
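The validation idea in the abstract (pooling reads from known homozygous isolates, so that the true origin of every read is known) can be illustrated with a toy score. The Python sketch below is not nPhase code and does not reproduce the paper's accuracy definition. It assumes, purely for illustration, that each predicted haplotig is given as a list of the source-strain labels of its reads, and it scores the prediction by the fraction of reads that agree with the majority strain of their haplotig. The function name `cluster_purity` and the strain labels are hypothetical.

```python
from collections import Counter


def cluster_purity(clusters):
    """Return the fraction of reads whose source strain matches the
    majority strain of the predicted haplotig they were placed in.

    `clusters` is a list of lists; each inner list holds the true
    source-strain label of every read in one predicted haplotig.
    This is an illustrative score, not the metric used in the paper.
    """
    total = sum(len(cluster) for cluster in clusters)
    if total == 0:
        raise ValueError("no reads to score")
    # For each non-empty haplotig, count the reads from its most common strain.
    majority = sum(
        Counter(cluster).most_common(1)[0][1] for cluster in clusters if cluster
    )
    return majority / total


# Hypothetical virtual triploid built from strains "A", "B" and "C".
predicted_haplotigs = [
    ["A", "A", "A", "B"],  # one read from strain B was misassigned
    ["B", "B", "B", "B"],
    ["C", "C", "A", "C"],  # one read from strain A was misassigned
]

accuracy = cluster_purity(predicted_haplotigs)
print(f"toy accuracy: {accuracy:.3f}")
```

Because the polyploid is constructed from isolates of known identity, a score of this kind can be computed without any external truth set. That property is what makes virtual polyploids convenient for benchmarking.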