Identifying Taxonomic Units in Metagenomic DNA Streams

Published: Aug. 23, 2020, 5:01 a.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.08.21.261313v1?rss=1

Authors: Zola, J., Zheng, V., Sariyuce, A. E.

Abstract: With the emergence of portable DNA sequencers, such as the Oxford Nanopore Technologies MinION, metagenomic DNA sequencing can be performed in real time and directly in the field. However, because metagenomic DNA analysis is computationally and memory intensive, and current methods are designed for batch processing, current metagenomic tools are not well suited for mobile devices. In this paper, we propose a new memory-efficient method to identify Operational Taxonomic Units (OTUs) in metagenomic DNA streams. Our method is based on finding connected components in overlap graphs constructed over a real-time stream of long DNA reads, as produced by the MinION platform. We propose an efficient algorithm to maintain connected components when an overlap graph is streamed, and show how redundant information can be removed from the stream by transitive closures. Through experiments on simulated and real-world metagenomic data, we demonstrate that the resulting solution is able to recover OTUs with high precision while remaining suitable for mobile computing devices.

Copyrights belong to the original authors. Visit the link for more info.
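
To illustrate the idea in the abstract of maintaining connected components while an overlap graph is streamed, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes a standard union-find (disjoint-set) structure, and the class name `StreamingComponents`, its methods, and the read IDs are all hypothetical. Each incoming edge is a pair of read IDs reported as overlapping. An edge whose endpoints are already in the same component carries no new connectivity information and is dropped, which is only a simplified stand-in for the transitive-closure reduction the paper describes.

```python
class StreamingComponents:
    """Hypothetical helper: connected components over a stream of overlap edges."""

    def __init__(self):
        self.parent = {}   # read ID -> parent read ID in the union-find forest
        self.size = {}     # root read ID -> number of reads in its component
        self.skipped = 0   # edges dropped because they added no connectivity

    def find(self, read_id):
        # Register reads lazily, the first time they appear in the stream.
        if read_id not in self.parent:
            self.parent[read_id] = read_id
            self.size[read_id] = 1
            return read_id
        # Locate the root of the component.
        root = read_id
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[read_id] != root:
            self.parent[read_id], read_id = root, self.parent[read_id]
        return root

    def add_overlap(self, a, b):
        """Process one overlap edge; return True if it merged two components."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            # Both reads are already connected, so the edge is redundant.
            self.skipped += 1
            return False
        # Union by size: attach the smaller component under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size.pop(rb)
        return True

    def components(self):
        """Group all reads seen so far by their component root."""
        groups = {}
        for read_id in self.parent:
            groups.setdefault(self.find(read_id), set()).add(read_id)
        return list(groups.values())


# Usage with made-up read IDs: the third edge is implied by the first two.
cc = StreamingComponents()
for a, b in [("r1", "r2"), ("r2", "r3"), ("r1", "r3"), ("r4", "r5")]:
    cc.add_overlap(a, b)

print(sorted(sorted(c) for c in cc.components()))
print("redundant edges skipped:", cc.skipped)
```

Under these assumptions the sketch stores one parent entry per read and never keeps the edges themselves, so memory grows with the number of reads rather than the number of overlaps. In this simplified picture, each resulting component would correspond to one candidate OTU.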