Fast and interpretable scRNA-seq data analysis

Published: Oct. 7, 2020, 2:03 a.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.10.05.314039v1?rss=1

Authors: Cobanoglu, M. C.

Abstract: One of the key challenges in single-cell data analysis is the annotation of cells with their cell types. This task is divided into two sub-tasks: identifying known cell types and identifying novel cell types. We propose that both of these problems can be solved by generative Bayesian Dirichlet-multinomial models. In the supervised learning context, we propose a generative Bayesian Dirichlet-multinomial classifier. In the unsupervised learning context, we propose a Bayesian Dirichlet-multinomial mixture model. We show that the proposed models are meaningful: the predicted cell types and the genes associated with them overlap with the ground truth. Furthermore, the model makes no density-based or connectivity-based clustering assumptions, which differs from almost every other approach in this field. Consequently, the clustering results from the generative method can effectively represent nuanced differences among cells.

Copyright belongs to the original authors. Visit the link for more info.
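
To make the supervised idea in the abstract concrete, here is a minimal sketch of a Dirichlet-multinomial classifier on gene count vectors. It is not the paper's implementation. It assumes a symmetric Dirichlet prior per cell type, a uniform prior over cell types, and a tiny invented count matrix; the function names (`fit_class_alphas`, `dm_log_likelihood`, `predict`) are hypothetical and chosen only for illustration.

```python
import math

def fit_class_alphas(counts, labels, prior=1.0):
    """Return {label: alpha}, where alpha = prior + summed gene counts of that class.

    Assumption: a symmetric Dirichlet prior with concentration `prior` per gene.
    """
    n_genes = len(counts[0])
    alphas = {}
    for row, label in zip(counts, labels):
        alpha = alphas.setdefault(label, [prior] * n_genes)
        for g, c in enumerate(row):
            alpha[g] += c
    return alphas

def dm_log_likelihood(x, alpha):
    """Dirichlet-multinomial log-likelihood of count vector x given alpha.

    The multinomial coefficient is dropped because it is identical for
    every class and so does not affect which class scores highest.
    """
    a_sum = sum(alpha)
    n = sum(x)
    ll = math.lgamma(a_sum) - math.lgamma(a_sum + n)
    for x_g, a_g in zip(x, alpha):
        ll += math.lgamma(a_g + x_g) - math.lgamma(a_g)
    return ll

def predict(x, alphas):
    """Pick the class with the highest likelihood (uniform class prior assumed)."""
    return max(alphas, key=lambda label: dm_log_likelihood(x, alphas[label]))

# Toy data (invented): 4 cells x 3 genes, two cell types "A" and "B".
train_counts = [[9, 1, 0], [8, 2, 0], [0, 1, 9], [1, 0, 8]]
train_labels = ["A", "A", "B", "B"]

alphas = fit_class_alphas(train_counts, train_labels)
print(predict([7, 1, 0], alphas))
print(predict([0, 1, 6], alphas))
```

Because each cell type is summarized by a per-gene concentration vector, the genes with the largest entries can be read off directly for each type, which is one way such a model can be interpretable. The unsupervised mixture model mentioned in the abstract would additionally have to infer the labels, which this sketch does not attempt.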