Venice: A new algorithm for finding marker genes in single-cell transcriptomic data

Published: Nov. 17, 2020, 6:03 p.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.16.384479v1?rss=1

Authors: Vuong, H., Truong, T., Phan, T., Pham, S.

Abstract: Most widely used tools for finding marker genes in single-cell data (SeuratT/NegBinom/Poisson, CellRanger, edgeR, limmatrend) use a conventional definition of differentially expressed genes: genes with different mean expression values. However, in single-cell data, a cell population can be a mixture of many cell types or cell states, so the mean expression of a gene cannot represent the whole population. In addition, these tools assume that the gene expression of a population follows a specific family of distributions, an assumption that is often violated in single-cell data. In this work, we define the marker genes of a cell population as genes that can be used to distinguish cells in the population from cells in other populations. Besides log-fold change, we devise a new metric to classify genes into up-regulated, down-regulated, and transitional states. In a benchmark for finding up-regulated and down-regulated genes, our tool outperforms all compared methods, including Seurat, ROTS, scDD, edgeR, MAST, limma, the t-test, the Wilcoxon test, and the Kolmogorov-Smirnov test. Our method is also much faster than all compared methods and therefore enables interactive analysis of large single-cell data sets in BioTuring Browser. The Venice algorithm is available in the Signac package: https://github.com/bioturing/signac

Copyright belongs to the original authors. Visit the link for more info.
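The abstract's argument that a difference in means can miss a gene that still separates two populations can be illustrated with a toy sketch. This is not the Venice algorithm or the Signac API; the function names, the single-threshold classifier, and the expression values below are all assumptions made up for illustration.

```python
from statistics import mean


def mean_difference(a, b):
    """Conventional view: compare the mean expression of two populations."""
    return mean(a) - mean(b)


def best_threshold_accuracy(a, b):
    """Hypothetical separability score (illustration only, not Venice).

    Tries every observed value as a cut-off, labels cells on one side as
    population `a` and the rest as population `b`, and returns the best
    classification accuracy over all cut-offs and both orientations.
    """
    total = len(a) + len(b)
    best = 0.0
    for t in sorted(set(a) | set(b)):
        # Orientation 1: values above t are called `a`, the rest `b`.
        correct = sum(x > t for x in a) + sum(x <= t for x in b)
        acc = correct / total
        # Orientation 2 is the exact complement, so its accuracy is 1 - acc.
        best = max(best, acc, 1 - acc)
    return best


# Made-up expression values for one gene.
# `inside` mixes two cell states (off and high); `outside` is uniformly mid.
inside = [0, 0, 4, 4]
outside = [2, 2, 2, 2]

print(mean_difference(inside, outside))          # 0: the means are identical
print(best_threshold_accuracy(inside, outside))  # 0.75: better than the 0.5 chance level
```

In this toy case the gene would be invisible to a test based only on means, yet a simple cut-off still assigns three out of four cells to the correct population.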