A cloud-based platform for the analysis of single cell RNA sequencing data.

Published: Sept. 29, 2020, 7:01 p.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.09.28.317719v1?rss=1

Authors: Joshy, N., Yun, K.

Abstract:

Motivation: Single-cell RNA sequencing (scRNA-seq) is a recent technology that has provided many valuable biological insights. Notable uses include identifying novel cell types, measuring the cellular response to treatment, and tracking the trajectories of distinct cell lineages over time. The raw data generated in this process typically amount to hundreds of millions of sequencing reads and require substantial computational infrastructure for downstream analysis, a major hurdle for a biological research lab. Fortunately, the preprocessing step that converts this large volume of sequence data into manageable cell-specific expression profiles is standardized and can be performed in the cloud. We demonstrate how a cloud-based computational framework can be used to transform the raw data into biologically interpretable cell-type-specific information, using either 3' or 5' transcriptome libraries from 10x Genomics. The processed data, which are an order of magnitude smaller in size, can be easily downloaded to a laptop for customized analysis to gain deeper biological insights.

Results: We produced an automated and easily extensible cloud pipeline for the analysis of single-cell RNA-seq data, which provides a convenient way to post-process scRNA-seq data generated on next-generation sequencing platforms. The basic step transforms the scRNA-seq data into cell-type-specific expression profiles and computes quality control metrics for the dataset. The extensibility of the platform is demonstrated by adding a doublet-removal algorithm and recomputing the clustering of the cells. Any additional computational step that takes a cell-type expression counts matrix as input can be added to this framework with minimal effort.
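
The extensibility described in the abstract, where every added step takes an expression counts matrix as input, can be illustrated with a minimal Python sketch. This is not the authors' framework or its API: the names `qc_metrics`, `filter_low_quality`, and `run_pipeline`, the cells-by-genes list layout, and the toy counts are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List

# Hypothetical layout: rows are cells, columns are genes.
Counts = List[List[int]]
Step = Callable[[Counts], Counts]


def qc_metrics(counts: Counts) -> List[Dict[str, int]]:
    """Per-cell QC: total counts and number of genes with nonzero counts."""
    return [
        {"total_counts": sum(row), "genes_detected": sum(1 for c in row if c > 0)}
        for row in counts
    ]


def filter_low_quality(min_genes: int) -> Step:
    """Build a step that drops cells with fewer than min_genes detected genes."""
    def step(counts: Counts) -> Counts:
        return [row for row in counts if sum(1 for c in row if c > 0) >= min_genes]
    return step


def run_pipeline(counts: Counts, steps: List[Step]) -> Counts:
    """Apply each step in order; every step maps a counts matrix to a counts matrix."""
    for step in steps:
        counts = step(counts)
    return counts


# Toy data (invented for this example): 3 cells x 4 genes.
counts = [
    [5, 0, 3, 1],
    [0, 0, 1, 0],
    [2, 4, 0, 7],
]
filtered = run_pipeline(counts, [filter_low_quality(min_genes=2)])
```

Because each step shares the same signature, a further step (for example a doublet filter) could be appended to the `steps` list without changing `run_pipeline`.
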
Availability: The framework and its installation documentation are available at the GitHub repository http://github.com/nj3252/CB-Source/

Copyright belongs to the original authors.