LongGeneDB: a data hub for long genes

Published: Sept. 9, 2020, 6:02 a.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.09.08.281220v1?rss=1

Authors: Kim, Y., Naghavi, M., Zhao, J. Y.

Abstract: The human genome contains more than 4000 genes that are longer than 100 kb. These long genes require more time and resources to transcribe than shorter genes do, and they have been linked to various human diseases. Long genes use specific mechanisms to facilitate their transcription and co-transcriptional processes, which results in unique features in their multi-omics profiles. Although these profiles are important for understanding long genes, no database yet provides an integrated view of them or easy access to them. We leveraged publicly accessible multi-omics data and systematically analyzed the genomic conservation, histone modifications, chromatin organization, tissue-specific transcriptome, and single-cell transcriptome of 992 protein-coding genes that are longer than 200 kb in the mouse genome. We also examined the evolutionary history of their gene lengths in 15 species that belong to six classes and 11 orders. To share the multi-omics profiles of long genes, we developed a user-friendly database, LongGeneDB (https://longgenedb.com), where users can search, browse, and download these profiles. LongGeneDB will be a useful data hub for the biomedical research community to understand long genes.

Copyright belongs to the original authors. Visit the link for more information.