accuEnhancer: Accurate enhancer prediction by integration of multiple cell type data with deep learning

Published: Nov. 11, 2020, 9:04 p.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.11.10.375717v1?rss=1

Authors: Tung, Y.-A., Yang, W.-T., Hsieh, T.-T., Chang, Y.-C., Wu, J.-T., Oyang, Y.-J., Chen, C.-Y.

Abstract: Enhancers are a class of regulatory elements that have been shown to act as key components assisting promoters in modulating gene expression in living cells. At present, the number of enhancers and their activities in different cell types remain largely unclear. Previous studies have shown that enhancer activities are associated with various functional data, such as histone modifications, sequence motifs, and chromatin accessibility. In this study, we utilized DNase data to build a deep learning model for predicting H3K27ac peaks as the active enhancers in a target cell type. We propose joint training on multiple cell types to boost model performance in predicting the enhancer activities of an unstudied cell type. The results demonstrate that by incorporating more datasets across different cell types, deep learning models can capture complex regulatory patterns and prediction accuracy can be substantially improved. The analyses conducted in this study demonstrate that cell type-specific enhancer activity can be predicted by joint learning of multiple cell type data, using only DNase data and the primitive sequences as input features. This reveals the importance of cross-cell type learning, and the constructed model can be applied to investigate potential active enhancers of a novel cell type that does not yet have H3K27ac modification data.

Availability: The accuEnhancer package can be freely accessed at: https://github.com/callsobing/accuEnhancer

Copyright belongs to the original authors. Visit the link for more information.
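The abstract names two input features (the DNA sequence and DNase data) and a joint-training setup that pools several cell types. As a rough illustration only, the sketch below shows one way such inputs could be assembled in plain Python. It is not taken from the accuEnhancer package: the function names (`one_hot`, `build_features`, `pool_cell_types`), the A/C/G/T channel order, the per-base DNase signal, and the 0/1 H3K27ac label are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple

# Channel order is an assumption for illustration, not the package's convention.
BASES = "ACGT"

Features = List[List[float]]
Example = Tuple[Features, int]  # (feature matrix, hypothetical 0/1 H3K27ac label)


def one_hot(seq: str) -> Features:
    """Encode a DNA string as a len(seq) x 4 matrix; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def build_features(seq: str, dnase: Sequence[float]) -> Features:
    """Append a per-base DNase signal as a fifth channel to the one-hot sequence."""
    if len(seq) != len(dnase):
        raise ValueError("sequence and DNase signal must have the same length")
    return [row + [float(sig)] for row, sig in zip(one_hot(seq), dnase)]


def pool_cell_types(examples_by_cell_type: Dict[str, List[Example]]) -> List[Example]:
    """Merge labelled examples from several cell types into one training set."""
    pooled: List[Example] = []
    for cell_type in sorted(examples_by_cell_type):  # sorted for a deterministic order
        pooled.extend(examples_by_cell_type[cell_type])
    return pooled
```

Because the DNase channel differs between cell types while the sequence channels do not, a single model trained on the pooled set sees the same region with different accessibility profiles and labels, which is the intuition behind joint training described in the abstract.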