HiCRep.py: Fast comparison of Hi-C contact matrices in Python

Published: Oct. 28, 2020, 9:02 a.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.10.27.357756v1?rss=1

Authors: Lin, D., Sanders, J., Noble, W. S.

Abstract: Hi-C is the most widely used assay for investigating genome-wide 3D organization of chromatin. When working with Hi-C data, it is often useful to calculate the similarity between contact matrices in order to assess experimental reproducibility or to quantify relationships among Hi-C data from related samples. The HiCRep algorithm has been widely adopted for this task, but the existing R implementation suffers from run time limitations on high resolution Hi-C data or on large single-cell Hi-C datasets. We introduce a Python implementation of HiCRep and demonstrate that it is much faster than the existing R implementation. Furthermore, we give examples of HiCRep's ability to accurately distinguish replicates from non-replicates and to reveal cell type structure among collections of Hi-C data. HiCRep.py and its documentation are available with a GPL license at https://github.com/Noble-Lab/hicrep. The software may be installed automatically using the pip package installer.

Copyrights belong to the original authors. Visit the link for more info.
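To give a feel for what "similarity between contact matrices" can mean in practice, here is a minimal sketch of a distance-stratified correlation score. It is not the HiCRep.py code and does not reproduce the full HiCRep algorithm: the function name `stratum_weighted_correlation`, the `max_offset` parameter, and the weighting scheme (stratum size times the product of standard deviations) are assumptions chosen for illustration, and no smoothing or normalization is applied.

```python
import numpy as np


def stratum_weighted_correlation(m1, m2, max_offset):
    """Illustrative similarity score between two square contact matrices.

    Hypothetical helper, not part of HiCRep.py. For each diagonal offset k
    (a proxy for genomic distance), the Pearson correlation between the two
    matrices' k-th diagonals is computed, and the per-stratum correlations
    are combined in a weighted average.
    """
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    if m1.ndim != 2 or m1.shape[0] != m1.shape[1] or m1.shape != m2.shape:
        raise ValueError("inputs must be square matrices of the same shape")

    weighted_sum = 0.0
    weight_total = 0.0
    for k in range(1, max_offset + 1):
        d1 = np.diagonal(m1, offset=k)
        d2 = np.diagonal(m2, offset=k)
        if d1.size < 2:
            break  # too few entries left to correlate
        s1, s2 = d1.std(), d2.std()
        if s1 == 0 or s2 == 0:
            continue  # correlation is undefined for a constant stratum
        r = np.corrcoef(d1, d2)[0, 1]
        # Assumed weighting: larger, more variable strata count for more.
        w = d1.size * s1 * s2
        weighted_sum += w * r
        weight_total += w

    return weighted_sum / weight_total if weight_total > 0 else float("nan")
```

Stratifying by diagonal compares contacts only with other contacts at the same distance from the main diagonal, so the strong decay of contact frequency with distance does not dominate the score. For real analyses, use the released HiCRep.py package rather than this sketch.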