Reconfiguring Okazaki fragment start sites on a genome by using a data driven approach

Published: Sept. 30, 2020, 2:01 p.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.09.29.317842v1?rss=1

Authors: Soffer, A., Ifrach, M., Ilic, S., Afek, A., Vilenchik, D., Akabayov, B.

Abstract: DNA primase is an essential enzyme that synthesizes short RNA primers on specific DNA sequences. These RNA primers are elongated by DNA polymerase to form Okazaki fragments on the lagging DNA strand. It is therefore reasonable to assume that the binding of DNA primase on a genome marks the start sites of the Okazaki fragments. It has long been known that the frequency of the occurrence of primase trinucleotide recognition on a genome sequence has no influence on the size of the Okazaki fragments. The unresolved enigma that we address in this study is therefore why some, but not all, primase-DNA recognition sequences (PDRSs) become Okazaki fragment start sites. To this end, we applied machine-learning algorithms to analyze a massive amount of data obtained from protein-DNA binding microarrays (PBM) with the aim of identifying the essential elements on DNA that are needed for the binding of bacteriophage T7 primase. A PBM data learning algorithm enabled the prediction of binding values of T7 primase for any given DNA sequence with unprecedented accuracy and flexibility. On the basis of the principles learned about DNA-primase binding, we generated novel DNA sequences with improved binding of T7 primase and improved RNA primer synthesis, as validated experimentally.

Copyright belongs to the original authors. Visit the link for more info.
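To make the abstract's point about recognition-site frequency concrete, here is a minimal Python sketch that lists every occurrence of a trinucleotide recognition site on one strand and the spacing between consecutive sites. It is an illustration only, not the authors' machine-learning method. The motif 5'-GTC-3' is used as the T7 primase recognition trinucleotide, which is background knowledge not stated in the abstract, and the example sequence and the function names `find_sites` and `site_spacings` are hypothetical.

```python
def find_sites(sequence: str, motif: str = "GTC") -> list[int]:
    """Return 0-based start positions of every occurrence of `motif`.

    Assumption: "GTC" stands in for the T7 primase recognition
    trinucleotide; only the strand given is scanned.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping occurrences are also found.
        start = sequence.find(motif, start + 1)
    return positions


def site_spacings(positions: list[int]) -> list[int]:
    """Return distances between consecutive site start positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Made-up example sequence, for illustration only.
example = "AAGTCTTTGTCAGGGTCC"
sites = find_sites(example)
print(sites)                 # [2, 8, 14]
print(site_spacings(sites))  # [6, 6]
```

Candidate sites in a real genome are typically spaced far more closely than Okazaki fragment lengths, which is why site frequency alone cannot explain which sites are used, the question the study addresses with binding data.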