Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.09.03.282210v1?rss=1
Authors: Shen, J., Mei, J., Wallden, M., Ino, F.
Abstract: FreeSurfer is among the most widely used software suites for the study of cortical and subcortical brain anatomy. However, FreeSurfer analyses can be time-consuming, and the suite has lacked support for graphics processing units (GPUs) since the core development team stopped maintaining its GPU-accelerated versions because of the significant programming cost. Because FreeSurfer is a large project with millions of source lines, we introduce and examine a directive-based framework, OpenACC, for GPU acceleration of FreeSurfer, and we find that the OpenACC-based approach significantly reduces programming costs. Moreover, because the overhead of CPU-to-GPU data transfer is the major challenge in delivering high-performance GPU code, we compare two schemes, copy-and-transfer and overlapped-fully-transfer, for reducing that overhead. Experimental results show that the target function accelerated with the overlapped-fully-transfer scheme ran 2.3x as fast as the original CPU-based function, and the GPU-accelerated program achieved an average speedup of 1.2x over the original CPU-based program. These results demonstrate the usefulness and potential of the proposed OpenACC-based approach for integrating GPU support into FreeSurfer; the approach can be easily extended to other computationally expensive functions and modules of FreeSurfer to achieve further speedup.
Copyright belongs to the original authors. Visit the link for more info.
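For listeners unfamiliar with directive-based GPU programming, the following is a minimal sketch of what OpenACC annotations on a C loop can look like. It is not code from the paper or from FreeSurfer: the functions `scale_offset` and `scale_offset_chunked`, their parameters, and the chunking strategy are all hypothetical. The second function shows a generic way to overlap data transfer with computation using asynchronous queues; it illustrates the general idea of hiding transfer overhead and should not be read as the paper's copy-and-transfer or overlapped-fully-transfer scheme.

```c
#include <stddef.h>

/* Hypothetical element-wise kernel (not a FreeSurfer function).
 * The copyin/copyout clauses state the host-to-device and
 * device-to-host transfers explicitly. */
void scale_offset(const float *in, float *out, size_t n, float a, float b)
{
    #pragma acc parallel loop copyin(in[0:n]) copyout(out[0:n])
    for (size_t i = 0; i < n; ++i) {
        out[i] = a * in[i] + b;
    }
}

/* Same computation, split into chunks (assumed strategy, for illustration).
 * Each chunk's upload, kernel, and download are issued on one of two
 * async queues, so transfers for one chunk may overlap with compute
 * for another. */
void scale_offset_chunked(const float *in, float *out, size_t n,
                          float a, float b, size_t chunk)
{
    if (chunk == 0) {
        chunk = n; /* fall back to a single chunk */
    }

    /* Allocate device buffers once; data moves chunk by chunk below. */
    #pragma acc data create(in[0:n], out[0:n])
    {
        size_t q = 0;
        for (size_t start = 0; start < n; start += chunk, ++q) {
            size_t len = (n - start < chunk) ? (n - start) : chunk;
            int queue = (int)(q % 2);

            #pragma acc update device(in[start:len]) async(queue)
            #pragma acc parallel loop present(in[0:n], out[0:n]) async(queue)
            for (size_t i = start; i < start + len; ++i) {
                out[i] = a * in[i] + b;
            }
            #pragma acc update self(out[start:len]) async(queue)
        }
        /* Block until every queued transfer and kernel has finished. */
        #pragma acc wait
    }
}
```

Because the directives are pragmas, a compiler without OpenACC enabled ignores them and the code runs serially on the CPU with the same results, which is part of why this style can keep porting costs low for a large codebase. With an OpenACC-capable compiler (for example `nvc -acc` or `gcc -fopenacc`), the loops can be offloaded without restructuring the source.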