Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.04.24.058958v1?rss=1
Authors: Fulcher, B. D., Arnatkeviciute, A., Fornito, A.
Abstract: The recent availability of whole-brain atlases of gene expression, which quantify the transcriptional activity of thousands of genes across many different brain regions, has opened new opportunities to understand how gene-expression patterns relate to spatially varying properties of brain structure and function. To aid interpretation of a given neural phenotype, gene-set enrichment analysis (GSEA) has become a standard statistical methodology to identify functionally related groups of genes, annotated using systems such as the Gene Ontology (GO), that are associated with a given phenotype. While GSEA has identified functional groups of genes related to diverse aspects of brain structure and function in mouse and human, here we show that these results are affected by substantial statistical biases. Quantifying the false-positive rates of individual GO categories across an ensemble of completely random phenotypic spatial maps, we found an average 875-fold inflation of significant findings relative to expectation in mouse, and a 582-fold inflation in human, with some categories being judged as significant for over 20% of random phenotypes. Concerningly, the probability of a GO category being reported as significant in the extant literature increases with its estimated false-positive rate, suggesting that published reports are strongly affected by the reporting of false-positive bias. We show that the bias is primarily driven by gene-gene coexpression and spatial autocorrelation in transcriptional data, which are not accounted for in conventional GSEA nulls, and we introduce flexible ensemble-based null models that properly account for these effects.
Using case studies of structural connectivity degree in mouse and human, we demonstrate that many GO categories that would conventionally be judged as highly significant are in fact consistent with ensembles of random phenotypes. Our results highlight major pitfalls with applying standard GSEA to brain-wide transcriptomic data and outline a solution to this pervasive problem, which is made available as a toolbox.
Copyright belongs to the original authors. Visit the link for more information.
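As a rough illustration of the comparison described in the abstract, the following toy simulation contrasts a conventional random-gene null with an ensemble-of-random-phenotypes null for a single coexpressed gene category. It is a minimal sketch, not the authors' toolbox: the function names, the synthetic expression matrix, the 0.7/0.3 mixing weights, the use of the absolute mean correlation as the category score, and all sizes are assumptions chosen for illustration. Spatial autocorrelation is not modelled here, so only the coexpression effect is shown.

```python
import numpy as np


def gene_scores(expr, phenotype):
    """Pearson correlation between each gene's spatial map and the phenotype."""
    ez = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    pz = (phenotype - phenotype.mean()) / phenotype.std()
    return ez.T @ pz / len(pz)


def false_positive_rates(n_regions=50, n_genes=500, category_size=20,
                         n_test=200, n_null=300, alpha=0.05, seed=0):
    """Fraction of pure-noise phenotypes for which a coexpressed category is
    called significant under (conventional null, ensemble null)."""
    rng = np.random.default_rng(seed)

    # Hypothetical expression matrix (regions x genes). The first
    # `category_size` genes share a spatial component, so they are coexpressed.
    expr = rng.standard_normal((n_regions, n_genes))
    shared = rng.standard_normal((n_regions, 1))
    expr[:, :category_size] = 0.7 * shared + 0.3 * expr[:, :category_size]
    category = np.arange(category_size)

    # Ensemble null: the category's score under many random phenotypes.
    ensemble_null = np.array([
        abs(gene_scores(expr, rng.standard_normal(n_regions))[category].mean())
        for _ in range(n_null)
    ])

    hits_conventional = 0
    hits_ensemble = 0
    for _ in range(n_test):
        phenotype = rng.standard_normal(n_regions)  # random map, no true signal
        scores = gene_scores(expr, phenotype)
        observed = abs(scores[category].mean())

        # Conventional null: mean score of random gene sets of equal size,
        # which ignores the coexpression among the category's genes.
        random_sets = rng.random((n_null, n_genes)).argsort(axis=1)[:, :category_size]
        gene_null = np.abs(scores[random_sets].mean(axis=1))
        p_conventional = (np.sum(gene_null >= observed) + 1) / (n_null + 1)

        # Ensemble null: compare against the same category's scores
        # obtained from independent random phenotypes.
        p_ensemble = (np.sum(ensemble_null >= observed) + 1) / (n_null + 1)

        hits_conventional += p_conventional < alpha
        hits_ensemble += p_ensemble < alpha

    return hits_conventional / n_test, hits_ensemble / n_test


if __name__ == "__main__":
    fpr_conventional, fpr_ensemble = false_positive_rates()
    print(f"conventional random-gene null: {fpr_conventional:.2f}")
    print(f"ensemble-of-phenotypes null:   {fpr_ensemble:.2f}")
```

In this toy setting the random-gene null should flag the coexpressed category far more often than the nominal 5% level, while the ensemble null should stay close to it. The real effect sizes reported in the abstract come from actual mouse and human atlases and are not reproduced by this sketch.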