Permutation tests for hypothesis testing with animal social data: problems and potential solutions

Published: Aug. 4, 2020, 6:01 p.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.08.02.232710v1?rss=1

Authors: Farine, D. R., Carter, G. G.

Abstract: Generating insights about a null hypothesis requires not only a good dataset, but also statistical tests that are reliable and actually address the null hypothesis of interest. Recent studies have found that permutation tests, which are widely used to test hypotheses when working with animal social network data, can suffer from high rates of type I error (false positives) and type II error (false negatives). Here, we first outline why pre-network and node permutation tests have elevated type I and II error rates. We then propose a new procedure, the double permutation test, that addresses some of the limitations of existing approaches by combining pre-network and node permutations. We conduct a range of simulations, allowing us to estimate error rates under different scenarios, including errors caused by confounding effects of social or non-social structure in the raw data. We show that double permutation tests avoid elevated type I errors, while remaining sufficiently sensitive to avoid elevated type II errors. By contrast, the existing solutions we tested, including node permutations, pre-network permutations, and regression models with control variables, all exhibit elevated errors under at least one set of simulated conditions. Type I error rates from double permutation remain close to 5% in the same scenarios where type I error rates from pre-network permutation tests exceed 30%. The double permutation test provides a potential solution to issues arising from elevated type I and type II error rates when testing hypotheses with social network data. We also discuss other approaches, including restricted node permutations, testing multiple null hypotheses, and splitting large datasets to generate replicated networks, that can strengthen our ability to make robust inferences. Finally, we highlight ways that uncertainty can be explicitly considered during the analysis using permutation-based or Bayesian methods.

Copyright belongs to the original authors. Visit the link for more info.
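
As a rough illustration of how pre-network and node permutations might be combined, here is a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the function names are hypothetical, the test statistic is a simple correlation between a node-level metric and a trait, the "expected" value of each node's metric is taken as the median across pre-network permutations, and the pre-network permuted metrics are assumed to have been computed already by the caller, since how the raw observations are shuffled depends on the study design.

```python
import numpy as np


def node_permutation_p(metric, trait, n_perm=1000, rng=None):
    """Node permutation test: shuffle trait values across nodes and compare
    the observed correlation with the resulting null distribution.

    Returns (observed correlation, two-sided p-value).
    """
    rng = np.random.default_rng(rng)
    metric = np.asarray(metric, dtype=float)
    trait = np.asarray(trait, dtype=float)

    observed = np.corrcoef(metric, trait)[0, 1]
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffling the trait breaks any link between node identity and metric.
        null[i] = np.corrcoef(metric, rng.permutation(trait))[0, 1]

    # Add-one correction so the p-value is never exactly zero.
    p = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p


def double_permutation_p(observed_metric, prenetwork_metrics, trait,
                         n_perm=1000, rng=None):
    """Sketch of a double permutation test (assumed form, see lead-in).

    prenetwork_metrics: array of shape (n_prenetwork_perms, n_nodes) holding
    the node metric recomputed from each pre-network permutation of the raw
    data. These are supplied by the caller.
    """
    observed_metric = np.asarray(observed_metric, dtype=float)
    prenetwork_metrics = np.asarray(prenetwork_metrics, dtype=float)

    # Assumption: each node's expected value under the pre-network null is
    # the median of its permuted metric values.
    expected = np.median(prenetwork_metrics, axis=0)
    residuals = observed_metric - expected

    # The residuals then replace the raw metric in a node permutation test.
    return node_permutation_p(residuals, trait, n_perm=n_perm, rng=rng)
```

Calling `double_permutation_p(metric, prenetwork_metrics, trait)` returns the correlation between the residuals and the trait together with a permutation p-value. A regression coefficient or any other statistic could be used in place of the correlation.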