by Pierre J. C. Chuard, Milan Vrtílek, Megan L. Head, Michael D. Jennions
There is increasing concern about poor scientific practices arising from an excessive focus on P-values. Two particularly worrisome practices are selective reporting of significant results and ‘P-hacking’, the manipulation of data collection, usage, or analyses to obtain statistically significant outcomes. Here, we introduce the novel, to our knowledge, concepts of selective reporting of nonsignificant results and ‘reverse P-hacking’, whereby researchers ensure that tests produce a nonsignificant result. We test whether these practices occur in experiments in which researchers randomly assign subjects to treatment and control groups to minimise differences in confounding variables that might affect the focal outcome. By chance alone, 5% of tests for a group difference in confounding variables should yield a significant result (P < 0.05). If researchers report significant findings less often, and/or reverse P-hack to avoid significant outcomes that undermine the ethos that experimental and control groups differ only with respect to actively manipulated variables, then significant results from tests for group differences should be under-represented in the literature. We surveyed the behavioural ecology literature and found that the proportion of nonsignificant P-values reported for tests of group differences in potentially confounding variables significantly exceeded the expected 95% (P = 0.005; N = 250 studies). This novel, to our knowledge, publication bias could result from selective reporting of nonsignificant results and/or from reverse P-hacking. We encourage others to test for a bias toward publishing nonsignificant results in the equivalent context in their own research discipline.
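
For readers who wish to run the equivalent check in their own discipline, the expectation stated above (5% of tests significant by chance alone) can be compared with an observed count using a one-sided exact binomial test. The following is a minimal sketch only, not necessarily the procedure used in this study. The function name `binom_lower_tail` and the counts `n_tests` and `n_significant` are hypothetical and chosen purely for illustration, and the sketch assumes that the reported tests are independent of one another.

```python
from math import comb


def binom_lower_tail(k: int, n: int, p: float = 0.05) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p).

    Here X is the number of significant tests among n independent tests,
    each with probability p of being significant by chance alone.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical counts for illustration only; these are NOT the study's data.
n_tests = 200        # assumed number of reported tests for group differences
n_significant = 3    # assumed number of those tests with P < 0.05

# One-sided P-value: the probability of observing this few (or fewer)
# significant tests if 5% are truly expected to be significant by chance.
p_value = binom_lower_tail(n_significant, n_tests)
print(f"Observed proportion significant: {n_significant / n_tests:.3f}")
print(f"One-sided exact binomial P-value: {p_value:.4f}")
```

A small P-value from such a test would indicate that significant results are under-represented relative to the 5% expected under random assignment. It would not, by itself, distinguish selective reporting from reverse P-hacking.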