Neutralise: an open science initiative for neutral comparison of two-sample tests
Author(s): Leyla Kodalci, Olivier Thas
Affiliation(s): I-BioStat, Data Science Institute, Hasselt University, Agoralaan Gebouw D, B-3590 Diepenbeek, Belgium; Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Krijgslaan 281, B-9000 Gent, Belgium; National Institute of Applied Statistics Research Australia (NIASRA), University of Wollongong, Northfields Avenue, NSW 2522, Australia
The two-sample problem is one of the earliest problems in statistics: given two samples, the question is whether or not the observations were sampled from the same distribution. Many statistical tests have been developed for this problem and evaluated in simulation studies, but hardly any study has attempted a neutral comparison. We introduce an open science initiative that potentially allows for neutral comparisons of two-sample tests. Our initiative makes use of an open-source R package and a repository on GitHub. The central idea is that anyone can submit a new method and/or a new simulation scenario, and the system evaluates (1) the new method on all previously submitted simulation scenarios, and (2) all previously submitted methods on the new scenarios. The simulation results can be posted on the public GitHub repository. In principle this framework can be tailored to any class of statistical hypothesis tests, but here we present the method for the evaluation of hypothesis tests for the two-sample problem, which concerns the null hypothesis that the distribution of an outcome is the same in two populations. The R package also contains functions for visualising and summarising the simulation results. These functions are also implemented in an R Shiny app that is run from an R Shiny server (https://dsi-uhasselt.shinyapps.io/Neutralise/).
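To make the workflow concrete, the sketch below illustrates in plain R, without using the actual Neutralise submission API, the two kinds of objects the initiative compares: a "method" (a function that takes two samples and returns a p-value) and a "scenario" (a function that generates two samples under a chosen data-generating mechanism). The names my_test and my_scenario, as well as the variance-shift scenario, are illustrative assumptions; the actual submission format is defined in the R package and the GitHub repository.

```r
## A minimal sketch, not the Neutralise API itself, of the two components
## a neutral comparison puts together: a two-sample test and a scenario.

# A "method": returns a p-value for H0: both samples come from the same
# distribution. The classical Wilcoxon rank-sum test stands in here for a
# newly submitted test.
my_test <- function(x, y) {
  wilcox.test(x, y)$p.value
}

# A "scenario": generates two samples. This hypothetical example draws both
# groups from normal distributions with equal means but different variances,
# a setting in which many location tests lose power.
my_scenario <- function(n1 = 20, n2 = 20) {
  list(x = rnorm(n1, mean = 0, sd = 1),
       y = rnorm(n2, mean = 0, sd = 2))
}

# A neutral comparison then amounts to repeatedly generating data from each
# scenario and recording how often each method rejects at a given level.
set.seed(1)
rejections <- replicate(1000, {
  s <- my_scenario()
  my_test(s$x, s$y) <= 0.05
})
mean(rejections)  # estimated rejection probability (type I error or power)
```

In the initiative itself, this evaluation loop is run automatically for every combination of submitted method and submitted scenario, so that a newly added test or scenario is immediately compared against everything submitted before it.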