Last week I submitted a paper (written with EJ Wagenmakers) entitled “Hidden multiplicity in exploratory multiway ANOVA: Prevalence and remedies”. In a nutshell, we argue that without a priori hypotheses, a multiway ANOVA becomes an exploratory expedition in which the family of hypotheses comprises all hypotheses subject to test; as such, a multiway ANOVA harbors a multiple comparison problem. For example, in the case of two factors, three separate null hypotheses are tested (i.e., two main effects and one interaction). Consequently, if all three null hypotheses are true and the three tests are independent, the probability of at least one Type I error is about 14% (1 − 0.95³) rather than the nominal 5%. Statisticians are well aware of this problem.
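To make the 14% figure concrete, here is a minimal sketch of the arithmetic, assuming independent tests that are each run at the same alpha level and that all null hypotheses are true. The function name `familywise_error_rate` is my own and does not come from the paper.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type I error across n_tests independent
    tests, each run at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


# A two-way ANOVA tests 3 hypotheses (2 main effects + 1 interaction);
# a three-way ANOVA tests 7 (3 main effects + 3 two-way + 1 three-way interaction).
for n_tests in (1, 3, 7):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests} test(s): P(at least one Type I error) = {rate:.3f}")
```

With more factors the number of tests grows quickly, and so does the chance of at least one false positive.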
However, many psychology researchers are unaware of this lurking multiplicity problem and, as a result, almost never correct their alphas when using a multiway ANOVA in an exploratory fashion. We show this for 819 papers in six widely read and cited psychology journals: Journal of Experimental Psychology: General, Psychological Science, Journal of Abnormal Psychology, Journal of Consulting and Clinical Psychology, Journal of Experimental Social Psychology, and Journal of Personality and Social Psychology. In almost 50% of these papers, a multiway ANOVA was the main statistical analysis, underscoring the popularity of this testing procedure. Unfortunately, of these papers, only around 1% used a correction procedure (i.e., the omnibus test).
Fortunately, there are several ways to remedy this multiplicity problem, as we outline in the paper. First, one could use an omnibus F-test: in such a test, one pools the sums of squares and degrees of freedom for all main effects and interactions into a single F statistic. The individual F-tests should only be conducted if this omnibus null hypothesis is rejected. A major drawback of this method is that it does not control the family-wise Type I error under partial null conditions; as such, the method offers only weak protection against the multiplicity problem. Second, one could opt for controlling the family-wise error rate (FWER), for example by using the sequential Bonferroni correction method. While this method adequately controls the Type I error, the downside is that it reduces power. Third, one could choose to control the false discovery rate (FDR) instead, for example with the Benjamini-Hochberg correction method. This method results in more power compared to sequential Bonferroni, but at the expense of less control of the Type I error. Finally, preregistering hypotheses (for example at the Open Science Framework) forces the researcher to specify the hypotheses of interest beforehand. In that case, using the multiway ANOVA becomes a confirmatory expedition, and this potentially mitigates the multiple comparison problem. For example, consider experimental data to be analyzed with a two-way ANOVA: if the researcher stipulates in advance that the interest lies solely in the interaction, this reduces the number of tested hypotheses in the family from three to one, thereby diminishing the multiplicity problem.
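To illustrate the second and third remedies, here is a minimal sketch of the sequential Bonferroni (Holm) and Benjamini-Hochberg procedures applied to the three p-values of a two-way ANOVA. The function names and the p-values are made up for illustration and do not come from the paper; in practice one would probably use an existing routine from a statistics package.

```python
from typing import List, Sequence


def holm_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Sequential Bonferroni (Holm): step-down procedure controlling the FWER."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next with alpha/(m-1), etc.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-rejection; all larger p-values are retained
    return reject


def benjamini_hochberg_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg: step-up procedure controlling the FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that the k-th smallest p-value is <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    reject = [False] * m
    for idx in order[:max_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values for main effect A, main effect B, and the A x B interaction.
p_values = [0.02, 0.03, 0.04]
print("Uncorrected:       ", [p <= 0.05 for p in p_values])
print("Holm:              ", holm_reject(p_values))
print("Benjamini-Hochberg:", benjamini_hochberg_reject(p_values))
```

For these made-up p-values, all three effects are significant without correction and under Benjamini-Hochberg, whereas Holm rejects none of them, which illustrates the trade-off between power and Type I error control described above.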
The latest version of this paper can be found here.