Analysis of Variance (ANOVA)
ANOVA is used to compare the means of three or more samples.
While you could run multiple T-tests, every additional T-test makes you more likely to encounter a Type I error. If you use a significance level of 0.05 for each T-test, then once you have run three independent T-tests, the chance of at least one false positive is effectively 1 - 0.95^3 ≈ 0.143. ANOVA controls for this by testing all groups at once, so the Type I error rate remains at 5%.
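The 0.143 figure can be checked with a few lines of Python. This is a minimal sketch that assumes the tests are independent; the function name `familywise_error_rate` is ours, not part of any library.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n_tests independent tests."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all n_tests avoid one with probability (1 - alpha) ** n_tests.
    return 1.0 - (1.0 - alpha) ** n_tests


rate = familywise_error_rate(alpha=0.05, n_tests=3)
print(f"{rate:.3f}")  # 0.143
```

Comparing three groups pairwise takes exactly three T-tests, which is why three groups is already enough to inflate the error rate.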
An ANOVA will provide an F-statistic which can, along with degrees of freedom, be used to calculate a p value.
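To see how the F-statistic and the degrees of freedom turn into a p value, here is a rough sketch that computes them by hand. The three groups are made-up toy numbers chosen only to keep the arithmetic easy, and the variable names are our own. The only SciPy call is `scipy.stats.f.sf`, the survival function of the F distribution.

```python
from scipy import stats

# Made-up toy data (not a real data set): three groups of three values.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]

n_total = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n_total
group_means = [sum(g) / len(g) for g in groups]

# Between-group variation: how far each group mean sits from the grand mean.
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# Within-group variation: how far each value sits from its own group mean.
ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

df_between = len(groups) - 1       # number of groups minus one
df_within = n_total - len(groups)  # observations minus number of groups

# F is the ratio of the two mean squares.
f_stat = (ss_between / df_between) / (ss_within / df_within)
# p value: probability of an F at least this large if the null hypothesis holds.
p_value = stats.f.sf(f_stat, df_between, df_within)

print(f"F({df_between}, {df_within}) = {f_stat:.2f}, p = {p_value:.4f}")
# F(2, 6) = 13.00, p = 0.0066
```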
ANOVAs assume independence of observations, homogeneity of variances and normally distributed observations within groups.
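Two of these assumptions can be probed with standard tests; independence is a matter of study design and has to be judged from how the data were collected. The sketch below is one possible approach, not the only one: it uses Levene's test (`scipy.stats.levene`) for homogeneity of variances and the Shapiro-Wilk test (`scipy.stats.shapiro`) for normality within each group. The data are synthetic placeholders drawn with NumPy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Synthetic placeholder data: three groups of 20 normal draws with equal spread.
groups = [rng.normal(loc=mu, scale=1.0, size=20) for mu in (5.0, 5.5, 6.0)]

# Levene's test: the null hypothesis is that all groups have equal variances.
levene_stat, levene_p = stats.levene(*groups)

# Shapiro-Wilk test, one per group: the null hypothesis is normality.
shapiro_ps = []
for g in groups:
    _, p = stats.shapiro(g)
    shapiro_ps.append(p)

print(f"Levene p = {levene_p:.3f}")
print("Shapiro-Wilk p per group:", [round(p, 3) for p in shapiro_ps])
```

A small p value from either test is a warning that the corresponding assumption may not hold; a large p value does not prove that it does.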
ANOVA is implemented in SciPy.
We will use R’s Plant Growth Data Set for our ANOVA.
The null hypothesis is that there is no difference between the mean weights of dried plants under the control and the two treatment conditions.
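A minimal sketch of this test, using `scipy.stats.f_oneway`, might look as follows. The weights are typed in by hand from the `PlantGrowth` data set that ships with R (30 plants, 10 per group), so check them against your own copy; the variable names `ctrl`, `trt1` and `trt2` follow the group labels in that data set.

```python
from scipy import stats

# Dried plant weights, transcribed from R's PlantGrowth data set.
ctrl = [4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14]
trt1 = [4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69]
trt2 = [6.31, 5.12, 5.54, 5.50, 5.37, 5.29, 4.92, 6.15, 5.80, 5.26]

# One-way ANOVA across the three groups; returns the F-statistic and p value.
f_stat, p_value = stats.f_oneway(ctrl, trt1, trt2)

# Degrees of freedom: groups - 1, and observations - groups.
df_between = 3 - 1
df_within = len(ctrl) + len(trt1) + len(trt2) - 3

print(f"F({df_between}, {df_within}) = {f_stat:.3f}, p = {p_value:.4f}")
```

With these values the p value comes out below 0.05, so at the 5% level we would reject the null hypothesis. Note that ANOVA only says that at least one group mean differs; it does not say which one.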