The two-way ANOVA (analysis of variance) is a statistical method that allows us to **evaluate the simultaneous effect of two categorical variables on a quantitative continuous variable**.

The two-way ANOVA is an extension of the one-way ANOVA, since it allows us to evaluate the effects on a numerical response of **two** categorical variables instead of one. The advantage of a two-way ANOVA over a one-way ANOVA is that we test the relationship between two variables, while taking into account the effect of a third variable. Moreover, it also allows us to include the possible *interaction* of the two categorical variables on the response.

The advantage of a two-way over a one-way ANOVA is quite similar to the advantage of a multiple linear regression over a correlation:

- The correlation measures the relationship between two quantitative variables. The multiple linear regression also measures the relationship between two variables, but this time taking into account the potential effect of other covariates.
- The one-way ANOVA tests whether a quantitative variable is different between groups. The two-way ANOVA also tests whether a quantitative variable is different between groups, but this time taking into account the effect of another qualitative variable.

We have previously discussed the one-way ANOVA in R. Now, we show when, why, and how to perform a **two-way** ANOVA in R.

Before going further, I would like to mention and briefly describe some related statistical methods and tests, in order to avoid any confusion:

A Student’s t-test is used to evaluate the effect of one categorical variable on a quantitative continuous variable, **when the categorical variable has exactly 2 levels**:

- *Student’s t-test for independent samples* if the observations are **independent** (for example: if we compare the age between women and men)
- *Student’s t-test for paired samples* if the observations are **dependent**, that is, when they come in pairs (it is the case when the same subjects are measured twice, at two different points in time, before and after a treatment for instance)

To evaluate the effect of one categorical variable on a quantitative variable, **when the categorical variable has 3 or more levels**:

- *one-way ANOVA* (often simply referred to as ANOVA) if the groups are **independent** (for example, a group of patients who received treatment A, another group of patients who received treatment B, and a last group of patients who received no treatment or a placebo)
- *repeated measures ANOVA* if the groups are **dependent** (when the same subjects are measured three times, at three different points in time, before, during and after a treatment for instance)

A two-way ANOVA is used to evaluate the effects of two categorical variables (and their potential interaction) on a quantitative continuous variable. This is the topic of this post.

Linear regression is used to evaluate the relationship between a quantitative continuous dependent variable and one or several independent variables:

- simple linear regression if there is only one independent variable (which can be quantitative or qualitative)
- multiple linear regression if there are at least two independent variables (which can be quantitative, qualitative, or a mix of both)

An ANCOVA (analysis of covariance) is used to evaluate the effect of a categorical variable on a quantitative variable, while controlling for the effect of another quantitative variable (known as a covariate). ANCOVA is actually a special case of multiple linear regression with a mix of one qualitative and one quantitative independent variable.
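As a minimal sketch of that last point (with simulated toy data, not data from this post), an ANCOVA can indeed be fitted as a linear model with one factor and one covariate:

```r
# minimal sketch: ANCOVA as a linear model with one factor and one covariate
set.seed(42)
group <- factor(rep(c("A", "B"), each = 20)) # categorical variable
covariate <- rnorm(40)                       # quantitative covariate
y <- 2 + (group == "B") + 0.5 * covariate + rnorm(40) # simulated response

# lm() estimates the group effect while controlling for the covariate
ancova <- lm(y ~ group + covariate)
summary(ancova)
```

The coefficient on `group` is then the group effect adjusted for the covariate, which is exactly what an ANCOVA reports.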

In this post, we start by explaining when and why a two-way ANOVA is useful, we then do some preliminary descriptive analyses and present how to conduct a two-way ANOVA in R. Finally, we show how to interpret and visualize the results. We also briefly mention and illustrate how to check the underlying assumptions.

To illustrate how to perform a two-way ANOVA in R, we use the `penguins` dataset, available from the `{palmerpenguins}` package.

We do not need to import the dataset, but we need to load the package first and then call the dataset:

```r
# install.packages("palmerpenguins")
library(palmerpenguins)

dat <- penguins # rename dataset
str(dat) # structure of dataset
```

```
## tibble [344 × 8] (S3: tbl_df/tbl/data.frame)
##  $ species          : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ island           : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ bill_length_mm   : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
##  $ bill_depth_mm    : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
##  $ flipper_length_mm: int [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
##  $ body_mass_g      : int [1:344] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
##  $ sex              : Factor w/ 2 levels "female","male": 2 1 1 NA 1 2 1 2 NA NA ...
##  $ year             : int [1:344] 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...
```

The dataset contains 8 variables for 344 penguins, summarized below:

```r
summary(dat)
```

```
##       species          island    bill_length_mm  bill_depth_mm  
##  Adelie   :152   Biscoe   :168   Min.   :32.10   Min.   :13.10  
##  Chinstrap: 68   Dream    :124   1st Qu.:39.23   1st Qu.:15.60  
##  Gentoo   :124   Torgersen: 52   Median :44.45   Median :17.30  
##                                  Mean   :43.92   Mean   :17.15  
##                                  3rd Qu.:48.50   3rd Qu.:18.70  
##                                  Max.   :59.60   Max.   :21.50  
##                                  NA's   :2       NA's   :2      
##  flipper_length_mm  body_mass_g     sex           year     
##  Min.   :172.0     Min.   :2700   female:165   Min.   :2007  
##  1st Qu.:190.0     1st Qu.:3550   male  :168   1st Qu.:2007  
##  Median :197.0     Median :4050   NA's  : 11   Median :2008  
##  Mean   :200.9     Mean   :4202                Mean   :2008  
##  3rd Qu.:213.0     3rd Qu.:4750                3rd Qu.:2009  
##  Max.   :231.0     Max.   :6300                Max.   :2009  
##  NA's   :2         NA's   :2                                 
```

In this post, we will focus on the following three variables:

- `species`: the species of the penguin (Adelie, Chinstrap or Gentoo)
- `sex`: sex of the penguin (female and male)
- `body_mass_g`: body mass of the penguin (in grams)

If needed, more details about this dataset can be found by running `?penguins` in R.

`body_mass_g` is the quantitative continuous variable and will be the dependent variable, whereas `species` and `sex` are both qualitative variables.

These last two variables will be our independent variables, also referred to as factors. Make sure that they are read as factors by R. If that is not the case, they will need to be transformed into factors.
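In the `penguins` data, `sex` and `species` are already factors, but if a grouping variable were stored as character, the conversion is straightforward. A minimal sketch with toy data (not the penguins dataset):

```r
# toy data frame where the grouping variable is stored as character
df <- data.frame(
  sex = c("female", "male", "male"),
  body_mass_g = c(3400, 3800, 4100),
  stringsAsFactors = FALSE
)

is.factor(df$sex) # FALSE: read as character

df$sex <- as.factor(df$sex) # convert to factor
is.factor(df$sex) # TRUE
levels(df$sex) # "female" "male"
```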

As mentioned above, a two-way ANOVA is used to **evaluate simultaneously the effect of two categorical variables on one quantitative continuous variable**.

It is called a **two**-way ANOVA because we compare groups which are formed by **two** independent categorical variables.

Here, we would like to know whether body mass depends on species and/or sex. In particular, we are interested in:

- measuring and testing the relationship between species and body mass,
- measuring and testing the relationship between sex and body mass, and
- potentially checking whether the relationship between species and body mass is different for females and males (which is equivalent to checking whether the relationship between sex and body mass depends on the species).

The first two relationships are referred to as **main effects**, while the third point is known as the **interaction effect**.

The main effects test whether at least one group is different from another (while controlling for the other independent variable). The interaction effect, on the other hand, aims at testing whether the relationship between two variables differs *depending on the level of a third variable*.

When performing a two-way ANOVA, testing the interaction effect is not mandatory. However, omitting an interaction effect may lead to erroneous conclusions if the interaction effect is present.

If we come back to our example, we have the following hypothesis tests:

Main effect of sex on body mass:

- H0: mean body mass is equal between females and males
- H1: mean body mass is different between females and males

Main effect of species on body mass:

- H0: mean body mass is equal between all 3 species
- H1: mean body mass is different for at least one species

Interaction between sex and species:

- H0: there is no interaction between sex and species, meaning that the relationship between species and body mass is the same for females and males (similarly, the relationship between sex and body mass is the same for all 3 species)
- H1: there is an interaction between sex and species, meaning that the relationship between species and body mass is different for females than for males (similarly, the relationship between sex and body mass depends on the species)

Most statistical tests require some assumptions for the results to be valid, and a two-way ANOVA is no exception.

The assumptions of a two-way ANOVA are similar to those of a one-way ANOVA. To summarize:

- **Variable type**: the dependent variable must be quantitative continuous, while the two independent variables must be categorical (with at least two levels).
- **Independence**: the observations should be independent between groups and within each group.
- **Normality**:
  - For small samples, data should follow approximately a normal distribution.
  - For large samples (usually n ≥ 30 in each group/sample), normality is not required (thanks to the central limit theorem).
- **Equality of variances**: variances should be equal across groups.
- **Outliers**: there should be no significant outliers in any group.

More details about these assumptions can be found in the assumptions of a one-way ANOVA.

Now that we have seen the underlying assumptions of the two-way ANOVA, we review them specifically for our dataset before applying the test and interpreting the results.

The dependent variable body mass is quantitative continuous, while both independent variables sex and species are qualitative (with at least 2 levels).

Therefore, this assumption is met.

Independence is usually checked based on the design of the experiment and how the data have been collected.

To keep it simple, observations are usually:

- **independent** if each experimental unit (here a penguin) has been measured only once and the observations are collected from a representative and randomly selected portion of the population, or
- **dependent** if each experimental unit has been measured at least twice (as is often the case in the medical field, with two measurements on the same subjects: one before and one after the treatment).

In our case, body mass has been measured only once on each penguin, and on a representative and random sample of the population, so the independence assumption is met.

We have a large sample in all subgroups (each combination of the levels of the two factors, called a cell):

```r
table(dat$species, dat$sex)
```

```
##            
##             female male
##   Adelie        73   73
##   Chinstrap     34   34
##   Gentoo        58   61
```

so normality does not need to be checked.

For completeness, we still show how to verify normality, as if we had a small sample.

There are several methods to test the normality assumption, the most common ones being:

- a QQ-plot by group or on the residuals, and/or
- a histogram by group or on the residuals, and/or
- a normality test (Shapiro-Wilk test for instance) by group or on the residuals.

The easiest/shortest way is to verify normality with a QQ-plot on the residuals. To draw this plot, we first need to save the model:

```r
# save model
mod <- aov(body_mass_g ~ sex * species,
  data = dat
)
```

This piece of code will be explained later.

Now we can draw the QQ-plot on the residuals. We show two ways to do so, first with the `plot()` function and second with the `qqPlot()` function from the `{car}` package:

```r
# method 1
plot(mod, which = 2)

# method 2
library(car)

qqPlot(mod$residuals,
  id = FALSE # remove point identification
)
```

Code for method 1 is slightly shorter, but it misses the confidence interval around the reference line.

If points follow the straight line (called Henry’s line) and fall within the confidence band, we can assume normality. This is the case here.

If you prefer to verify normality based on a histogram of the residuals, here is the code:

```r
# histogram
hist(mod$residuals)
```

The histogram of the residuals shows a Gaussian distribution, which is in line with the conclusion from the QQ-plot.

Although the QQ-plot and histogram are largely enough to verify normality, if you want to test it more formally with a statistical test, the Shapiro-Wilk test can be applied to the residuals as well:

```r
# normality test
shapiro.test(mod$residuals)
```

```
## 
##  Shapiro-Wilk normality test
## 
## data:  mod$residuals
## W = 0.99776, p-value = 0.9367
```

⇒ We do not reject the null hypothesis that the residuals follow a normal distribution (p-value = 0.937).

From the QQ-plot, the histogram and the Shapiro-Wilk test, we conclude that we do not reject the null hypothesis of normality of the residuals.

The normality assumption is thus verified; we can now check the equality of the variances.

Equality of variances, also referred to as homogeneity of variances or homoscedasticity, can be verified visually with the `plot()` function:

```r
plot(mod, which = 3)
```

Since the spread of the residuals is constant, the red smooth line is horizontal and flat, so it looks like the constant variance assumption is satisfied here.

The diagnostic plot above is sufficient, but if you prefer, it can also be tested more formally with Levene’s test (also from the `{car}` package):

```r
leveneTest(mod)
```

```
## Levene's Test for Homogeneity of Variance (center = median)
##        Df F value Pr(>F)
## group   5  1.3908 0.2272
##       327               
```

⇒ We do not reject the null hypothesis that the variances are equal (p-value = 0.227).

Both the visual and formal approaches give the same conclusion; we do not reject the hypothesis of homogeneity of the variances.

The easiest and most common way to detect outliers is visually, thanks to boxplots by group.

For females and males:

```r
library(ggplot2)

# boxplots by sex
ggplot(dat) +
  aes(x = sex, y = body_mass_g) +
  geom_boxplot()
```

For the three species:

```r
# boxplots by species
ggplot(dat) +
  aes(x = species, y = body_mass_g) +
  geom_boxplot()
```

There are, as defined by the interquartile range criterion, two outliers for the species Chinstrap. These points are, however, not extreme enough to bias the results.

Therefore, we consider that the assumption of no significant outliers is met.
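For reference, the interquartile range criterion used above can also be checked by hand. A minimal sketch on a toy vector (not the penguins data):

```r
# points below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR are flagged as outliers
x <- c(10, 12, 11, 13, 12, 40) # 40 is an obvious outlier

q <- quantile(x, c(0.25, 0.75))
fences <- c(q[1] - 1.5 * diff(q), q[2] + 1.5 * diff(q))

x[x < fences[1] | x > fences[2]] # flagged points
boxplot.stats(x)$out # the points drawn beyond the whiskers of a boxplot
```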

We have shown that all assumptions are met, so we can now proceed to the implementation of the two-way ANOVA in R.

This will allow us to answer the following research questions:

- Controlling for species, is body mass significantly different between the two sexes?
- Controlling for sex, is body mass significantly different for at least one species?
- Is the relationship between species and body mass different between female and male penguins?

Before performing any statistical test, it is good practice to compute some descriptive statistics in order to have a first overview of the data, and perhaps, have a glimpse of the results to be expected.

This can be done via descriptive statistics or plots.

If we want to keep it simple, we can compute only the mean for each subgroup:

```r
# mean by group
aggregate(body_mass_g ~ species + sex,
  data = dat,
  FUN = mean
)
```

```
##     species    sex body_mass_g
## 1    Adelie female    3368.836
## 2 Chinstrap female    3527.206
## 3    Gentoo female    4679.741
## 4    Adelie   male    4043.493
## 5 Chinstrap   male    3938.971
## 6    Gentoo   male    5484.836
```

Or alternatively, the mean and standard deviation for each subgroup using the `{dplyr}` package:

```r
# mean and sd by group
library(dplyr)

group_by(dat, sex, species) %>%
  summarise(
    mean = round(mean(body_mass_g, na.rm = TRUE)),
    sd = round(sd(body_mass_g, na.rm = TRUE))
  )
```

```
## # A tibble: 8 × 4
## # Groups:   sex [3]
##   sex    species    mean    sd
##   <fct>  <fct>     <dbl> <dbl>
## 1 female Adelie     3369   269
## 2 female Chinstrap  3527   285
## 3 female Gentoo     4680   282
## 4 male   Adelie     4043   347
## 5 male   Chinstrap  3939   362
## 6 male   Gentoo     5485   313
## 7 <NA>   Adelie     3540   477
## 8 <NA>   Gentoo     4588   338
```

If you are a frequent reader of the blog, you know that I like to draw plots to visualize the data at hand before interpreting the results of a test.

The most appropriate plot when we have one quantitative and two qualitative variables is a boxplot by group. This can easily be made with the `{ggplot2}` package:

```r
# boxplot by group
library(ggplot2)

ggplot(dat) +
  aes(x = species, y = body_mass_g, fill = sex) +
  geom_boxplot()
```

Some observations are missing for sex; we can remove them to have a more concise plot:

```r
dat %>%
  filter(!is.na(sex)) %>%
  ggplot() +
  aes(x = species, y = body_mass_g, fill = sex) +
  geom_boxplot()
```

Note that we could also have made the following plot:

```r
dat %>%
  filter(!is.na(sex)) %>%
  ggplot() +
  aes(x = sex, y = body_mass_g, fill = species) +
  geom_boxplot()
```

But for a more readable plot, I tend to prefer putting the variable with the smallest number of levels as the color (which is in fact the argument `fill` in the `aes()` layer) and the variable with the largest number of categories on the x-axis (i.e., the argument `x` in the `aes()` layer).

From the means and the boxplots by subgroup, we can already see that, *in our sample*:

- female penguins tend to have a lower body mass than males, and this holds for all the considered species, and
- body mass is higher for Gentoo penguins than for the other two species.

Bear in mind that these conclusions are only valid within our sample! To generalize these conclusions to the population, we need to perform the two-way ANOVA and check the significance of the explanatory variables. This is the aim of the next section.

As mentioned earlier, including an interaction effect in a two-way ANOVA is not compulsory. However, in order to avoid flawed conclusions, it is recommended to first check whether the interaction is significant or not and, depending on the results, to include it or not.

If the interaction is not significant, it is safe to remove it from the final model. On the contrary, if the interaction is significant, it should be included in the final model, which will be used to interpret the results.

We thus start with a model that includes the two main effects (i.e., sex and species) and the interaction:

```r
# Two-way ANOVA with interaction
# save model
mod <- aov(body_mass_g ~ sex * species,
  data = dat
)

# print results
summary(mod)
```

```
##              Df    Sum Sq  Mean Sq F value   Pr(>F)    
## sex           1  38878897 38878897 406.145  < 2e-16 ***
## species       2 143401584 71700792 749.016  < 2e-16 ***
## sex:species   2   1676557   838278   8.757 0.000197 ***
## Residuals   327  31302628    95727                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 11 observations deleted due to missingness
```

The sum of squares (column `Sum Sq`) shows that species explains a large part of the variability in body mass. It is the most important factor in explaining this variability.

The p-values are displayed in the last column of the output above (`Pr(>F)`). From these p-values, we conclude that, at the 5% significance level:

- controlling for species, body mass is significantly different between the two sexes,
- controlling for sex, body mass is significantly different for at least one species, and
- the interaction between sex and species (displayed at the line `sex:species` in the output above) is significant.

From the significant interaction effect, we have just seen that the relationship between body mass and species is different for males and females. Since it is significant, we have to keep it in the model and we should interpret results from that model.

If, on the contrary, the interaction were not significant (that is, if the p-value ≥ 0.05), we would have removed this interaction effect from the model. For illustrative purposes, below is the code for a two-way ANOVA without interaction, referred to as an additive model:

```r
# Two-way ANOVA without interaction
aov(body_mass_g ~ sex + species,
  data = dat
)
```

For readers who are used to performing linear regressions in R, you will notice that the structure of the code for a two-way ANOVA is in fact similar:

- the formula is `dependent variable ~ independent variables`
- the `+` sign is used to include independent variables *without* an interaction
- the `*` sign is used to include independent variables *with* an interaction

The resemblance with a linear regression is not a surprise because a two-way ANOVA, like all ANOVAs, is actually a linear model.

Note that the following code works as well, and gives the same results:

```r
# method 2
mod2 <- lm(body_mass_g ~ sex * species,
  data = dat
)

Anova(mod2)
```

```
## Anova Table (Type II tests)
## 
## Response: body_mass_g
##                Sum Sq  Df F value    Pr(>F)    
## sex          37090262   1 387.460 < 2.2e-16 ***
## species     143401584   2 749.016 < 2.2e-16 ***
## sex:species   1676557   2   8.757 0.0001973 ***
## Residuals    31302628 327                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

Note that the `aov()` function assumes a **balanced design**, meaning that we have equal sample sizes within the levels of our independent grouping variables.

For an **unbalanced design**, that is, unequal numbers of subjects in each subgroup, the recommended methods are:

- the Type-II ANOVA when there is **no** significant interaction, which can be done in R with `Anova(mod, type = "II")`, and
- the Type-III ANOVA when there is a significant interaction, which can be done in R with `Anova(mod, type = "III")`.

This is beyond the scope of this post and we assume a balanced design here. For the interested reader, see this detailed discussion about type I, type II and type III ANOVA.
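As a hedged sketch of what those calls look like in practice (assuming the `{car}` and `{palmerpenguins}` packages are installed), note that Type-III tests are usually combined with sum-to-zero contrasts, otherwise their results are hard to interpret:

```r
# sketch: Type-II and Type-III ANOVA with car::Anova()
library(car)
library(palmerpenguins)

dat <- penguins

# Type-II tests (appropriate when the interaction is not significant)
mod <- aov(body_mass_g ~ sex * species, data = dat)
Anova(mod, type = "II")

# Type-III tests: refit with sum-to-zero contrasts first
mod3 <- aov(body_mass_g ~ sex * species,
  data = dat,
  contrasts = list(sex = "contr.sum", species = "contr.sum")
)
Anova(mod3, type = "III")
```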

From the two main effects being significant, we concluded that:

- controlling for species, body mass is different between females and males, and
- controlling for sex, body mass is different for at least one species.

If body mass is different between the two sexes, given that there are exactly two sexes, it must be because body mass is significantly different between females and males.

If one wants to know which sex has the highest body mass, it can be deduced from the means and/or the boxplots by subgroup. Here, it is clear that males have a significantly higher body mass than females.

However, it is not so straightforward for species. Let me explain why it is not as easy as for the sexes.

There are three species (Adelie, Chinstrap and Gentoo), so there are 3 pairs of species:

- Adelie and Chinstrap
- Adelie and Gentoo
- Chinstrap and Gentoo

If body mass is significantly different for at least one species, it could be that:

- body mass is significantly different between Adelie and Chinstrap, but not significantly different between Adelie and Gentoo, nor between Chinstrap and Gentoo, or
- body mass is significantly different between Adelie and Gentoo, but not significantly different between Adelie and Chinstrap, nor between Chinstrap and Gentoo, or
- body mass is significantly different between Chinstrap and Gentoo, but not significantly different between Adelie and Chinstrap, nor between Adelie and Gentoo.

Or, it could also be that:

- body mass is significantly different between Adelie and Chinstrap, and between Adelie and Gentoo, but not significantly different between Chinstrap and Gentoo, or
- body mass is significantly different between Adelie and Chinstrap, and between Chinstrap and Gentoo, but not significantly different between Adelie and Gentoo, or
- body mass is significantly different between Chinstrap and Gentoo, and between Adelie and Gentoo, but not significantly different between Adelie and Chinstrap.

Last, it could also be that body mass is significantly different between **all** species.

As for a one-way ANOVA, we cannot, at this stage, know precisely which species differs from which in terms of body mass. To find out, we need to compare each pair of species thanks to post-hoc tests (also known as pairwise comparisons).

There are several post-hoc tests, the most common one being the Tukey HSD, which tests all possible pairs of groups. As mentioned earlier, this test only needs to be done on the species variable because there are only two levels for sex.

As for the one-way ANOVA, the Tukey HSD can be done in R as follows:

```r
# method 1
TukeyHSD(mod,
  which = "species"
)
```

```
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = body_mass_g ~ sex * species, data = dat)
## 
## $species
##                        diff        lwr       upr     p adj
## Chinstrap-Adelie   26.92385   -80.0258  133.8735 0.8241288
## Gentoo-Adelie    1377.65816  1287.6926 1467.6237 0.0000000
## Gentoo-Chinstrap 1350.73431  1239.9964 1461.4722 0.0000000
```

or using the `{multcomp}` package:

```r
# method 2
library(multcomp)

summary(glht(
  aov(body_mass_g ~ sex + species,
    data = dat
  ),
  linfct = mcp(species = "Tukey")
))
```

```
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: aov(formula = body_mass_g ~ sex + species, data = dat)
## 
## Linear Hypotheses:
##                         Estimate Std. Error t value Pr(>|t|)    
## Chinstrap - Adelie == 0    26.92      46.48   0.579     0.83    
## Gentoo - Adelie == 0     1377.86      39.10  35.236   <1e-05 ***
## Gentoo - Chinstrap == 0  1350.93      48.13  28.067   <1e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
```

or using the `pairwise.t.test()` function with the p-value adjustment method of your choice:

```r
# method 3
pairwise.t.test(dat$body_mass_g, dat$species,
  p.adjust.method = "BH"
)
```

```
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  dat$body_mass_g and dat$species 
## 
##           Adelie Chinstrap
## Chinstrap 0.63   -        
## Gentoo    <2e-16 <2e-16   
## 
## P value adjustment method: BH
```

Note that when using the second method, it is the model *without* the interaction that needs to be specified in the `glht()` function, even if the interaction is significant. Moreover, do not forget to replace `mod` and `species` in my code with the name of your model and the name of your independent variable.

All methods give the same results, that is:

- body mass is *not* significantly different between Chinstrap and Adelie (adjusted p-value = 0.83),
- body mass is significantly different between Gentoo and Adelie (adjusted p-value < 0.001), and
- body mass is significantly different between Gentoo and Chinstrap (adjusted p-value < 0.001).

Remember that it is the **adjusted** p-values that are reported, to prevent the issue of multiple testing which occurs when comparing several pairs of groups.
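To see what such an adjustment does, here is a minimal sketch with `p.adjust()` on three hypothetical raw p-values (the numbers are made up for illustration):

```r
raw_p <- c(0.63, 0.0004, 0.0071) # hypothetical raw p-values from 3 pairwise tests

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p.adjust(raw_p, method = "bonferroni")

# Benjamini-Hochberg: less conservative, controls the false discovery rate
p.adjust(raw_p, method = "BH")
```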

If you would like to compare all combinations of groups, it can be done with the `TukeyHSD()` function by specifying the interaction in the `which` argument:

```r
# all combinations of sex and species
TukeyHSD(mod,
  which = "sex:species"
)
```

```
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = body_mass_g ~ sex * species, data = dat)
## 
## $`sex:species`
##                                      diff        lwr        upr     p adj
## male:Adelie-female:Adelie        674.6575   527.8486   821.4664 0.0000000
## female:Chinstrap-female:Adelie   158.3703   -25.7874   342.5279 0.1376213
## male:Chinstrap-female:Adelie     570.1350   385.9773   754.2926 0.0000000
## female:Gentoo-female:Adelie     1310.9058  1154.8934  1466.9181 0.0000000
## male:Gentoo-female:Adelie       2116.0004  1962.1408  2269.8601 0.0000000
## female:Chinstrap-male:Adelie    -516.2873  -700.4449  -332.1296 0.0000000
## male:Chinstrap-male:Adelie      -104.5226  -288.6802    79.6351 0.5812048
## female:Gentoo-male:Adelie        636.2482   480.2359   792.2606 0.0000000
## male:Gentoo-male:Adelie         1441.3429  1287.4832  1595.2026 0.0000000
## male:Chinstrap-female:Chinstrap  411.7647   196.6479   626.8815 0.0000012
## female:Gentoo-female:Chinstrap  1152.5355   960.9603  1344.1107 0.0000000
## male:Gentoo-female:Chinstrap    1957.6302  1767.8040  2147.4564 0.0000000
## female:Gentoo-male:Chinstrap     740.7708   549.1956   932.3460 0.0000000
## male:Gentoo-male:Chinstrap      1545.8655  1356.0392  1735.6917 0.0000000
## male:Gentoo-female:Gentoo        805.0947   642.4300   967.7594 0.0000000
```

Or with the `HSD.test()` function from the `{agricolae}` package, which denotes subgroups that are not significantly different from each other with the same letter:

```r
library(agricolae)

HSD.test(mod,
  trt = c("sex", "species"),
  console = TRUE # print results
)
```

```
## 
## Study: mod ~ c("sex", "species")
## 
## HSD Test for body_mass_g 
## 
## Mean Square Error:  95726.69 
## 
## sex:species,  means
## 
##                  body_mass_g      std  r  Min  Max
## female:Adelie       3368.836 269.3801 73 2850 3900
## female:Chinstrap    3527.206 285.3339 34 2700 4150
## female:Gentoo       4679.741 281.5783 58 3950 5200
## male:Adelie         4043.493 346.8116 73 3325 4775
## male:Chinstrap      3938.971 362.1376 34 3250 4800
## male:Gentoo         5484.836 313.1586 61 4750 6300
## 
## Alpha: 0.05 ; DF Error: 327 
## Critical Value of Studentized Range: 4.054126 
## 
## Groups according to probability of means differences and alpha level( 0.05 )
## 
## Treatments with the same letter are not significantly different.
## 
##                  body_mass_g groups
## male:Gentoo         5484.836      a
## female:Gentoo       4679.741      b
## male:Adelie         4043.493      c
## male:Chinstrap      3938.971      c
## female:Chinstrap    3527.206      d
## female:Adelie       3368.836      d
```

If you have many groups to compare, plotting them may be easier to interpret:

```r
# set axis margins so labels don't get cut off
par(mar = c(4.1, 13.5, 4.1, 2.1))

# create confidence interval for each comparison
plot(TukeyHSD(mod, which = "sex:species"),
  las = 2 # rotate x-axis ticks
)
```

From the outputs and plot above, we conclude that all combinations of sex and species are significantly different, except between female Chinstrap and female Adelie (p-value = 0.138) and between male Chinstrap and male Adelie (p-value = 0.581).

These results, which are by the way consistent with the boxplots shown above and will be confirmed with the visualizations below, conclude the two-way ANOVA in R.

If you would like to visualize the results in a different way from what has already been presented in the preliminary analyses, below are some ideas for useful plots.

First, with the mean and standard error of the mean by subgroup, using the `allEffects()` function from the `{effects}` package:

```r
# method 1
library(effects)

plot(allEffects(mod))
```

Or using the `{ggpubr}` package:

```r
# method 2
library(ggpubr)

ggline(subset(dat, !is.na(sex)), # remove NA level for sex
  x = "species",
  y = "body_mass_g",
  color = "sex",
  add = c("mean_se") # add mean and standard error
) +
  labs(y = "Mean of body mass (g)")
```

Alternatively, using `{Rmisc}` and `{ggplot2}`:

```r
library(Rmisc)

# compute mean and standard error of the mean by subgroup
summary_stat <- summarySE(dat,
  measurevar = "body_mass_g",
  groupvars = c("species", "sex")
)

# plot mean and standard error of the mean
ggplot(
  subset(summary_stat, !is.na(sex)), # remove NA level for sex
  aes(x = species, y = body_mass_g, color = sex)
) +
  geom_errorbar(aes(ymin = body_mass_g - se, ymax = body_mass_g + se), # add error bars
    width = 0.1 # width of error bars
  ) +
  geom_point() +
  labs(y = "Mean of body mass (g)")
```

Second, if you prefer to draw only the mean by subgroup:

```r
with(
  dat,
  interaction.plot(species, sex, body_mass_g)
)
```

Last but not least, for those of you who are familiar with GraphPad, you are most likely familiar with plotting means and error bars as follows:

```r
# plot mean and standard error of the mean as barplots
ggplot(
  subset(summary_stat, !is.na(sex)), # remove NA level for sex
  aes(x = species, y = body_mass_g, fill = sex)
) +
  geom_bar(position = position_dodge(), stat = "identity") +
  geom_errorbar(aes(ymin = body_mass_g - se, ymax = body_mass_g + se), # add error bars
    width = 0.25, # width of error bars
    position = position_dodge(.9)
  ) +
  labs(y = "Mean of body mass (g)")
```

In this post, we started with a few reminders of the different tests that exist to compare a quantitative variable across groups. We then focused on the two-way ANOVA, from its goal and hypotheses to its implementation in R, together with the interpretations and some visualizations. We also briefly mentioned its underlying assumptions and a post-hoc test to compare all subgroups.

All this was illustrated with the `penguins` dataset, available from the `{palmerpenguins}` package.

Thanks for reading.

I hope this article will help you conduct a two-way ANOVA with your data.

As always, if you have a question or a suggestion related to the topic covered in this article, please add it as a comment so other readers can benefit from the discussion.