# R from zero to hero
# Marco Plebani 18 May 2018
Let’s perform an ANOVA “by hand” (with a little help from R)
First, enter the data. You could enter them in Excel, save them as a tab-delimited file and then open them in R using
In this case it’s a handful of data so I’m doing it by hand as follows:
# group 1
g1 <- c(6, 7,7.2,8,9)
g2 <- c(1.5,2.5,2.6,5,5.5)
g3 <- c(1, 1.2, 2.3, 4, 5)
all.measures.of.x <- c(g1,g2,g3)
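For a larger data set, reading from a file is more practical than typing. The sketch below is only an illustration: it assumes a tab-delimited file with a header row, and the temporary file and the column names `group` and `x` are made up for the example. `read.delim()` is base R's reader for tab-delimited text.

```r
# Hypothetical example: write the same 15 values to a temporary
# tab-delimited file, then read them back in.
example.data <- data.frame(
  group = rep(c("g1", "g2", "g3"), each = 5),
  x = c(6, 7, 7.2, 8, 9,
        1.5, 2.5, 2.6, 5, 5.5,
        1, 1.2, 2.3, 4, 5)
)
tmp.file <- tempfile(fileext = ".txt")
write.table(example.data, tmp.file, sep = "\t",
            row.names = FALSE, quote = FALSE)

# read.delim() expects tab separators and a header row by default
from.file <- read.delim(tmp.file)

# split() turns the long table back into one numeric vector per group
by.group <- split(from.file$x, from.file$group)
by.group$g1
```

With your own data you would replace `tmp.file` with the path to the file saved from Excel.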
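# The F statistic is MS_between / MS_within, where
# MS_between = SST/(k-1)   (SST: treatment, i.e. between-group, sum of squares)
# MS_within = SSE/(N-k)    (SSE: error, i.e. within-group, sum of squares)
# with k = 3 groups and N = 15 observations in total.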
# Let's calculate all the pieces we need.
# N-k = 12
# k-1 = 2
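The degrees of freedom can also be computed from the data instead of being typed in, which helps if the groups change later. A small sketch; `groups`, `k`, `N`, `df.between` and `df.within` are names introduced here for illustration only:

```r
g1 <- c(6, 7, 7.2, 8, 9)
g2 <- c(1.5, 2.5, 2.6, 5, 5.5)
g3 <- c(1, 1.2, 2.3, 4, 5)

groups <- list(g1, g2, g3)
k <- length(groups)         # number of groups
N <- sum(lengths(groups))   # total number of observations

df.between <- k - 1   # numerator degrees of freedom
df.within <- N - k    # denominator degrees of freedom
c(df.between, df.within)
```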
# calculate overall mean and the group means:
grand.mean <- sum(g1,g2,g3)/15
mean.g1 <- mean(g1)
mean.g2 <- mean(g2)
mean.g3 <- mean(g3)
# To calculate SSE (the within-group sum of squares):
sse <- sum((g1 - mean.g1)^2) +
sum((g2 - mean.g2)^2) +
sum((g3 - mean.g3)^2)
# to calculate SST:
sst0 <- length(g1)*(mean.g1 - grand.mean)^2 +
length(g2)*(mean.g2 - grand.mean)^2 +
length(g3)*(mean.g3 - grand.mean)^2
# one can also obtain SST indirectly by remembering that SStotal = SST + SSE,
# so SST = SStotal - SSE
ss_total <- sum((all.measures.of.x - grand.mean)^2)
sst <- ss_total - sse
all.equal(sst0, sst) # compares with a small tolerance; == can return FALSE because of floating-point rounding
total.variance <- ss_total/(15-1) # total variance: total sum of squares over N-1 degrees of freedom
Fvalue <- (sst/(3-1)) / (sse/(15-3))
# we get an F value of about 13.32
# Is it significantly higher than 1?
# let's check the critical value of F for df=2,12 and p=0.05
# (namely the 95th quantile of the F distribution)
# we can either look it up in a table, or ask R!
qf(p=0.95, df1=2, df2=12) # qf() is a function that gives you the quantiles of the F distribution - basically it's an F-value calculator. Like a table of F values, just fancier.
So, is at least one of the groups significantly different from the others?
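One way to answer the question is to compare the observed F with the critical value, compute the p-value directly, and cross-check everything against R's built-in `aov()`. The following is a sketch under the same assumptions as the hand calculation (3 groups of 5 observations each); `x`, `group`, `F.crit`, `p.value` and `fit` are names introduced here for illustration.

```r
g1 <- c(6, 7, 7.2, 8, 9)
g2 <- c(1.5, 2.5, 2.6, 5, 5.5)
g3 <- c(1, 1.2, 2.3, 4, 5)
x <- c(g1, g2, g3)
group <- factor(rep(c("g1", "g2", "g3"), each = 5))

# Same quantities as in the hand calculation
sse <- sum((g1 - mean(g1))^2) +
  sum((g2 - mean(g2))^2) +
  sum((g3 - mean(g3))^2)
sst <- sum((x - mean(x))^2) - sse
Fvalue <- (sst / (3 - 1)) / (sse / (15 - 3))

# Decision at the 5% level: is the observed F beyond the critical value?
F.crit <- qf(p = 0.95, df1 = 2, df2 = 12)
Fvalue > F.crit

# p-value: probability of an F at least this large under the null hypothesis
p.value <- pf(Fvalue, df1 = 2, df2 = 12, lower.tail = FALSE)
p.value

# Cross-check with the built-in one-way ANOVA
fit <- aov(x ~ group)
summary(fit)
```

If `Fvalue > F.crit` returns `TRUE` (equivalently, if `p.value` is below 0.05), the null hypothesis that all group means are equal is rejected. Note that this says only that at least one group differs, not which one.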