A regression-like situation with count data as response variable.

rm(list=ls())
# install.packages("RCurl")
library(RCurl) # allows accessing data from URL
dd <- read.csv(text=getURL("https://raw.githubusercontent.com/R4All/datasets/master/SoaySheepFitness.csv"))
head(dd)
plot(fitness ~ body.size, data=dd,
main="Weigth at birth and fitness in sheep",
xlab="Weight of the mother at birth (pounds)",
ylab="Number of lambs over lifetime")

lm1 <- lm(fitness ~ body.size, data=dd)
summary(lm1)
# check the model assumptions (graphically):
par(mfrow=c(3,2))
plot(lm1)
hist(lm1$resid)
# data are non-normal and non-linear.

glm1 <- glm(fitness ~ body.size, data=dd, family="poisson"(link = "log"))
summary(glm1)

The function glm() estimates both the model parameters and their Std. Errors on the link (log) scale. The statistical tests are run assuming that thew residuals follow not a Normal, but a Poisson distribution.

# compute SE using predict()
preddata <- data.frame(body.size = seq(from=min(dd$body.size),
to=max(dd$body.size),
by=0.01
))
preds <- predict(glm1, newdata=preddata, type="response", se.fit=T)
# type="link" would give predictions on the link-scale
fit <- preds$fit

# limits of 95% CI:
lwr <- fit - 1.96*preds$se.fit
upr <- fit + 1.96*preds$se.fit

To plot mean values with CI:

plot(fitness ~ body.size, data=dd,
main="Weigth at birth and fitness in sheep",
xlab="Weight of the mother at birth (pounds)",
ylab="Number of lambs over lifetime")
lines(preddata$body.size, preds$fit)
lines(preddata$body.size, lwr, col="red")
lines(preddata$body.size, upr, col="red")

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s