Wildlife researchers are monitoring a Florida alligator population by
taking aerial photographs and attempting to estimate the weights of the
gators from their lengths in the images. The data set Gators.csv
contains the variables Length and Weight for a sample of alligators
that have been captured and studied.
We are going to fit a simple linear regression model and then assess the model's fit and interpret it. Click this link to view the worksheet for this gator data analysis problem.
gators <- read.csv("https://raw.githubusercontent.com/dr-suz/Stat11/main/Data/Gators.csv")
After importing your data, you can fit a simple linear regression model by using the following code.
slr <- lm(<response> ~ <predictor>)
You can then see all the details of your linear model by using the summary function.
summary(slr)
Finally, to create a plot of your data with the linear regression line overlaid, use the following code.
plot(<predictor>, <response>)
abline(slr)
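As a rough sketch of how the fit, summary, and plot steps fit together, the example below runs them on a small made-up data frame. The name toy and its values are hypothetical stand-ins, not the real Gators.csv measurements; the column names Length and Weight match the data set description, and Weight is treated as the response because the researchers want to estimate weight from length.

```r
# Made-up stand-in values for illustration; these are NOT the Gators.csv data.
toy <- data.frame(
  Length = c(60, 70, 80, 90, 100, 110, 120),
  Weight = c(20, 33, 41, 58, 70, 95, 130)
)

# Weight is the response (left of ~); Length is the predictor (right of ~).
toy_fit <- lm(Weight ~ Length, data = toy)

# Pull the fitted intercept and slope out of the model object.
b <- coef(toy_fit)
intercept <- b[["(Intercept)"]]
slope <- b[["Length"]]

# R-squared as reported by summary().
r_squared <- summary(toy_fit)$r.squared

# Scatterplot with the predictor on the x-axis, then overlay the fitted line.
plot(toy$Length, toy$Weight, xlab = "Length", ylab = "Weight")
abline(toy_fit)
```

With the real data, the same pattern should apply after replacing toy with gators.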
It’s important to consider the roles of the explanatory variable and the response variable.
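One way to see why the roles matter is to fit the model both ways and compare the slopes. The sketch below uses a small made-up data frame (toy and its values are hypothetical, not the Gators.csv data). Swapping the roles gives a different line, and the second slope is not simply the reciprocal of the first; instead, the product of the two slopes equals the squared correlation.

```r
# Made-up stand-in values for illustration; these are NOT the Gators.csv data.
toy <- data.frame(
  Length = c(60, 70, 80, 90, 100, 110, 120),
  Weight = c(20, 33, 41, 58, 70, 95, 130)
)

# Predict Weight from Length (Weight is the response).
weight_on_length <- lm(Weight ~ Length, data = toy)
slope_wl <- coef(weight_on_length)[["Length"]]

# Swap the roles: predict Length from Weight (Length is the response).
length_on_weight <- lm(Length ~ Weight, data = toy)
slope_lw <- coef(length_on_weight)[["Weight"]]

# Correlation does not depend on which variable is called the response.
r <- cor(toy$Length, toy$Weight)

# Compare: the swapped slope vs. the naive reciprocal of the first slope.
c(swapped = slope_lw, reciprocal = 1 / slope_wl)

# The product of the two slopes equals r squared.
c(product = slope_wl * slope_lw, r_squared = r^2)
```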
Just because R (or Excel) fits a line and gives us a slope and intercept doesn’t mean that the model is appropriate or informative. We must consider the assumptions for an SLR model:
the relationship between predictor and response is linear enough
there is no thickening or thinning of the scatterplot when read from left to right
there are no outliers
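A residual plot is one common way to check these conditions by eye. The sketch below is an illustration on made-up values (toy is a hypothetical stand-in, not the Gators.csv data, and the values were chosen to bend upward on purpose). Curvature in the residuals suggests the relationship is not linear enough, a funnel shape suggests thickening or thinning, and isolated points far from zero suggest outliers.

```r
# Made-up stand-in values for illustration; these are NOT the Gators.csv data.
toy <- data.frame(
  Length = c(60, 70, 80, 90, 100, 110, 120),
  Weight = c(20, 33, 41, 58, 70, 95, 130)
)

toy_fit <- lm(Weight ~ Length, data = toy)

# Residual = observed response minus the value predicted by the line.
res <- resid(toy_fit)
fit <- fitted(toy_fit)

# Plot residuals against fitted values with a reference line at zero.
# Ideally the points scatter evenly around the dashed line with no pattern.
plot(fit, res, xlab = "Fitted Weight", ylab = "Residual")
abline(h = 0, lty = 2)
```

In this toy example the residuals are positive at both ends and negative in the middle, which is the kind of curved pattern that signals a straight line may not be appropriate.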
Note, we have not yet had time in class to discuss the phenomenon of regression to the mean or the issues posed by lurking variables. You are still responsible for understanding this material. (These might be good topics for us to cover during the review days before Quiz 1.)