1. The simple regression through the origin model is like a simple linear regression model, but
without the intercept:
$$Y_i = \beta_1 x_i + e_i, \quad i = 1, 2, \ldots, n,$$
with $E(e_i) = 0$, $\mathrm{Var}(e_i) = \sigma^2 > 0$, and $\mathrm{Cov}(e_i, e_j) = 0$ if $i \neq j$.
The ordinary least squares estimate minimizes the residual sum of squares
$$\mathrm{RSS}(\beta_1) = \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2.$$
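To make the definition concrete, the following is a minimal sketch in R that evaluates the residual sum of squares on synthetic data and minimizes it numerically. The data, the seed, the slope 2.5 used to generate them, and the names `rss`, `num_min`, and `fit` are all assumptions made for illustration; none of them come from the assignment. The sketch does not replace the derivation requested in the parts of this problem.

```r
# Synthetic data (an assumption for illustration; not part of the assignment)
set.seed(1)
x <- runif(50, min = 1, max = 10)
y <- 2.5 * x + rnorm(50, sd = 1)

# Residual sum of squares for a candidate slope b1, following the definition
rss <- function(b1) sum((y - b1 * x)^2)

# Numerical minimization of RSS over a bracketing interval
num_min <- optimize(rss, interval = c(-10, 10))

# lm() with the intercept removed fits the same through-the-origin model
fit <- lm(y ~ 0 + x)

num_min$minimum    # numerical minimizer of RSS
coef(fit)[["x"]]   # least squares slope reported by lm()
```

The two printed values should agree up to the tolerance of `optimize`, which gives a way to check a hand-derived formula against software.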
(a) [2 pts] Take the derivative of RSS with respect to $\beta_1$, and set the resulting expression equal to zero.
(This is sometimes called the normal equation.)
(b) [2 pts] Solve the equation of the previous part.
(c) [2 pts] To find your solution in part (b), you made an assumption about the values of
$x_1, x_2, \ldots, x_n$. What is that assumption, and why is it needed?
(d) [2 pts] Show that the expression you found in part (b) really is a minimizer
of $\mathrm{RSS}(\beta_1)$. (Hint: Take the second derivative.)
2. For a constant matrix A and a random vector Z,
$$E(AZ) = A\,E(Z), \qquad \mathrm{Var}(AZ) = A\,\mathrm{Var}(Z)\,A^{T}$$
(assuming expectations and variances all exist).
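These two identities can be checked numerically. The sample mean and sample covariance matrix obey the same rules exactly, so a small sketch in R needs no simulation tolerance beyond floating-point error. The matrix `A`, the mixing matrix used to generate correlated draws, and the seed are hypothetical choices made for illustration only.

```r
set.seed(2)

# Hypothetical constant matrix A (2 x 3)
A <- matrix(c( 1, 0, 2,
              -1, 3, 0.5), nrow = 2, byrow = TRUE)

# 1000 draws of a correlated random 3-vector Z, one draw per row
mix <- matrix(c(1, 0.5, 0,
                0, 1,   0.3,
                0, 0,   2), nrow = 3, byrow = TRUE)
Z <- matrix(rnorm(3000), ncol = 3) %*% mix

# For a draw z stored as a row, the matching row of AZ is (A z)^T = z^T A^T
AZ <- Z %*% t(A)

# Mean identity: sample mean of AZ versus A times the sample mean of Z
all.equal(colMeans(AZ), as.vector(A %*% colMeans(Z)))

# Variance identity: sample covariance of AZ versus A Var(Z) A^T
all.equal(cov(AZ), A %*% cov(Z) %*% t(A))
```

Storing each draw as a row is a convenience for `colMeans` and `cov`; it is why the transformation appears as `Z %*% t(A)` rather than `A %*% Z`.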
Consider the linear model Y = Xβ + e under the Gauss-Markov conditions. For each of the
following random vectors, determine the mean vector and the variance-covariance matrix (in
terms of $X$, $\beta$, and $\sigma^2$). Simplify, if possible.
(a) [2 pts] e
(b) [2 pts] Y
(c) [2 pts] $\hat{\beta}$
(d) [2 pts] $\hat{Y}$ (the random vector for which the realization is the computed vector $\hat{y}$ of
fitted values)
3. The data set ais (in package alr4) provides data on athletes. Use help(ais) for
information about the variables. Fit a regression model with weight (kg) as the response,
and sex, height, sum of skin folds, and percent body fat as predictors.
(a) [2 pts] Present a summary of your fitted model. (Use the R summary function.)
(b) [2 pts] Give the least squares estimates of all coefficients.
(c) [2 pts] What is the name for the proportion of variation in the response explained by
the predictors? What is its value, for the model you fit?
(d) [2 pts] Which observation (case number) has the largest (positive) residual? Also,
what is its fitted value?
(e) [2 pts] Supposing all other predictors are held constant, what would be the difference
in weight (kg) for a male compared to a female, according to this model?
(f) [2 pts] Which predictors are statistically significant at the 5% (0.05) level?
(g) [2 pts] Compute individual 95% confidence intervals for all of the regression
coefficients.
(h) [2 pts] Predict the weight (kg) of a 170 cm tall female with sum of skin folds equal
to 60 and 12% body fat. Also, give a 95% prediction interval.
(i) [2 pts] Fit a model with only sex and height as the predictors, and use an F-test to
compare it with the full model.
4. Using the fuel2001 data set (in package alr4), fit a regression model with FuelC as the
response, and Income, Pop, and Tax as predictors.
(a) [2 pts] Present a summary of your fitted model. (Use the R summary function.)
(b) [2 pts] Using the summary, test the (null) hypothesis that $\beta_{\text{Income}} = 0$.
(c) [2 pts] Using the summary, test the (null) hypothesis that $\beta_{\text{Income}} = \beta_{\text{Pop}} = \beta_{\text{Tax}} = 0$.
(d) [2 pts] Add Drivers as another predictor, and present a summary of your fitted model.
(e) [2 pts] Use an F-test to test whether $\beta_{\text{Drivers}} = 0$.
(f) [2 pts] Compare your results in the previous part with the results of a t-test for
$\beta_{\text{Drivers}} = 0$. Are they the same?
5. [ GRADUATE SECTION ONLY ]
[4 pts] Weisberg (Fourth Edition), Exercise 2.12.
Some reminders:
• Unless otherwise stated, all data sets can be found in either the alr4 package or the
faraway package in R.
• Unless otherwise stated, use a 5% level (α = 0.05) in all tests.
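
The R functions that problems 3 and 4 call for can be sketched on a different data set. The example below uses the built-in `mtcars` data purely as a stand-in, so that it runs without installing any package; the chosen variables, the model formulas, and the hypothetical case in `new_case` are assumptions for illustration and are unrelated to the `ais` or `fuel2001` data.

```r
# Built-in mtcars data, used only as a stand-in for the assignment's data sets
cars <- transform(mtcars, am = factor(am, labels = c("automatic", "manual")))

full    <- lm(mpg ~ am + wt + hp, data = cars)   # larger model
reduced <- lm(mpg ~ am + wt, data = cars)        # nested smaller model

summary(full)                 # coefficient table and overall fit statistics
coef(full)                    # least squares estimates
confint(full, level = 0.95)   # individual 95% confidence intervals

# Case with the largest positive residual, and its fitted value
worst <- which.max(residuals(full))
fitted(full)[worst]

# Prediction with a 95% prediction interval for a hypothetical new case
new_case <- data.frame(am = factor("manual", levels = levels(cars$am)),
                       wt = 3, hp = 120)
predict(full, newdata = new_case, interval = "prediction", level = 0.95)

# F-test comparing the nested models
anova(reduced, full)
```

Building the factor level in `new_case` from `levels(cars$am)` keeps the new data consistent with the coding used when the model was fit.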