Week 14
Nov 17, 2025
In this lab we will practice:
Textbook Reference: JA Chapter 17
Think about:
- What does the slope represent in a regression line?
- Does correlation imply causation?
- Why do we square residuals in OLS?
x and y?
geom_smooth(method="lm") and confirm visually.
Compute slope and intercept manually using formulas:
\[ \hat{\beta} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}. \]
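The two formulas translate almost line for line into R. The sketch below is a minimal example under stated assumptions: the data frame name `simdata` follows the `lm()` call used in this lab, but the seed, the sample size, the distribution of `x`, the error standard deviation, and the data-generating process \(y = 5 + 2x + u\) are illustrative choices, not necessarily the ones behind the lab's output.

```r
# Simulate data from y = 5 + 2x + u
# (seed, n, distributions, and the DGP itself are assumptions for illustration)
set.seed(123)
n <- 100
x <- runif(n, min = 0, max = 10)
u <- rnorm(n, mean = 0, sd = 2)
simdata <- data.frame(x = x, y = 5 + 2 * x + u)

# Sample means
x_bar <- mean(simdata$x)
y_bar <- mean(simdata$y)

# Slope: sum of cross-deviations divided by sum of squared x-deviations
beta_hat <- sum((simdata$x - x_bar) * (simdata$y - y_bar)) /
  sum((simdata$x - x_bar)^2)

# Intercept: forces the fitted line through the point of means (x_bar, y_bar)
alpha_hat <- y_bar - beta_hat * x_bar

c(intercept = alpha_hat, slope = beta_hat)
```

These two numbers should agree with `coef(lm(y ~ x, data = simdata))` up to floating-point error, since `lm()` solves the same least-squares problem.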
Compare with R’s built-in estimator:
Call:
lm(formula = y ~ x, data = simdata)
Residuals:
    Min      1Q  Median      3Q     Max
-4.4759 -1.2265 -0.0395  1.1927  4.4345
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.98208    0.39211   12.71   <2e-16 ***
x            1.98203    0.06836   28.99   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.939 on 98 degrees of freedom
Multiple R-squared: 0.8956, Adjusted R-squared: 0.8945
F-statistic: 840.6 on 1 and 98 DF, p-value: < 2.2e-16
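To see how the columns of this output relate to each other, the following sketch recomputes the t values and builds 95% confidence intervals from the reported estimates and standard errors. The numbers are copied from the output above. The variable names are hypothetical, and the 95% level is a conventional choice rather than something the lab specifies.

```r
# Estimates and standard errors copied from the regression output
est <- c(intercept = 4.98208, slope = 1.98203)
se  <- c(intercept = 0.39211, slope = 0.06836)
df  <- 98  # residual degrees of freedom reported by summary()

# t value = estimate / standard error
t_val <- est / se

# 95% confidence interval: estimate +/- critical t value * standard error
t_crit <- qt(0.975, df)
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)

round(t_val, 2)
round(ci, 3)
```

If the data were generated with intercept 5 and slope 2 (an assumption, suggested by the simulation exercise later in the lab), both intervals should cover those values.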
Question: How does education relate to weekly earnings?
Call:
lm(formula = earnwk ~ educ, data = cps)
Residuals:
    Min      1Q  Median      3Q     Max
-1272.1  -417.3  -157.4   229.1  7282.6
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -330.753     72.713  -4.549 5.63e-06 ***
educ         101.550      5.575  18.217  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 709.8 on 2807 degrees of freedom
(1204 observations deleted due to missingness)
Multiple R-squared: 0.1057, Adjusted R-squared: 0.1054
F-statistic: 331.9 on 1 and 2807 DF, p-value: < 2.2e-16
geom_segment().
Use the fitted model to predict average earnings for 12, 14, and 16 years of education.
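A prediction from a simple regression is the fitted line evaluated at the chosen value of `educ`. Because the `cps` data are not reproduced here, the sketch below plugs in the rounded coefficients from the `earnwk ~ educ` output instead of refitting the model. The object name `fit` in the closing comment is hypothetical.

```r
# Coefficients as reported in the earnwk ~ educ output (rounded)
alpha_hat <- -330.753
beta_hat  <- 101.550

educ_years <- c(12, 14, 16)

# Predicted average weekly earnings: alpha_hat + beta_hat * educ
pred <- alpha_hat + beta_hat * educ_years
data.frame(educ = educ_years, predicted_earnwk = pred)

# With the fitted model stored in an object (called `fit` here, a hypothetical
# name), the equivalent call would be:
# predict(fit, newdata = data.frame(educ = educ_years))
```

Each additional two years of education raises the predicted value by twice the slope, about 203 per week in the units of `earnwk`.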
Simulate a new dataset where \(Y = 5 + 2X + U\) but \(U\) is correlated with \(X\) (e.g., U <- 0.5*X + rnorm(n)).
Estimate the regression again and compare the slope.
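One possible way to set up this exercise is sketched below. The seed, the sample size, the standard normal distribution for `X`, and the object names `endog` and `fit_endog` are all assumptions made for illustration. Only the equations \(Y = 5 + 2X + U\) and `U <- 0.5*X + rnorm(n)` come from the exercise itself.

```r
set.seed(2025)           # arbitrary seed, for reproducibility only
n <- 10000               # large n so sampling noise is small (assumption)

X <- rnorm(n)            # distribution of X is an assumption
U <- 0.5 * X + rnorm(n)  # error term correlated with X, as in the exercise
Y <- 5 + 2 * X + U

endog <- data.frame(X = X, Y = Y)

# OLS ignores the correlation between X and U
fit_endog <- lm(Y ~ X, data = endog)
coef(fit_endog)
```

Substituting `U` into the equation gives \(Y = 5 + 2.5X + \text{noise}\), so the estimated slope should land close to 2.5 rather than 2.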
Question: Does the estimated slope still recover the true value 2? Why not?
Under what condition can we interpret the slope \(\hat{\beta}\) as a causal effect?
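One way to make the condition concrete is to substitute \(y_i = \alpha + \beta x_i + u_i\) into the slope formula. The following is a standard textbook derivation sketched in the lab's notation, where \(u_i\) denotes the unobserved error term:

\[ \hat{\beta} = \beta + \frac{\sum_i (x_i - \bar{x})(u_i - \bar{u})}{\sum_i (x_i - \bar{x})^2} \;\xrightarrow{p}\; \beta + \frac{\operatorname{Cov}(X, U)}{\operatorname{Var}(X)}. \]

The second term vanishes only when \(\operatorname{Cov}(X, U) = 0\), that is, when the error is uncorrelated with the regressor. For example, with \(U = 0.5X + \varepsilon\) and \(\varepsilon\) independent of \(X\), the ratio \(\operatorname{Cov}(X, U)/\operatorname{Var}(X)\) equals 0.5, so a true slope of 2 is estimated as roughly 2.5.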
Submit the rendered PDF or HTML report on Canvas as a group.
Be sure to include your plots, coefficient outputs, and short written interpretations.
ECON2250 Statistics for Economics – Fall 2025 – Maghfira Ramadhani