STAT 501: Regression Methods
Course Overview
This graduate-level course offers an introduction to regression analysis. Researchers often use sample data to investigate relationships among variables, with the ultimate goal of building a model that predicts a future value of some dependent variable. Regression analysis is the process of finding the mathematical model that best fits the data.
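As a rough illustration of what "finding the model that best fits the data" can mean in the simplest case, the sketch below fits a straight line by least squares and uses it to predict a new value. The function names and the small data set are made up for illustration and are not taken from the course materials.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least squares line for paired data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x values must not all be identical")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted value of the dependent variable at x_new."""
    return intercept + slope * x_new


# Hypothetical sample data, invented for this example
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_linear_regression(x, y)
prediction = predict(intercept, slope, 6.0)
print(f"fitted line: y = {intercept:.3f} + {slope:.3f} x")
print(f"predicted y at x = 6: {prediction:.3f}")
```

In practice a statistical package would also report standard errors, intervals, and diagnostics, which are the subject of the topics listed below.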
Topics Usually Covered in STAT 501
1. Simple Linear Regression Model: One predictor variable
- Model for E(Y), model for distribution of errors
- Least squares estimation of model for E(Y)
- Estimation of variance
2. Inferences for Simple Linear Model
- Inferences concerning the slope (confidence intervals and t-test)
- Confidence interval estimate of the mean Y at a specific X
- Prediction interval for a new Y
- Analysis of Variance partitioning of variation in Y
- R-squared, calculation and interpretation
3. Diagnostic procedures for aptness of model
- Residual analyses
- Plots of residuals versus fits, residuals versus x, residuals versus new x
- Tests for normality of residuals
- Lack of Fit test, Pure Error, Lack of Fit concepts
- Transformations as a solution to problems with the model
- Weighted Least Squares as a solution for variance problems
4. Matrix Notation and Literacy for Regression Models
- X matrix, beta vector, matrix formula for estimating coefficients
- Linear dependence issues
- Variance-covariance matrix of sample coefficients
5. Multiple Regression Models and Estimation: Multiple predictor variables
- Basic estimation and statistical inference within multiple regression
- Interaction terms and the interpretation of interaction
- Polynomial regression models
6. General Linear F test for testing hypotheses
- Reduced and Full models associated with hypotheses about the model’s coefficients
- F test for general linear hypotheses
7. Assessing and interpreting the effect of a single predictor variable within a multiple regression
- Properly interpreting the t-test
- Sequential Sums of Squares
- Partial correlation between y and an x-variable
8. Examining All Possible Regressions to Identify the Potential Models
- R-squared, MSE, Cp, AIC, BIC, and PRESS criteria
- Stepwise algorithms for identifying models
9. Problems Caused by Correlations (confounding) among Predictor Variables
- Inflation effects on standard deviations of coefficients
- Problems in interpreting effects of individual variables
- Apparent conflicts between overall F test and individual variable t tests
- Benefits of designed experiments
10. Incorporating Categorical Predictor Variables
- Indicator Variables
- Interpretation of models containing indicator variables
- Piecewise regression
11. More Diagnostic Measures and Remedial Measures for Lack of Fit
- Variance Inflation Factors
- Ridge Regression
- Deleted Residuals
- Influence statistics: hat matrix, Cook's D, and related measures
12. Logistic Regression Models for a Binary Response variable
13. Time Series Issues: Autocorrelation in errors and autoregressive time series models