Getting Started:
Analyzing Data with Missing Values

 

Department of Statistics

The Methodology Center

Penn State

   

Data from surveys, experiments and observational studies typically contain missing values.  Because most analysis procedures were not designed to handle incomplete data, researchers often resort to editing procedures (deleting incomplete cases, replacing the missing values with sample means, etc.) to lend an appearance of completeness. These ad hoc edits may do more harm than good, producing answers that are biased, inefficient or unreliable. Fortunately, many principled techniques for missing values are available.  Through this website, we hope to provide a general introduction to missing data and the statistical tools used to handle them.
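As a small illustration of why one of these ad hoc edits can mislead, the sketch below applies mean replacement to simulated data. The data, the 30% missingness rate, and all variable names are assumptions made up for this example and are not taken from any real study. Only the Python standard library is used.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical complete data: 1,000 draws from a normal distribution.
complete = [random.gauss(50.0, 10.0) for _ in range(1000)]

# Delete roughly 30% of the values completely at random (None marks "missing").
observed = [x if random.random() > 0.3 else None for x in complete]

# Ad hoc edit: replace every missing value with the mean of the observed values.
present = [x for x in observed if x is not None]
fill = statistics.fmean(present)
imputed = [fill if x is None else x for x in observed]

# Compare sample variances before and after the edit.
var_present = statistics.variance(present)
var_imputed = statistics.variance(imputed)

print(f"observed cases: n={len(present)}, variance={var_present:.2f}")
print(f"mean-imputed:   n={len(imputed)}, variance={var_imputed:.2f}")
```

Because every filled-in value sits exactly at the observed mean, the edited data set has the same sum of squared deviations spread over more cases, so its sample variance is smaller than that of the observed cases. The data look complete, but standard errors computed from them would be too small.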

If you need more information, a comprehensive treatment of statistical analysis with missing data is given by Little and Rubin (2002).  A shorter and gentler book by Allison (2002) is written primarily for social scientists.  Another good starting point is the overview article by Schafer and Graham (2002), from which some of the material in these pages has been drawn.


Comments?

Send us an email if you have a comment or question about the content of this website.  Unfortunately, we cannot provide consulting or answer questions about individual projects or analyses.


This website is hosted by The Methodology Center in the College of Health and Human Development, The Pennsylvania State University.

Support for this project is provided by a grant from the National Institute on Drug Abuse, 2-P50-DA10075.

Revised: 01/12/06.