What Actuaries Should Know About Nonparametric Regression With Missing Data
By Sam Efromovich
Nonparametric regression predicts one variable, called the response, from another variable, called the predictor, without any assumption about the relationship between the two random variables. The traditional data used in nonparametric regression are a sample from the two variables; that is, a matrix with two complete columns. In practical applications some observations in that matrix may be missing, and what can be done in this case is the subject of this paper. Three possible scenarios are considered. First, if the probability of missing an observation depends on its own value, then no consistent estimation is possible. Second, if all predictors are available and the probability of missing the response depends on the value of the predictor, then a nonparametric regression based on complete cases is optimal. Third, if all responses are available and the probability of missing the predictor depends on the value of the response, then a special estimation procedure based on all available observations is optimal. The results are illustrated via examples, and possible extensions are discussed.
Keywords: Adaptation, nonparametric estimation, prediction, regression, probability density
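
As a rough illustration of the second scenario, the sketch below simulates data in which every predictor is observed and the response is missing with a probability that depends only on the predictor, then fits a regression to the complete cases. Everything in it is an assumption made for illustration and is not taken from the paper: the regression function sin(2*pi*x), the noise level, the availability probability 0.3 + 0.6x, the hand-picked bandwidth, and the use of a Nadaraya-Watson kernel estimator in place of whatever estimator the paper studies. The point is only that, because the missingness does not depend on the response once the predictor is given, the complete cases still carry the same conditional mean of the response.

```python
import numpy as np


def nw_regression(x_obs, y_obs, x_grid, bandwidth):
    """Nadaraya-Watson kernel regression estimate with a Gaussian kernel."""
    # Scaled distances between every grid point and every observed predictor.
    u = (x_grid[:, None] - x_obs[None, :]) / bandwidth
    weights = np.exp(-0.5 * u**2)
    # Locally weighted average of the observed responses.
    return (weights @ y_obs) / weights.sum(axis=1)


def simulate(n=5000, seed=0):
    """Simulate scenario two: responses missing with probability depending on x."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)
    # Hypothetical availability probability: depends on the predictor only.
    p_available = 0.3 + 0.6 * x
    available = rng.uniform(size=n) < p_available
    return x, y, available


x, y, available = simulate()

# Evaluate away from the boundary to keep edge effects out of the comparison.
x_grid = np.linspace(0.1, 0.9, 9)

# Complete-case fit: use only the pairs whose response was observed.
m_hat = nw_regression(x[available], y[available], x_grid, bandwidth=0.05)

# Compare with the (normally unknown) true regression function.
max_err = np.max(np.abs(m_hat - np.sin(2 * np.pi * x_grid)))
print(f"complete cases used: {available.sum()} of {len(x)}")
print(f"max abs error on grid: {max_err:.3f}")
```

The bandwidth here is fixed by hand; a data-driven choice would be needed in practice. The same complete-case recipe should not be expected to work in the first scenario, where the availability of a response depends on the response itself.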