Abstract
A class of variable selection procedures for parametric models via
nonconcave penalized likelihood was proposed in Fan and Li (2001a), where it
was shown that the resulting procedures perform as well as if
the subset of significant variables were known in advance. Such a property
is called an oracle property. The proposed procedures were illustrated in
the context of linear regression, robust linear regression and generalized
linear models. In this paper, the nonconcave penalized likelihood approach
is extended further to the Cox proportional hazards model and the Cox
proportional hazards frailty model, two commonly used semi-parametric
models in survival analysis. This yields new variable selection procedures
for both models. It is demonstrated how the
rates of convergence depend on the regularization parameter in the penalty
function. Further, with a proper choice of the regularization parameter and
the penalty function, the proposed estimators possess an oracle property.
Standard error formulae are derived and their accuracy is tested
empirically. Simulation studies show that the proposed procedures are more
stable in prediction and computationally more efficient than best subset
variable selection, while reducing model complexity just as effectively.
Compared with the LASSO, the penalized likelihood method with the L1-penalty
proposed by Tibshirani, the newly proposed approaches have better theoretical
properties and finite-sample performance.
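
To make the contrast with the L1-penalty concrete, the following minimal sketch compares it with the SCAD penalty, the nonconcave penalty introduced in Fan and Li (2001a). The abstract does not name a specific penalty, so the use of SCAD here is an assumption made for illustration, as are the function names l1_penalty and scad_penalty; the default a = 3.7 is the value suggested in Fan and Li (2001a). The sketch only evaluates the penalty functions and does not implement the proposed estimation procedures.

```python
def l1_penalty(theta, lam):
    """L1 (LASSO) penalty: grows linearly in |theta| without bound."""
    return lam * abs(theta)


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty (illustrative choice; requires a > 2 and lam > 0).

    Linear like the L1-penalty near zero, quadratic on (lam, a*lam],
    and constant beyond a*lam, so large coefficients are not penalized
    more heavily than moderately large ones.
    """
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return -(t * t - 2.0 * a * lam * t + lam * lam) / (2.0 * (a - 1.0))
    return (a + 1.0) * lam * lam / 2.0


# Hypothetical usage: tabulate both penalties for a few coefficient sizes.
lam = 1.0
for theta in (0.5, 2.0, 5.0, 10.0):
    print(theta, l1_penalty(theta, lam), scad_penalty(theta, lam))
```

For coefficients larger than a times the regularization parameter the SCAD penalty is flat while the L1-penalty keeps growing, which is one informal way to see why the L1-penalty shrinks large coefficients more heavily.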