Abstract
Variable selection is fundamental to high-dimensional statistical modeling.
Many variable selection techniques may be implemented by maximum
penalized likelihood using various penalty functions. Optimizing the
penalized likelihood function is often challenging because it may be nondifferentiable
and/or nonconcave. This article proposes a new class of algorithms
for finding a maximizer of the penalized likelihood for a broad class of
penalty functions. These algorithms operate by perturbing the penalty function
slightly to render it differentiable, then optimizing this differentiable
function using a minorize–maximize (MM) algorithm. MM algorithms are
useful extensions of the well-known class of EM algorithms, a fact that allows
us to analyze the local and global convergence of the proposed algorithm
using some of the techniques employed for EM algorithms. In particular, we
prove that when our MM algorithms converge, they must converge to a desirable
point; we also discuss conditions under which this convergence may be
guaranteed. We exploit the Newton–Raphson-like aspect of these algorithms
to propose a sandwich estimator for the standard errors of the estimators. Our
method performs well in numerical tests.
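To make the minorize–maximize step concrete, the following is a minimal sketch of how a quadratic minorizer typically arises for a penalty $p_\lambda$ that is concave and nondecreasing on $[0,\infty)$; the notation ($\ell$ for the log-likelihood, $\theta^{(k)}$ for the current iterate) is generic, and the display shows the standard local quadratic bound rather than the article's exact perturbed construction:
\[
p_\lambda\bigl(|\theta_j|\bigr)\;\le\;
p_\lambda\bigl(|\theta_j^{(k)}|\bigr)
+ \frac{p_\lambda'\bigl(|\theta_j^{(k)}|\bigr)}{2\,|\theta_j^{(k)}|}\,
\bigl(\theta_j^{2} - \theta_j^{(k)\,2}\bigr),
\]
so that
\[
Q\bigl(\theta \mid \theta^{(k)}\bigr)
\;=\; \ell(\theta) \;-\; \sum_j
\Bigl[\, p_\lambda\bigl(|\theta_j^{(k)}|\bigr)
+ \frac{p_\lambda'\bigl(|\theta_j^{(k)}|\bigr)}{2\,|\theta_j^{(k)}|}\,
\bigl(\theta_j^{2} - \theta_j^{(k)\,2}\bigr)\Bigr]
\]
minorizes the penalized log-likelihood and touches it at $\theta^{(k)}$, and the next iterate is $\theta^{(k+1)} = \arg\max_\theta Q(\theta \mid \theta^{(k)})$. Roughly speaking, perturbing the penalty as the abstract describes has the effect of replacing the denominator $|\theta_j^{(k)}|$ with $\varepsilon + |\theta_j^{(k)}|$ for a small $\varepsilon > 0$, keeping the update well defined when components of $\theta^{(k)}$ are near zero; the precise construction and its convergence properties are developed in the article.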