Quarterly (spring, summer, fall, winter)
176 pp. per issue
7 x 10
ISSN
1063-6560
E-ISSN
1530-9304
2014 Impact factor:
2.37

Evolutionary Computation

Spring 2017, Vol. 25, No. 1, Pages 143-171
(doi: 10.1162/EVCO_a_00168)
© 2017 Massachusetts Institute of Technology
LM-CMA: An Alternative to L-BFGS for Large-Scale Black Box Optimization
Article PDF (1.67 MB)
Abstract

Limited-memory BFGS (L-BFGS; Liu and Nocedal, 1989) is often considered to be the method of choice for continuous optimization when first- or second-order information is available. However, the use of L-BFGS can be complicated in a black box scenario where gradient information is not available and therefore should be numerically estimated. The accuracy of this estimation, obtained by finite difference methods, is often problem-dependent and may lead to premature convergence of the algorithm.

This article demonstrates an alternative to L-BFGS, the limited memory covariance matrix adaptation evolution strategy (LM-CMA) proposed by Loshchilov (2014). LM-CMA is a stochastic derivative-free algorithm for numerical optimization of nonlinear, nonconvex optimization problems. Inspired by L-BFGS, LM-CMA samples candidate solutions according to a covariance matrix reproduced from m direction vectors selected during the optimization process. The decomposition of the covariance matrix into Cholesky factors allows reducing the memory complexity to , where n is the number of decision variables. The time complexity of sampling one candidate solution is also but scales as only about 25 scalar-vector multiplications in practice. The algorithm has an important property of invariance with respect to strictly increasing transformations of the objective function; such transformations do not compromise its ability to approach the optimum. LM-CMA outperforms the original CMA-ES and its large-scale versions on nonseparable ill-conditioned problems with a factor increasing with problem dimension. Invariance properties of the algorithm do not prevent it from demonstrating a comparable performance to L-BFGS on nontrivial large-scale smooth and nonsmooth optimization problems.