High-order optimization methods exploit higher-order derivative information of the objective function and constitute an emerging research direction in optimization. Compared with first-order methods, high-order methods enjoy lower iteration complexity but require solving a harder subproblem at each iteration. This paper surveys three high-order methods: the accelerated tensor method for convex problems, the optimal tensor method built on the A-HPE framework, and the ARp method for nonconvex problems. We also discuss how the subproblems arising in these methods can be solved. We hope that this summary of recent advances in high-order methods will draw more attention from interested readers to this research topic.