From numerical optimization method to learning optimization method

  • School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China

Received date: 2019-11-13

Online published: 2019-12-04

Abstract

Traditional optimization methods based on gradient information are mainly numerical optimization methods. A numerical optimization method is an iterative solution procedure that combines analytical and numerical computation and follows a fixed pattern. Its iterative process is essentially a sequence of nonlinear transformations of the iterate, realized through a series of search directions and step sizes. For every instance of an optimization problem, the whole algorithm must be executed from beginning to end, and its computational complexity is fixed: once the algorithm is programmed, its efficiency (accuracy and complexity) is fixed. With the development of artificial intelligence, especially deep learning, learning methods have achieved great success in fields such as image recognition (especially face recognition, license plate recognition, handwriting recognition, etc.), network attack prevention, natural language processing, autonomous driving, finance, and medical treatment. This paper studies traditional numerical optimization methods and intelligent optimization methods from a new perspective and analyzes their respective characteristics. We then propose the learning optimization method and put forward its design idea. Finally, we take combinatorial optimization as an example to explain the design principle of this kind of method.
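To make the contrast concrete, the sketch below (not taken from the paper; a minimal Python illustration with hypothetical names such as `gradient_descent` and `learned_step`) shows the fixed-mode iterative scheme $x_{k+1} = x_k + \alpha_k d_k$ alongside a parametric update whose rule would itself be trained over a distribution of problem instances, which is the shift the abstract describes.

```python
import numpy as np

# Classical "fixed-mode" iteration: x_{k+1} = x_k + alpha_k * d_k.
# The direction d_k and step size alpha_k follow a predetermined rule
# (here steepest descent with a constant step), so the procedure and its
# cost are identical for every problem instance.
def gradient_descent(grad, x0, alpha=0.1, iters=100):
    x = x0.copy()
    for _ in range(iters):
        x = x - alpha * grad(x)  # fixed rule: d_k = -grad f(x_k)
    return x

# Toy instance: minimize f(x) = ||x||^2, with grad f(x) = 2x.
print(gradient_descent(lambda x: 2.0 * x, np.array([3.0, -4.0])))

# A "learned" update in the learning-to-optimize spirit: the map from the
# gradient (and possibly other state) to the update step is itself a
# parametric model whose weights would be trained on many instances,
# instead of being fixed once the algorithm is programmed.
def learned_step(g, W):  # W: trainable matrix (hypothetical; linear for brevity)
    return -W @ g
```

In the learning-to-optimize literature the learned update is usually a recurrent network rather than a single matrix, trained across many instances, so that accuracy and complexity are no longer fixed at programming time.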

Cite this article

GUO Tiande, HAN Congying. From numerical optimization method to learning optimization method[J]. Operations Research Transactions, 2019, 23(4): 1-12. DOI: 10.15960/j.cnki.issn.1007-6093.2019.04.001
