[1] Chen P Y, Zhang H, Sharma Y, et al. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models [C]//Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, 2017: 15-26.
[2] Finlay C, Oberman A M. Scaleable input gradient regularization for adversarial robustness [J]. Machine Learning with Applications, 2021, 3: 100017.
[3] Snoek J, Larochelle H, Adams R P. Practical Bayesian optimization of machine learning algorithms [C]//Advances in Neural Information Processing Systems, 2012: 2951-2959.
[4] Luo L, Ye H, Huang Z, et al. Stochastic recursive gradient descent ascent for stochastic nonconvex-strongly-concave minimax problems [C]//Advances in Neural Information Processing Systems, 2020: 20566-20577.
[5] Nouiehed M, Sanjabi M, Huang T, et al. Solving a class of non-convex min-max games using iterative first order methods [C]//Advances in Neural Information Processing Systems, 2019: 14934-14942.
[6] Lin T, Jin C, Jordan M I. Near-optimal algorithms for minimax optimization [C]//Conference on Learning Theory, 2020: 2738-2779.
[7] Daskalakis C, Panageas I. The limit points of (optimistic) gradient descent in min-max optimization [C]//Advances in Neural Information Processing Systems, 2018: 9236-9246.
[8] Lin T, Jin C, Jordan M I. On gradient descent ascent for nonconvex-concave minimax problems [C]//International Conference on Machine Learning, 2020: 6083-6093.
[9] Lu S, Tsaknakis I, Hong M, et al. Hybrid block successive approximation for one-sided nonconvex min-max problems: Algorithms and applications [J]. IEEE Transactions on Signal Processing, 2020, 68: 3676-3691.
[10] Pan W, Shen J, Xu Z. An efficient algorithm for nonconvex-linear minimax optimization problem and its application in solving weighted maximin dispersion problem [J]. Computational Optimization and Applications, 2021, 78(1): 287-306.
[11] Shen J, Wang Z, Xu Z. Zeroth-order single-loop algorithms for nonconvex-linear minimax problems [J]. Journal of Global Optimization, 2022: 1-30.
[12] Xu Z, Shen J, Wang Z, et al. Zeroth-order alternating randomized gradient projection algorithms for general nonconvex-concave minimax problems [EB/OL]. [2022-12-01]. arXiv: 2108.00473.
[13] Xu Z, Zhang H, Xu Y, et al. A unified single-loop alternating gradient projection algorithm for nonconvex-concave and convex-nonconcave minimax problems [J]. Mathematical Programming, 2023, 201: 635-706.
[14] Zhang J, Xiao P, Sun R, et al. A single-loop smoothed gradient descent-ascent algorithm for nonconvex-concave min-max problems [C]//Advances in Neural Information Processing Systems, 2020: 7377-7389.
[15] Johnson R, Zhang T. Accelerating stochastic gradient descent using predictive variance reduction [C]//Advances in Neural Information Processing Systems, 2013: 315-323.
[16] Defazio A, Bach F, Lacoste-Julien S. SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives [C]//Advances in Neural Information Processing Systems, 2014: 1646-1654.
[17] Nguyen L M, Liu J, Scheinberg K, et al. SARAH: A novel method for machine learning problems using stochastic recursive gradient [C]//International Conference on Machine Learning, 2017: 2613-2621.
[18] Fang C, Li C J, Lin Z, et al. SPIDER: Near-optimal non-convex optimization via stochastic path-integrated differential estimator [C]//Advances in Neural Information Processing Systems, 2018: 9236-9246.
[19] Wang Z, Ji K, Zhou Y, et al. SpiderBoost and momentum: Faster variance reduction algorithms [C]//Advances in Neural Information Processing Systems, 2019: 2406-2416.
[20] Rafique H, Liu M, Lin Q, et al. Weakly-convex-concave min-max optimization: Provable algorithms and applications in machine learning [J]. Optimization Methods and Software, 2022, 37(3): 1087-1121.
[21] Wai H T, Hong M, Yang Z, et al. Variance reduced policy evaluation with smooth function approximation [C]//Advances in Neural Information Processing Systems, 2019: 5776-5787.
[22] Yang J, Orvieto A, Lucchi A, et al. Faster single-loop algorithms for minimax optimization without strong concavity [C]//International Conference on Artificial Intelligence and Statistics, 2022: 5485-5517.
[23] Xu Z, Wang Z Q, Wang J L, et al. Zeroth-order alternating gradient descent ascent algorithms for a class of nonconvex-nonconcave minimax problems [EB/OL]. [2022-11-01]. arXiv: 2211.13668.
[24] Polyak B T. Gradient methods for minimizing functionals [J]. Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki, 1963, 3(4): 643-653.
[25] Karimi H, Nutini J, Schmidt M. Linear convergence of gradient and proximal-gradient methods under the Polyak-Łojasiewicz condition [C]//Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 2016: 795-811.
[26] Du S, Lee J, Li H, et al. Gradient descent finds global minima of deep neural networks [C]//International Conference on Machine Learning, 2019: 1675-1685.
[27] Fazel M, Ge R, Kakade S, et al. Global convergence of policy gradient methods for the linear quadratic regulator [C]//International Conference on Machine Learning, 2018: 1467-1476.
[28] Cai Q, Hong M, Chen Y, et al. On the global convergence of imitation learning: A case for linear quadratic regulator [EB/OL]. [2022-12-01]. arXiv: 1901.03674.
[29] Liu C, Zhu L, Belkin M. Loss landscapes and optimization in over-parameterized non-linear systems and neural networks [J]. Applied and Computational Harmonic Analysis, 2022, 59: 85-116.