Operations Research Transactions ›› 2026, Vol. 30 ›› Issue (1): 217-234. DOI: 10.15960/j.cnki.issn.1007-6093.2026.01.016
Convergence analysis of an adaptive proximal gradient-subgradient algorithm for square-root-loss regression problems
YANG Jinji, SHEN Chungen†, YU Zhensheng
Received: 2022-11-01
Published: 2026-03-16
Corresponding author: SHEN Chungen, E-mail: shenchungen@usst.edu.cn
Abstract: The SQRT-Lasso model has attracted wide attention because its choice of regularization parameter does not depend on an estimate of the residual variance. However, the square-root loss function has points of nondifferentiability, which complicates the design of algorithms for the SQRT-Lasso model. Building on the work of Li et al. (2020) [32], this paper improves the proofs of the local smoothness and local restricted strong convexity of the square-root loss function; to overcome the computational difficulty caused by these nondifferentiable points, an adaptive proximal gradient-subgradient algorithm (APGSA) is designed; and, under certain assumptions, the global convergence of the proposed algorithm is established with high probability. In addition, we prove that the algorithm exactly identifies the active manifold after finitely many iterations, which yields a local linear convergence rate with high probability. Finally, simulation experiments verify the effectiveness and the local linear convergence rate of APGSA.
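The model here is the square-root Lasso problem min_β ‖Xβ − y‖₂/√n + λ‖β‖₁ of [28]. To make the proximal gradient-subgradient idea concrete, the following Python sketch implements a plain fixed-step version of that iteration for this objective. It is a minimal illustration, not the APGSA of this paper: the adaptive step-size rule behind APGSA's guarantees is developed in the paper itself, and the function names, the fixed step size, and the stopping rule below are assumptions made only for this sketch. Where the residual is nonzero the square-root loss is differentiable and a gradient step is taken; at a zero residual the zero subgradient is used; the ℓ1 term is handled through its proximal map, soft-thresholding.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal map of tau * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_grad_subgrad_sqrt_lasso(X, y, lam, step=1e-3, max_iter=5000, tol=1e-8):
    """Fixed-step proximal gradient-subgradient sketch for
    min_b ||X b - y||_2 / sqrt(n) + lam * ||b||_1.
    Illustrative helper only; not the paper's APGSA."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(max_iter):
        r = X @ b - y
        norm_r = np.linalg.norm(r)
        if norm_r > 0.0:
            # Gradient of the square-root loss away from its nondifferentiable point.
            g = X.T @ r / (np.sqrt(n) * norm_r)
        else:
            # 0 is a valid subgradient of ||.||_2 at the origin.
            g = np.zeros(p)
        b_new = soft_threshold(b - step * g, step * lam)
        if np.linalg.norm(b_new - b) <= tol * max(1.0, np.linalg.norm(b)):
            return b_new
        b = b_new
    return b
```

A fixed step must stay small to behave well when the residual is nearly zero, and avoiding that conservatism through an adaptive step-size choice is exactly the computational difficulty the paper's APGSA is designed to address.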
YANG Jinji, SHEN Chungen, YU Zhensheng. Convergence analysis of an adaptive proximal gradient-subgradient algorithm for square-root-loss regression problems[J]. Operations Research Transactions, 2026, 30(1): 217-234.
References:
[1] Hocking R R. The analysis and selection of variables in linear regression [J]. Biometrics, 1976, 32: 1-49.
[2] Kira K, Rendell L A. The feature selection problem: Traditional methods and a new algorithm [C]//Proceedings of the AAAI Conference on Artificial Intelligence, 1992, 10: 129-134.
[3] Akaike H. Information theory and an extension of the maximum likelihood principle [M]//Selected Papers of Hirotugu Akaike, New York: Springer, 1998: 610-624.
[4] Schwarz G. Estimating the dimension of a model [J]. The Annals of Statistics, 1978, 6(2): 461-464.
[5] Tibshirani R. Regression shrinkage and selection via the lasso [J]. Journal of the Royal Statistical Society Series B-Statistical Methodology, 1996, 58(1): 267-288.
[6] Mohri M, Rostamizadeh A, Talwalkar A. Foundations of Machine Learning [M]. Cambridge: MIT Press, 2018.
[7] Donoho D L. Compressed sensing [J]. IEEE Transactions on Information Theory, 2006, 52(4): 1289-1306.
[8] Liu X, Shen C G, Wang L. A dual active-set proximal Newton algorithm for sparse approximation of correlation matrices [J]. Optimization Methods and Software, 2022, 37(5): 1820-1844.
[9] Shen C G, Xue W J, Zhang L-H, et al. An active-set proximal Newton algorithm for ℓ1-regularized optimization problems with box constraints [J]. Journal of Scientific Computing, 2020, 85: 1-34.
[10] Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems [J]. SIAM Journal on Imaging Sciences, 2009, 2(1): 183-202.
[11] Becker S, Bobin J, Candes E J. NESTA: A fast and accurate first-order method for sparse recovery [J]. SIAM Journal on Imaging Sciences, 2011, 4(1): 1-39.
[12] Van Den Berg E, Friedlander M P. Probing the Pareto frontier for basis pursuit solutions [J]. SIAM Journal on Scientific Computing, 2009, 31(2): 890-912.
[13] Byrd R H, Chin G M, Nocedal J, et al. A family of second-order methods for convex ℓ1-regularized optimization [J]. Mathematical Programming Series A, 2016, 159(1): 435-467.
[14] Efron B, Hastie T, Johnstone I, et al. Least angle regression [J]. The Annals of Statistics, 2004, 32(2): 407-499.
[15] Figueiredo M A T, Nowak R D, Wright S J. Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems [J]. IEEE Journal of Selected Topics in Signal Processing, 2007, 1(4): 586-597.
[16] Hale E T, Yin W, Zhang Y. Fixed-point continuation for ℓ1-minimization: Methodology and convergence [J]. SIAM Journal on Optimization, 2008, 19(3): 1107-1130.
[17] Keskar N, Nocedal J, Oztoprak F, et al. A second-order method for convex ℓ1-regularized optimization with active-set prediction [J]. Optimization Methods and Software, 2016, 31(3): 605-621.
[18] Li X D, Sun D F, Toh K-C. A highly efficient semismooth Newton augmented Lagrangian method for solving Lasso problems [J]. SIAM Journal on Optimization, 2018, 28(1): 433-458.
[19] Milzarek A, Ulbrich M. A semismooth Newton method with multidimensional filter globalization for ℓ1-optimization [J]. SIAM Journal on Optimization, 2014, 24(1): 298-333.
[20] Wen Z W, Yin W T, Goldfarb D, et al. A fast algorithm for sparse reconstruction based on shrinkage, subspace optimization, and continuation [J]. SIAM Journal on Scientific Computing, 2010, 32(4): 1832-1857.
[21] Wright S J, Nowak R D, Figueiredo M A T. Sparse reconstruction by separable approximation [J]. IEEE Transactions on Signal Processing, 2009, 57(7): 2479-2493.
[22] Xiao X T, Li Y F, Wen Z W, et al. A regularized semi-smooth Newton method with projection steps for composite convex programs [J]. Journal of Scientific Computing, 2018, 76(1): 364-389.
[23] Yuan M, Lin Y. Model selection and estimation in regression with grouped variables [J]. Journal of the Royal Statistical Society Series B-Statistical Methodology, 2006, 68(1): 49-67.
[24] Li X D, Sun D F, Toh K-C. On efficiently solving the subproblems of a level-set method for fused Lasso problems [J]. SIAM Journal on Optimization, 2018, 28(2): 1842-1866.
[25] Zhang Y J, Zhang N, Sun D F, et al. An efficient Hessian based algorithm for solving large-scale sparse group Lasso problems [J]. Mathematical Programming, 2020, 179(1): 223-263.
[26] Zhang Y J, Zhang N, Sun D F, et al. A proximal point dual Newton algorithm for solving group graphical Lasso problems [J]. SIAM Journal on Optimization, 2020, 30(3): 2197-2220.
[27] Tibshirani R. Regression shrinkage and selection via the Lasso: A retrospective [J]. Journal of the Royal Statistical Society Series B-Statistical Methodology, 2011, 73(3): 273-282.
[28] Belloni A, Chernozhukov V, Wang L. Square-root Lasso: Pivotal recovery of sparse signals via conic programming [J]. Biometrika, 2011, 98(4): 791-806.
[29] Li X G, Zhao T, Yuan X M, et al. The flare package for high dimensional linear regression and precision matrix estimation in R [J]. Journal of Machine Learning Research, 2015, 16: 553-557.
[30] Tang P P, Wang C J, Sun D F, et al. A sparse semismooth Newton based proximal majorization-minimization algorithm for nonconvex square-root-loss regression problems [J]. Journal of Machine Learning Research, 2020, 21(226): 1-38.
[31] Chu H T M, Toh K-C, Zhang Y J. On regularized square-root regression problems: Distributionally robust interpretation and fast computations [EB/OL]. [2022-10-12]. arXiv:2109.03632.
[32] Li X G, Jiang H M, Haupt J, et al. On fast convergence of proximal algorithms for SQRT-Lasso optimization: Don't worry about its nonsmooth loss function [C]//Uncertainty in Artificial Intelligence, 2020: 49-59.
[33] Candes E, Tao T. The Dantzig selector: Statistical estimation when p is much larger than n [J]. The Annals of Statistics, 2007, 35(6): 2313-2351.
[34] Negahban S N, Ravikumar P, Wainwright M J, et al. A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers [J]. Statistical Science, 2012, 27(4): 538-557.
[35] Wainwright M J. High-dimensional Statistics: A Non-asymptotic Viewpoint [M]. Cambridge: Cambridge University Press, 2019.
[36] Beck A. First-order Methods in Optimization [M]. Philadelphia: SIAM, 2017.
[37] Bello Cruz J Y. On proximal subgradient splitting method for minimizing the sum of two nonsmooth convex functions [J]. Set-Valued and Variational Analysis, 2017, 25(2): 245-263.
[38] Rockafellar R T. Convex Analysis [M]. Princeton: Princeton University Press, 1970.
[39] Liu H Y, Hu J, Li Y F, et al. Optimization: Modeling, Algorithms and Theory [M]. Beijing: Higher Education Press, 2020. (in Chinese)
[40] Nocedal J, Wright S. Numerical Optimization [M]. New York: Springer, 2006.
[41] Liang J W, Fadili J, Peyré G. Activity identification and local linear convergence of forward-backward-type methods [J]. SIAM Journal on Optimization, 2017, 27(1): 408-437.
[42] Vaiter S, Peyré G, Fadili J. Model consistency of partly smooth regularizers [J]. IEEE Transactions on Information Theory, 2017, 64(3): 1725-1737.
[43] Hare W L, Lewis A S. Identifying active constraints via partial smoothness and prox-regularity [J]. Journal of Convex Analysis, 2004, 11(2): 251-266.
[44] Wu Z M, Li C S, Li M, et al. Inertial proximal gradient methods with Bregman regularization for a class of nonconvex optimization problems [J]. Journal of Global Optimization, 2021, 79: 617-644.
Related articles:
[1] SUN Ruiqing, ZHANG Rui, LAN Yan, LI Weidong. An LPT algorithm for maximizing the total early work [J]. Operations Research Transactions, 2025, 29(4): 249-254.
[2] SONG Rui, WANG Yiran, WU Zhongming. Convergence analysis of an inexact generalized alternating direction method of multipliers with indefinite proximal terms [J]. Operations Research Transactions, 2025, 29(4): 175-190.
[3] ZHANG Bo, WANG Hongyu, GAO Yuelin. A branch-and-bound algorithm based on D.C. relaxation for linear sum-of-products programming problems [J]. Operations Research Transactions, 2025, 29(4): 159-174.
[4] SUN Xin, GE Dongdong, FU Desheng, WEI Zhiwei, DONG Fenglian, PAN Shichang. A hybrid distribution-recursion and branch-and-bound algorithm for systematic refinery operation optimization problems [J]. Operations Research Transactions, 2025, 29(4): 141-158.
[5] TAN Zilin, LUO Honglin. Second-order splitting algorithms and their applications [J]. Operations Research Transactions, 2025, 29(4): 121-140.
[6] WANG Xiaofang, LIANG Zhi'an, GAO Caixia. Optimality conditions for weakly semi-E-convex programming problems [J]. Operations Research Transactions, 2025, 29(4): 72-82.
[7] CHI Qianqian, ZHOU Yuying. Exact penalty approximation for cone-constrained optimization problems [J]. Operations Research Transactions, 2025, 29(4): 61-71.
[8] BAO Chenglong, CHEN Chang. On Bregman iteration for minimizing the Landau free energy functional [J]. Operations Research Transactions, 2025, 29(3): 243-266.
[9] ZHOU Anwa, HE Jiayi. Real pairwise completely positive matrices [J]. Operations Research Transactions, 2025, 29(3): 160-178.
[10] HU Shenglong. Uniqueness of tensor decompositions [J]. Operations Research Transactions, 2025, 29(3): 34-60.
[11] GUO Tiande, XING Tianchi, HAN Congying, MENG Shuai. Generative methods in artificial intelligence: Mathematical models, optimization algorithms, and applications [J]. Operations Research Transactions, 2025, 29(3): 1-33.
[12] YUAN Liuyang, TANG Mengyao, CHI Xiaoni. A new class of parameter-free filled tunneling function methods [J]. Operations Research Transactions, 2025, 29(2): 214-220.
[13] ZHANG Yuru, ZHANG Xue, LAN Ru. A variable-metric extrapolated hard-thresholding algorithm for a class of linear inverse problems [J]. Operations Research Transactions, 2025, 29(2): 158-174.
[14] MA Suxia, GAO Yuelin, LIN Hongwei, ZHANG Bo. A new parameter-free filled function method for global optimization [J]. Operations Research Transactions, 2025, 29(2): 141-157.
[15] LIU Xintian, ZHU Wenxing. Discrete iterative optimization algorithms for balanced hypergraph bipartitioning [J]. Operations Research Transactions, 2025, 29(2): 128-140.