[1] Hocking R R. The analysis and selection of variables in linear regression [J]. Biometrics, 1976, 32: 1-49.
[2] Kira K, Rendell L A. The feature selection problem: Traditional methods and a new algorithm [C]//Proceedings of the AAAI Conference on Artificial Intelligence, 1992, 10: 129-134.
[3] Akaike H. Information theory and an extension of the maximum likelihood principle [M]//Selected Papers of Hirotugu Akaike. New York: Springer, 1998: 610-624.
[4] Schwarz G. Estimating the dimension of a model [J]. The Annals of Statistics, 1978, 6(2): 461-464.
[5] Tibshirani R. Regression shrinkage and selection via the lasso [J]. Journal of the Royal Statistical Society Series B-Statistical Methodology, 1996, 58(1): 267-288.
[6] Mohri M, Rostamizadeh A, Talwalkar A. Foundations of Machine Learning [M]. Cambridge: MIT Press, 2018.
[7] Donoho D L. Compressed sensing [J]. IEEE Transactions on Information Theory, 2006, 52(4): 1289-1306.
[8] Liu X, Shen C G, Wang L. A dual active-set proximal Newton algorithm for sparse approximation of correlation matrices [J]. Optimization Methods and Software, 2022, 37(5): 1820-1844.
[9] Shen C G, Xue W J, Zhang L-H, et al. An active-set proximal Newton algorithm for ℓ1-regularized optimization problems with box constraints [J]. Journal of Scientific Computing, 2020, 85: 1-34.
[10] Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems [J]. SIAM Journal on Imaging Sciences, 2009, 2(1): 183-202.
[11] Becker S, Bobin J, Candes E J. NESTA: A fast and accurate first-order method for sparse recovery [J]. SIAM Journal on Imaging Sciences, 2011, 4(1): 1-39.
[12] Van Den Berg E, Friedlander M P. Probing the Pareto frontier for basis pursuit solutions [J]. SIAM Journal on Scientific Computing, 2009, 31(2): 890-912.
[13] Byrd R H, Chin G M, Nocedal J, et al. A family of second-order methods for convex ℓ1-regularized optimization [J]. Mathematical Programming Series A, 2016, 159(1): 435-467.
[14] Efron B, Hastie T, Johnstone I, et al. Least angle regression [J]. The Annals of Statistics, 2004, 32(2): 407-499.
[15] Figueiredo M A T, Nowak R D, Wright S J. Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems [J]. IEEE Journal of Selected Topics in Signal Processing, 2007, 1(4): 586-597.
[16] Hale E T, Yin W, Zhang Y. Fixed-point continuation for ℓ1-minimization: Methodology and convergence [J]. SIAM Journal on Optimization, 2008, 19(3): 1107-1130.
[17] Keskar N, Nocedal J, Oztoprak F, et al. A second-order method for convex ℓ1-regularized optimization with active-set prediction [J]. Optimization Methods and Software, 2016, 31(3): 605-621.
[18] Li X D, Sun D F, Toh K-C. A highly efficient semismooth Newton augmented Lagrangian method for solving Lasso problems [J]. SIAM Journal on Optimization, 2018, 28(1): 433-458.
[19] Milzarek A, Ulbrich M. A semismooth Newton method with multidimensional filter globalization for ℓ1-optimization [J]. SIAM Journal on Optimization, 2014, 24(1): 298-333.
[20] Wen Z W, Yin W T, Goldfarb D, et al. A fast algorithm for sparse reconstruction based on shrinkage, subspace optimization, and continuation [J]. SIAM Journal on Scientific Computing, 2010, 32(4): 1832-1857.
[21] Wright S J, Nowak R D, Figueiredo M A T. Sparse reconstruction by separable approximation [J]. IEEE Transactions on Signal Processing, 2009, 57(7): 2479-2493.
[22] Xiao X T, Li Y F, Wen Z W, et al. A regularized semi-smooth Newton method with projection steps for composite convex programs [J]. Journal of Scientific Computing, 2018, 76(1): 364-389.
[23] Yuan M, Lin Y. Model selection and estimation in regression with grouped variables [J]. Journal of the Royal Statistical Society Series B-Statistical Methodology, 2006, 68(1): 49-67.
[24] Li X D, Sun D F, Toh K-C. On efficiently solving the subproblems of a level-set method for fused Lasso problems [J]. SIAM Journal on Optimization, 2018, 28(2): 1842-1866.
[25] Zhang Y J, Zhang N, Sun D F, et al. An efficient Hessian based algorithm for solving large-scale sparse group Lasso problems [J]. Mathematical Programming, 2020, 179(1): 223-263.
[26] Zhang Y J, Zhang N, Sun D F, et al. A proximal point dual Newton algorithm for solving group graphical Lasso problems [J]. SIAM Journal on Optimization, 2020, 30(3): 2197-2220.
[27] Tibshirani R. Regression shrinkage and selection via the Lasso: A retrospective [J]. Journal of the Royal Statistical Society Series B-Statistical Methodology, 2011, 73(3): 273-282.
[28] Belloni A, Chernozhukov V, Wang L. Square-root Lasso: Pivotal recovery of sparse signals via conic programming [J]. Biometrika, 2011, 98(4): 791-806.
[29] Li X G, Zhao T, Yuan X M, et al. The flare package for high dimensional linear regression and precision matrix estimation in R [J]. Journal of Machine Learning Research, 2015, 16: 553-557.
[30] Tang P P, Wang C J, Sun D F, et al. A sparse semismooth Newton based proximal majorization-minimization algorithm for nonconvex square-root-loss regression problems [J]. Journal of Machine Learning Research, 2020, 21(226): 1-38.
[31] Chu H T M, Toh K-C, Zhang Y J. On regularized square-root regression problems: Distributionally robust interpretation and fast computations [EB/OL]. [2022-10-12]. arXiv:2109.03632.
[32] Li X G, Jiang H M, Haupt J, et al. On fast convergence of proximal algorithms for SQRT-Lasso optimization: Don't worry about its nonsmooth loss function [C]//Uncertainty in Artificial Intelligence, 2020: 49-59.
[33] Candes E, Tao T. The Dantzig selector: Statistical estimation when p is much larger than n [J]. The Annals of Statistics, 2007, 35(6): 2313-2351.
[34] Negahban S N, Ravikumar P, Wainwright M J, et al. A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers [J]. Statistical Science, 2012, 27(4): 538-557.
[35] Wainwright M J. High-dimensional Statistics: A Non-asymptotic Viewpoint [M]. Cambridge: Cambridge University Press, 2019.
[36] Beck A. First-order Methods in Optimization [M]. Philadelphia: SIAM, 2017.
[37] Bello Cruz J Y. On proximal subgradient splitting method for minimizing the sum of two nonsmooth convex functions [J]. Set-Valued and Variational Analysis, 2017, 25(2): 245-263.
[38] Rockafellar R T. Convex Analysis [M]. Princeton: Princeton University Press, 1970.
[39] Liu H Y, Hu J, Li Y F, et al. Optimization: Modeling, Algorithms and Theory [M]. Beijing: Higher Education Press, 2020. (in Chinese)
[40] Nocedal J, Wright S J. Numerical Optimization [M]. New York: Springer, 2006.
[41] Liang J W, Fadili J, Peyré G. Activity identification and local linear convergence of forward-backward-type methods [J]. SIAM Journal on Optimization, 2017, 27(1): 408-437.
[42] Vaiter S, Peyré G, Fadili J. Model consistency of partly smooth regularizers [J]. IEEE Transactions on Information Theory, 2017, 64(3): 1725-1737.
[43] Hare W L, Lewis A S. Identifying active constraints via partial smoothness and prox-regularity [J]. Journal of Convex Analysis, 2004, 11(2): 251-266.
[44] Wu Z M, Li C S, Li M, et al. Inertial proximal gradient methods with Bregman regularization for a class of nonconvex optimization problems [J]. Journal of Global Optimization, 2021, 79: 617-644.