Randomized subspace Newton methods that restrict the search direction to a random subspace already exist for convex functions; we propose a randomized subspace regularized Newton method for non-convex functions and, more generally, provide the first thorough analysis of the local convergence rate of randomized subspace Newton methods. Our algorithm uses a modified Hessian of the function restricted to a random subspace so that, with high probability, the function value decreases at each iteration even when the objective is non-convex. We show that, under appropriate assumptions, our method converges globally at the same rate as the full regularized Newton method. Under additional assumptions we obtain a local linear convergence rate and prove that, in general, this rate is the best one can hope for when using a random subspace. We further prove that super-linear convergence holds when the Hessian at the local optimum is rank deficient.
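To make the method described above concrete, here is a minimal Python sketch of one iteration of a randomized subspace regularized Newton step. It is an illustration under assumptions, not the paper's exact algorithm: the Gaussian sketching matrix, the regularization constants c and eta, and the backtracking (Armijo) line-search parameters alpha and beta are placeholder choices.

import numpy as np

def rs_reg_newton_step(f, grad, hess, x, s, c=2.0, eta=1e-6,
                       alpha=0.3, beta=0.5, rng=None):
    # One-iteration sketch (hypothetical constants): draw a random subspace,
    # regularize the restricted Hessian so it is positive definite, and take
    # a line-search step along the lifted subspace Newton direction.
    rng = np.random.default_rng() if rng is None else rng
    n = x.size
    # Random s x n Gaussian sketch with entries N(0, 1/s).
    P = rng.normal(scale=1.0 / np.sqrt(s), size=(s, n))
    g = P @ grad(x)                      # gradient restricted to the subspace
    H = P @ hess(x) @ P.T                # s x s restricted Hessian
    lam_min = np.linalg.eigvalsh(H)[0]   # smallest eigenvalue of restricted Hessian
    # Shift so the subspace model is positive definite even for non-convex f.
    Hreg = H + (c * max(0.0, -lam_min) + eta) * np.eye(s)
    d = np.linalg.solve(Hreg, -g)        # subspace Newton direction (s-dimensional)
    slope = g @ d                        # directional derivative; negative since Hreg is PD
    t, fx = 1.0, f(x)
    # Backtracking (Armijo) line search along the lifted direction P^T d.
    while f(x + t * (P.T @ d)) > fx + alpha * t * slope:
        t *= beta
    return x + t * (P.T @ d)

# Toy usage on a non-convex function f(x) = sum(x^4) - sum(x^2).
f = lambda x: np.sum(x**4) - np.sum(x**2)
grad = lambda x: 4 * x**3 - 2 * x
hess = lambda x: np.diag(12 * x**2 - 2)
x = np.ones(100)
for _ in range(50):
    x = rs_reg_newton_step(f, grad, hess, x, s=10)

Each iteration forms and factors only an s x s system rather than an n x n one; the shift by c*max(0, -lambda_min) + eta is what makes the restricted model positive definite and hence yields a descent direction with high probability, which is the mechanism described above.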
Terunari Fuji; Pierre-Louis Poirion; Akiko Takeda
CC-BY 4.0
@article{OJMO_2025__6__A9_0,
author = {Terunari Fuji and Pierre-Louis Poirion and Akiko Takeda},
title = {Theoretical analysis of the randomized subspace regularized {Newton} method for non-convex optimization},
journal = {Open Journal of Mathematical Optimization},
eid = {8},
pages = {1--35},
year = {2025},
publisher = {Universit\'e de Montpellier},
volume = {6},
doi = {10.5802/ojmo.46},
language = {en},
url = {https://ojmo.centre-mersenne.org/articles/10.5802/ojmo.46/}
}
Terunari Fuji; Pierre-Louis Poirion; Akiko Takeda. Theoretical analysis of the randomized subspace regularized Newton method for non-convex optimization. Open Journal of Mathematical Optimization, Volume 6 (2025), article no. 8, 35 p. doi: 10.5802/ojmo.46
