Piecewise Linear-Quadratic (PLQ) penalties are widely used to develop models in statistical inference, signal processing, and machine learning. Common examples of PLQ penalties include least squares, Huber, Vapnik, 1-norm, and their asymmetric generalizations. Properties of these estimators depend on the choice of penalty and its shape parameters, such as the degree of asymmetry for the quantile loss and the transition point between the linear and quadratic pieces of the Huber function.
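To make these penalties concrete, here is a minimal Python sketch (the paper's own code is in Julia) of two PLQ penalties named above: the Huber loss with transition point κ and the quantile (check) loss with asymmetry parameter τ. The function names are illustrative, not taken from the paper's repository.

```python
import numpy as np

def huber(r, kappa):
    """Huber penalty: quadratic for |r| <= kappa, linear in the tails."""
    a = np.abs(r)
    return np.where(a <= kappa, 0.5 * a**2, kappa * a - 0.5 * kappa**2)

def quantile(r, tau):
    """Quantile (check) loss: slope tau on positive residuals, tau - 1 on negative."""
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)
```

Setting τ = 0.5 recovers (half) the 1-norm, and letting κ → ∞ recovers least squares, illustrating how shape parameters interpolate between the named penalties.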

In this paper, we develop a statistical framework that helps the modeler automatically tune shape parameters once the shape of the penalty has been chosen. The choice of parameters is informed by the basic notion that each PLQ penalty should correspond to a true statistical density. The normalization constant inherent in this requirement informs the optimization over shape parameters, giving a joint optimization problem over these as well as the primary parameters of interest. A second contribution is to consider optimization methods for these joint problems. We show that basic first-order methods can be immediately brought to bear, and design specialized extensions of interior point (IP) methods for PLQ problems that can quickly and efficiently solve the joint problem. Synthetic problems and larger-scale practical examples illustrate the utility of the approach. Code for the new IP method is implemented in Julia (https://github.com/UW-AMO/shape-parameters/tree/ojmo).
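The density-normalization idea can be sketched numerically. Requiring exp(-ρ_κ(r))/c(κ) to be a probability density introduces the constant c(κ) = ∫ exp(-ρ_κ(r)) dr, whose logarithm enters a joint negative log-likelihood over the primary parameter θ and the shape parameter κ. Below is a hedged Python sketch for the Huber case (the paper's implementation is in Julia, and all names here are illustrative, not the authors' API):

```python
import numpy as np

def huber(r, kappa):
    """Huber penalty: quadratic for |r| <= kappa, linear in the tails."""
    a = np.abs(r)
    return np.where(a <= kappa, 0.5 * a**2, kappa * a - 0.5 * kappa**2)

def log_normalizer(kappa, lo=-30.0, hi=30.0, n=60001):
    """log c(kappa), where c(kappa) = integral of exp(-huber(r, kappa)) dr,
    approximated by the trapezoid rule on a wide grid."""
    grid = np.linspace(lo, hi, n)
    vals = np.exp(-huber(grid, kappa))
    return np.log(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid)))

def joint_nll(theta, kappa, X, y):
    """Joint objective over the primary parameter theta AND the shape
    parameter kappa: n * log c(kappa) + sum_i huber(y_i - x_i' theta, kappa)."""
    r = y - X @ theta
    return len(y) * log_normalizer(kappa) + huber(r, kappa).sum()
```

Without the n·log c(κ) term, the objective could be driven down trivially by shrinking the penalty; the normalizer is what makes the joint problem over (θ, κ) statistically meaningful.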


@article{OJMO_2021__2__A8_0,
  author    = {Peng Zheng and Karthikeyan Natesan Ramamurthy and Aleksandr Y. Aravkin},
  title     = {Estimating {Shape} {Parameters} of {Piecewise} {Linear-Quadratic} {Problems}},
  journal   = {Open Journal of Mathematical Optimization},
  eid       = {8},
  publisher = {Universit\'e de Montpellier},
  volume    = {2},
  year      = {2021},
  doi       = {10.5802/ojmo.10},
  language  = {en},
  url       = {https://ojmo.centre-mersenne.org/articles/10.5802/ojmo.10/}
}

Peng Zheng; Karthikeyan Natesan Ramamurthy; Aleksandr Y. Aravkin. Estimating Shape Parameters of Piecewise Linear-Quadratic Problems. Open Journal of Mathematical Optimization, Volume 2 (2021), article no. 8, 18 p. doi: 10.5802/ojmo.10. https://ojmo.centre-mersenne.org/articles/10.5802/ojmo.10/

