Estimating Shape Parameters of Piecewise Linear-Quadratic Problems

Peng Zheng; Karthikeyan Natesan Ramamurthy; Aleksandr Y. Aravkin

doi:10.5802/ojmo.10

Peng Zheng ¹; Karthikeyan Natesan Ramamurthy ²; Aleksandr Y. Aravkin ³

¹ Department of Health Metrics Sciences, University of Washington Seattle, WA, USA
² IBM Thomas J. Watson Research Center Yorktown Heights, NY, USA
³ Department of Applied Mathematics, University of Washington Seattle, WA, USA

Open Journal of Mathematical Optimization, Volume 2 (2021), article no. 8, 18 p.

Abstract

Piecewise Linear-Quadratic (PLQ) penalties are widely used to develop models in statistical inference, signal processing, and machine learning. Common examples of PLQ penalties include least squares, Huber, Vapnik, 1-norm, and their asymmetric generalizations. Properties of these estimators depend on the choice of penalty and its shape parameters, such as degree of asymmetry for the quantile loss, and transition point between linear and quadratic pieces for the Huber function.

In this paper, we develop a statistical framework that can help the modeler to automatically tune shape parameters once the shape of the penalty has been chosen. The choice of the parameter is informed by the basic notion that each PLQ penalty should correspond to a true statistical density. The normalization constant inherent in this requirement helps to inform the optimization over shape parameters, giving a joint optimization problem over these as well as primary parameters of interest. A second contribution is to consider optimization methods for these joint problems. We show that basic first-order methods can be immediately brought to bear, and design specialized extensions of interior point (IP) methods for PLQ problems that can quickly and efficiently solve the joint problem. Synthetic problems and larger-scale practical examples illustrate the utility of the approach. Code for the new IP method is implemented using the Julia language (https://github.com/UW-AMO/shape-parameters/tree/ojmo).

Received: 2020-12-30
Accepted: 2021-08-30
Revised after acceptance: 2021-08-31
Published online: 2021-09-23

DOI: 10.5802/ojmo.10

Author's affiliations:

Peng Zheng ¹; Karthikeyan Natesan Ramamurthy ²; Aleksandr Y. Aravkin ³

License:

CC-BY 4.0

Copyrights: The authors retain unrestricted copyrights and publishing rights

@article{OJMO_2021__2__A8_0,
     author = {Peng Zheng and Karthikeyan Natesan Ramamurthy and Aleksandr Y. Aravkin},
     title = {Estimating {Shape} {Parameters} of {Piecewise} {Linear-Quadratic} {Problems}},
     journal = {Open Journal of Mathematical Optimization},
     eid = {8},
     pages = {1--18},
     publisher = {Universit\'e de Montpellier},
     volume = {2},
     year = {2021},
     doi = {10.5802/ojmo.10},
     language = {en},
     url = {https://ojmo.centre-mersenne.org/articles/10.5802/ojmo.10/}
}

TY  - JOUR
AU  - Peng Zheng
AU  - Karthikeyan Natesan Ramamurthy
AU  - Aleksandr Y. Aravkin
TI  - Estimating Shape Parameters of Piecewise Linear-Quadratic Problems
JO  - Open Journal of Mathematical Optimization
PY  - 2021
SP  - 1
EP  - 18
VL  - 2
PB  - Université de Montpellier
UR  - https://ojmo.centre-mersenne.org/articles/10.5802/ojmo.10/
DO  - 10.5802/ojmo.10
LA  - en
ID  - OJMO_2021__2__A8_0
ER  -

%0 Journal Article
%A Peng Zheng
%A Karthikeyan Natesan Ramamurthy
%A Aleksandr Y. Aravkin
%T Estimating Shape Parameters of Piecewise Linear-Quadratic Problems
%J Open Journal of Mathematical Optimization
%D 2021
%P 1-18
%V 2
%I Université de Montpellier
%U https://ojmo.centre-mersenne.org/articles/10.5802/ojmo.10/
%R 10.5802/ojmo.10
%G en
%F OJMO_2021__2__A8_0

Peng Zheng; Karthikeyan Natesan Ramamurthy; Aleksandr Y. Aravkin. Estimating Shape Parameters of Piecewise Linear-Quadratic Problems. Open Journal of Mathematical Optimization, Volume 2 (2021), article  no. 8, 18 p. doi : 10.5802/ojmo.10. https://ojmo.centre-mersenne.org/articles/10.5802/ojmo.10/

References
Cited by

[1] Aleksandr Y. Aravkin; James V. Burke; Gianluigi Pillonetto Sparse/Robust Estimation and Kalman Smoothing with Nonsmooth Log-Concave Densities: Modeling, Computation, and Theory, J. Mach. Learn. Res., Volume 14 (2013), pp. 2689-2728 | MR | Zbl

[2] Aleksandr Y. Aravkin; James V. Burke; Gianluigi Pillonetto Generalized system identification with stable spline kernels, SIAM J. Sci. Comput., Volume 40 (2018) no. 5, p. B1419-B1443 | DOI | MR | Zbl

[3] Aleksandr Y. Aravkin; Dmitriy Drusvyatskiy; Tristan van Leeuwen Efficient quadratic penalization through the partial minimization technique, IEEE Trans. Autom. Control, Volume 63 (2017) no. 7, pp. 2131-2138 | DOI | MR | Zbl

[4] Aleksandr Y. Aravkin; Tristan Van Leeuwen Estimating nuisance parameters in inverse problems, Inverse Probl., Volume 28 (2012) no. 11, 115016, 13 pages | MR | Zbl

[5] Sylvain Arlot; Alain Celisse et al. A survey of cross-validation procedures for model selection, Stat. Surv., Volume 4 (2010), pp. 40-79 | MR | Zbl

[6] Hédy Attouch; Jérôme Bolte; Patrick Redont; Antoine Soubeyran Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka–Łojasiewicz inequality, Math. Oper. Res., Volume 35 (2010) no. 2, pp. 438-457 | DOI | Zbl

[7] Bradley M. Bell; James V. Burke; Alan Schumitzky A relative weighting method for estimating parameters and variances in multiple data sets, Comput. Stat. Data Anal., Volume 22 (1996) no. 2, pp. 119-135 | DOI | MR | Zbl

[8] Anil K. Bera; Antonio F. Galvao; Gabriel V. Montes-Rojas; Sung Y. Park Asymmetric Laplace Regression: Maximum Likelihood, Maximum Entropy and Quantile Regression, J. Econom. Methods, Volume 5 (2016) no. 1, pp. 79-101 | MR | Zbl

[9] James S. Bergstra; Rémi Bardenet; Yoshua Bengio; Balázs Kégl Algorithms for hyper-parameter optimization, Advances in Neural Information Processing Systems (2011), pp. 2546-2554

[10] James S. Bergstra; Yoshua Bengio Random search for hyper-parameter optimization, J. Mach. Learn. Res., Volume 13 (2012), pp. 281-305 | MR | Zbl

[11] Jérôme Bolte; Shoham Sabach; Marc Teboulle Proximal alternating linearized minimization for nonconvex and nonsmooth problems, Math. Program., Volume 146 (2014) no. 1-2, pp. 459-494 | DOI | MR | Zbl

[12] Stephen Boyd; Lieven Vandenberghe Convex optimization, Cambridge University Press, 2004 | Zbl

[13] Emmanuel J Candès; Xiaodong Li; Yi Ma; John Wright Robust principal component analysis?, J. ACM, Volume 58 (2011) no. 3, 11, 37 pages | MR | Zbl

[14] Venkat Chandrasekaran; Sujay Sanghavi; Pablo A. Parrilo; Alan S. Willsky Sparse and low-rank matrix decompositions, IFAC Proceedings Volumes, Volume 42 (2009) no. 10, pp. 1493-1498 | DOI | Zbl

[15] David L Donoho Compressed sensing, IEEE Trans. Inf. Theory, Volume 52 (2006) no. 4, pp. 1289-1306 | DOI | MR | Zbl

[16] Derek Driggs; Stephen Becker; Aleksandr Y. Aravkin Adapting regularized low-rank models for parallel architectures, SIAM J. Sci. Comput., Volume 41 (2019) no. 1, p. A163-A189 | DOI | MR | Zbl

[17] Peter J. Huber Robust statistics, 523, John Wiley & Sons, 2004 | Zbl

[18] Frank Hutter; Holger H. Hoos; Kevin Leyton-Brown Sequential model-based optimization for general algorithm configuration, International Conference on Learning and Intelligent Optimization (2011), pp. 507-523 | DOI | Zbl

[19] Aaron Klein; Stefan Falkner; Simon Bartels; Philipp Hennig; Frank Hutter Fast Laplace optimization of machine learning hyperparameters on large datasets (2016) (https://arxiv.org/abs/1605.07079) | Zbl

[20] Roger Koenker; Kevin F. Hallock Quantile regression, J. Econom. Perspect., Volume 15 (2001) no. 4, pp. 143-156 | DOI

[21] Masakazu Kojima; Nimrod Megiddo; Toshihito Noma; Akiko Yoshise A Unified Approach to Interior Point Algorithms for Linear Complementarity Problems, Lecture Notes in Computer Science, 538, Springer, 1991 | MR | Zbl

[22] Lisha Li; Kevin Jamieson; Giulia DeSalvo; Afshin Rostamizadeh; Ameet Talwalkar Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization (2016) (https://arxiv.org/abs/1603.06560) | Zbl

[23] Liyuan Li; Weimin Huang; Irene Yu-Hua Gu; Qi Tian Statistical modeling of complex backgrounds for foreground object detection, IEEE Trans. Image Process., Volume 13 (2004) no. 11, pp. 1459-1472

[24] Arkadi Nemirovski; Yurii Nesterov Interior-Point Polynomial Algorithms in Convex Programming, Studies in Applied Mathematics, 13, Society for Industrial and Applied Mathematics, 1994 | MR | Zbl

[25] Dominique Orban; Mario Arioli Iterative Solution of Symmetric Quasi-Definite Linear Systems, SIAM Spotlights, 3, Society for Industrial and Applied Mathematics, 2017 | MR | Zbl

[26] Yigang Peng; Arvind Ganesh; John Wright; Wenli Xu; Yi Ma RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images, IEEE Trans. Pattern Anal. Mach. Intell., Volume 34 (2012) no. 11, pp. 2233-2246 | DOI

[27] R. Tyrrell Rockafellar; Roger J.-B. Wets Variational analysis, 317, Springer, 2009 | Zbl

[28] Jasper Snoek; Hugo Larochelle; Ryan P. Adams Practical Bayesian optimization of machine learning algorithms, Advances in neural information processing systems (2012), pp. 2951-2959 | Zbl

[29] Robert Tibshirani Regression shrinkage and selection via the lasso, J. R. Stat. Soc., Ser. B, Stat. Methodol., Volume 58 (1996) no. 1, pp. 267-288 | MR | Zbl

[30] Shiyi Tu; Min Wang; Xiaoqian Sun Bayesian variable selection and estimation in maximum entropy quantile regression, J. Appl. Stat., Volume 44 (2017) no. 2, pp. 253-269 | DOI | MR | Zbl

[31] Matthew A. Turk; Alex P. Pentland Face recognition using eigenfaces, Computer Vision and Pattern Recognition (1991), pp. 586-591 | DOI

[32] Stephen J. Wright Primal-Dual Interior-Point Methods, Society for Industrial and Applied Mathematics, 1997 | Zbl

[33] Keming Yu; Rana A. Moyeed Bayesian quantile regression, Stat. Probab. Lett., Volume 54 (2001) no. 4, pp. 437-447 | MR | Zbl

[34] Zhengdong Zhang; Arvind Ganesh; Xiao Liang; Yi Ma Tilt: Transform invariant low-rank textures, Int. J. Comput. Vision, Volume 99 (2012) no. 1, pp. 1-24 | DOI | MR | Zbl

Cited by Sources: