Tight Computationally Efficient Approximation of Matrix Norms with Applications

We address the problems of computing operator norms of matrices induced by given norms on the argument and the image space. It is known that aside from a fistful of "solvable cases," most notably, the case when both given norms are Euclidean, computing the operator norm of a matrix is NP-hard. We specify rather general families of norms on the argument and the image space ("ellitopic" and "co-ellitopic," respectively) allowing for reasonably tight computationally efficient upper-bounding of the associated operator norms. We extend these results to bounding the "robust operator norm of an uncertain matrix with box uncertainty," that is, the maximum of operator norms of matrices representable as a linear combination, with coefficients of magnitude $\leq1$, of a collection of given matrices. Finally, we consider some applications of norm bounding, in particular, (1) computationally efficient synthesis of affine non-anticipative finite-horizon control of discrete time linear dynamical systems under bounds on the peak-to-peak gains, (2) signal recovery with uncertainties in the sensing matrix, and (3) identification of parameters of time invariant discrete time linear dynamical systems via noisy observations of states and inputs on a given time horizon, in the case of "uncertain-but-bounded" noise varying in a box.


Introduction
Applications motivating our interest in these problems will be discussed later; we start by outlining the research status of these problems as "academic entities" and our related results.
Aside from a few special cases, e.g., the case of the spectral norm ($\mathcal X$ and $\mathcal B$ are unit Euclidean balls in the respective spaces), Problem A, that of computing the operator norm $\|A\|_{\mathcal B,\mathcal X} = \max_{x\in\mathcal X}\|Ax\|_{\mathcal B}$ induced by norms with unit balls $\mathcal X$ and $\mathcal B$, is NP-hard; this is so, e.g., when $\|\cdot\|_{\mathcal X} = \|\cdot\|_p$, $\|\cdot\|_{\mathcal B} = \|\cdot\|_r$, and $p \geq 2 \geq r \geq 1$ with $p \neq r$ [41]. Problem B, that of computing the robust norm (12) of an uncertain matrix with box uncertainty, is NP-hard already when $\mathcal B$, $\mathcal X$ are unit Euclidean balls, $A^{\mathrm{nom}} = 0$, and the $A_s$ are restricted to be symmetric matrices of rank 2 [7]. Hardness of Problems A and B makes it natural to look for efficiently computable reasonably tight upper bounds on the norms in question. Below we build these bounds for the case where $\mathcal X$ and the polar $\mathcal B_*$ of $\mathcal B$ are ellitopes (see Section 2.1 for the corresponding definitions).
An example of an ellitope in $\mathbf{R}^k$ sufficient for our current purposes is a bounded set $\mathcal Z$ cut out of $\mathbf{R}^k$ by a convex constraint on the vector $[z^TP_1z;\ldots;z^TP_Jz]$ of values of convex homogeneous quadratic forms of $z$: where $P_j \succeq 0$, $\sum_j P_j \succ 0$, and $\mathcal T$ is a convex compact subset of $\mathbf{R}^J_+$ with a nonempty interior which is monotone, i.e., $0 \leq t \leq t' \in \mathcal T$ implies that $t \in \mathcal T$. A simple example of an ellitope is the intersection of finitely many ellipsoids/elliptic cylinders centered at the origin.
We demonstrate that in the ellitopic case one can build efficiently computable upper bounds $\Phi(A)$ on $\|A\|_{\mathcal B,\mathcal X}$ and $\Psi(A_1,\ldots,A_N)$ on the robust norm of an uncertain matrix with box uncertainty which are convex in $A$, resp., in $(A_1,\ldots,A_N)$, such that where $K$ and $L$ are the ellitopic sizes (numbers of quadratic forms in the description) of $\mathcal X$ and $\mathcal B_*$, $\kappa$ is the maximum of the ranks of the $A_i$, and $\vartheta(\cdot)$ is a certain universal function of $\kappa$.
Our principal motivation for Problem A comes from control and is the necessity to handle peak-to-peak design specifications in the synthesis of linear controllers. Specifically, given a linear dynamical system with states $x_t$, controls $u_t$, observed outputs $y_t$, and external disturbances $d_t$, we want to build an affine non-anticipating controller $u_t = g_t + \sum_{\tau=0}^{t} G_{t\tau} y_\tau$ in such a way that the trajectory $w^N = \{x_t, 1 \leq t \leq N;\ y_t, u_t, 0 \leq t < N\}$ of the closed loop system on a given time horizon satisfies a given set of design specifications. With a smart nonlinear reparameterization of affine non-anticipating controllers (passing from affine output-based control to control which is affine in purified outputs, see [22] and references therein), the system trajectory becomes an affine function of the initial state $z$ and the sequence $d^N = [d_0;\ldots;d_{N-1}]$ of external disturbances, with the matrices and constant terms in these functions being affine in the vector $\chi$ of controller's parameters varying in a certain $\mathbf{R}^\nu$. Bi-affinity of $w^N$ in $(d^N, z)$ and in $\chi$ is the key to computationally efficient processing of design specifications of appropriate structure. In this paper, we address an important (and considered as difficult in control) specification, namely, the peak-to-peak gain defined as follows. Let us fix a norm $\|\cdot\|_{(d)}$ on the space where the disturbances $d_t$ live, and a norm $\|\cdot\|_{(x)}$ on the space where the states $x_t$ live. We equip the space $\mathcal D^N$ of disturbance sequences $d^N = [d_0;\ldots;d_{N-1}]$ with the norm $\|d^N\|_{d,\infty} = \max_t \|d_t\|_{(d)}$, and the space $\mathcal X^N$ of state trajectories $x^N = [x_1;\ldots;x_N]$ with the norm $\|x^N\|_{x,\infty} = \max_t \|x_t\|_{(x)}$. With a controller $\chi$ affine in purified outputs, $x^N$ is an affine function of $d^N$ and $z$; let $X[\chi]$ be the matrix of coefficients at $d^N$ in this affine dependence.
The peak-to-peak disturbance-to-state gain stemming from $\|\cdot\|_{(d)}$ and $\|\cdot\|_{(x)}$ is, by definition, the norm of $X[\chi]$ induced by the norms $\|d^N\|_{d,\infty}$ and $\|x^N\|_{x,\infty}$, and the corresponding design specification is just an upper bound on this gain. Since $X[\chi]$, as was already mentioned, is affine in $\chi$, this specification is a convex constraint on $\chi$. However, this constraint can be difficult to handle because the operator norm in question is typically difficult to compute (this is so already when $\|\cdot\|_{(d)}$ and $\|\cdot\|_{(x)}$ are $\|\cdot\|_2$-norms). In such a case, we can utilize our results on Problem A to safely approximate the design specification in question by replacing the difficult-to-compute induced norm of $X = X[\chi]$ with its efficiently computable, convex in $X$, and reasonably tight upper bound, as explained in detail in Section 3.3.3.
Our main motivating application for Problem B is identification of the parameters $A$ of a discrete time linear time invariant dynamical system from noise-corrupted observations of states $x_0,\ldots,x_N$ and inputs $r_0,\ldots,r_{N-1}$ on a given time horizon. We focus on the case of uncertain-but-bounded noise, in which the deviations of entries in observations from the actual values of the corresponding entries in $x_t$ and $r_t$ are bounded in magnitude. We discuss an approach (to the best of our knowledge, new), heavily utilizing our results on Problem B, for computationally efficient identification of $A$ and for generating on-line upper bounds on recovery errors.
Note that there is some literature on the first, and a huge literature on the second, of the just outlined applications. Instead of positioning our results with respect to this literature in the introduction, we find it more productive to postpone this positioning till the appropriate parts of the main body of the paper.
The structure of the paper is as follows. Section 2 presents background on ellitopes. Section 3 is devoted to Problem A, and Section 4 to Problem B. Technical proofs are relegated to the appendix, where we also present additional results on system identification and describe how our results can be extended from ellitopes to an essentially wider family of sets, spectratopes.

Preliminaries: ellitopes and spectratopes
Ellitopes and their extensions, spectratopes, introduced in [21], are convex compact sets well-suited for tight upper-bounding of maxima of quadratic forms over these sets. To make the paper more readable, in its main body we focus on ellitopes; (always straightforward) extensions to spectratopes are relegated to the Appendix.

Ellitopes: definition and basic examples
A basic ellitope is a set $\mathcal W$ represented as where $T_k \succeq 0$, $k \leq K$, $\sum_k T_k \succ 0$, and $\mathcal T$ is a convex computationally tractable compact monotone subset of $\mathbf{R}^K_+$ with $\operatorname{int} \mathcal T \neq \emptyset$, monotonicity meaning that when $0 \leq t \leq t'$ and $t' \in \mathcal T$, we have $t \in \mathcal T$ as well.
An ellitope $\mathcal X$ is a linear image of a basic ellitope: $\mathcal X = P\mathcal W = \{x \in \mathbf{R}^n : \exists w \in \mathcal W : x = Pw\}$ with $\mathcal W$ given by (2) (3). We call $K$ the ellitopic size of ellitopes (2) and (3). Clearly, every ellitope is a convex compact set symmetric w.r.t. the origin; a basic ellitope, in addition, has a nonempty interior.

Example 1.
A. A bounded intersection $\mathcal X$ of $K$ ellipsoids/elliptic cylinders $\{x \in \mathbf{R}^n : x^TT_kx \leq 1\}$ $[T_k \succeq 0]$ centered at the origin is a basic ellitope: In particular, the unit box $\{x \in \mathbf{R}^n : \|x\|_\infty \leq 1\}$ is a basic ellitope.
B. A $\|\cdot\|_p$-ball in $\mathbf{R}^n$ with $p \in [2,\infty]$ is a basic ellitope:
Ellitopes admit a fully algorithmic "calculus": this family is closed with respect to basic operations preserving convexity and symmetry w.r.t. the origin, e.g., taking finite intersections, linear images, inverse images under linear embeddings, direct products, and arithmetic summation (for details, see [21, Section 4.6]); what is missing is taking convex hulls of finite unions.
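To illustrate Example 1.B numerically: for $p \geq 2$, the unit $\ell_p$ ball admits the ellitopic description with $T_k = e_ke_k^T$ and $\mathcal T = \{t \geq 0 : \sum_k t_k^{p/2} \leq 1\}$, and membership via this description (with the best choice $t_k = x_k^2$) agrees with the direct $\ell_p$ check. A minimal sketch (plain NumPy, illustration only, not the authors' code):

```python
import numpy as np

# For p >= 2, the unit l_p ball is the basic ellitope
#   {x : exists t in T with x_k^2 <= t_k},  T = {t >= 0 : sum_k t_k^(p/2) <= 1},
# with T_k = e_k e_k^T. Via the ellitopic description, the best choice of the
# "certificate" t is t_k = x_k^2, so x is in the ball iff sum_k (x_k^2)^(p/2) <= 1.

def in_lp_ball(x, p):
    """Direct check: ||x||_p <= 1."""
    return np.sum(np.abs(x) ** p) <= 1.0

def in_ellitope(x, p):
    """Check via the ellitopic description with t_k = x_k^2."""
    t = x ** 2                          # values of the quadratic forms x^T T_k x
    return np.sum(t ** (p / 2)) <= 1.0  # the constraint defining the set T

rng = np.random.default_rng(0)
p = 4
for _ in range(1000):
    x = rng.uniform(-1.2, 1.2, size=5)
    assert in_lp_ball(x, p) == in_ellitope(x, p)
print("l_p ball membership agrees with its ellitopic description")
```

The same pattern (squared coordinates as certificates) underlies the general ellitopic descriptions used below.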

Bounding the maximum of a quadratic form over an ellitope
The starting point of what follows is the problem of maximizing a homogeneous quadratic form over a convex compact set $\mathcal X \subset \mathbf{R}^n$. It is well known that basically the only generic case when the problem is easy is the one where $\mathcal X$ is an ellipsoid. It is shown in [21] that when $\mathcal X$ is an ellitope, (4) admits a reasonably tight efficiently computable upper bound. Specifically, when $\mathcal X$ is given by (3), $\lambda \in \mathbf{R}^K_+$ is such that $P^TCP \preceq \sum_k \lambda_k T_k$, and $x \in \mathcal X$, one has for some $t \in \mathcal T$ implying the validity of the implication and thus the first claim of the following

Theorem 2 ([21, Proposition 4.6]). Given ellitope (3) and a matrix $C \in \mathbf{S}^n$, consider the quadratic maximization problem (4) along with its relaxation The problem is computationally tractable and solvable, and $\operatorname{Opt}(C)$ is an efficiently computable upper bound on $\operatorname{Opt}_*(C)$. This upper bound is reasonably tight:

To the best of our knowledge, the first result of this type was established in [33] for $\mathcal X$ which is an intersection of $K$ concentric ellipsoids/elliptic cylinders; in this case, (4) becomes a special case of a quadratically constrained quadratic optimization problem, and (5) is the standard Shor semidefinite relaxation (see, e.g., [6, Section 4.3]) of this problem. In [33] it is shown that the ratio $\operatorname{Opt}(C)/\operatorname{Opt}_*(C)$ indeed can be as large as $O(\ln(K))$, even when all $T_k = a_ka_k^T$ are of rank 1 and $\mathcal X$ is the polytope $\{x : |a_k^Tx| \leq 1, k \leq K\}$.
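In the degenerate single-ellipsoid case ($K = 1$) the maximum of the quadratic form is available in closed form, which gives a quick sanity check on what any valid relaxation must return here. A numerical sketch (plain NumPy with hypothetical data, not the paper's code):

```python
import numpy as np

# For K = 1 the problem max{x^T C x : x^T T x <= 1} (T > 0) is easy:
# substituting x = T^{-1/2} y reduces it to max{y^T M y : ||y||_2 <= 1}
# with M = T^{-1/2} C T^{-1/2}, whose value is max(lambda_max(M), 0).
rng = np.random.default_rng(1)
n = 6
G = rng.standard_normal((n, n))
T = G @ G.T + n * np.eye(n)        # T > 0 defines the ellipsoid x^T T x <= 1
C = rng.standard_normal((n, n))
C = (C + C.T) / 2                  # symmetric quadratic form

# Closed-form optimal value
w, V = np.linalg.eigh(T)
T_inv_half = V @ np.diag(w ** -0.5) @ V.T
M = T_inv_half @ C @ T_inv_half
mu, U = np.linalg.eigh(M)
opt = max(mu[-1], 0.0)

# The point x* = T^{-1/2} u, u the top eigenvector of M, attains mu[-1]
x_star = T_inv_half @ U[:, -1]
assert np.isclose(x_star @ T @ x_star, 1.0)
assert np.isclose(x_star @ C @ x_star, mu[-1])

# Monte-Carlo check: no point of the ellipsoid beats the closed-form value
X = rng.standard_normal((4000, n))
X /= np.sqrt(np.einsum('ij,jk,ik->i', X, T, X))[:, None]  # scale to boundary
assert np.max(np.einsum('ij,jk,ik->i', X, C, X)) <= opt + 1e-9
print("single-ellipsoid case: closed-form maximum verified")
```

For $K > 1$ no such closed form exists in general, which is exactly where relaxation (5) comes in.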

Bounding operator norms
As stated in the Introduction, one of the subjects of this paper is tight efficiently computable upper-bounding of the operator norm of a linear mapping $x \mapsto Ax : \mathbf{R}^n \to \mathbf{R}^m$ induced by norms $\|\cdot\|_{\mathcal X}$ and $\|\cdot\|_{\mathcal B}$ on the argument and the destination spaces, with $\|\cdot\|_{\mathcal U}$ standing for the norm with unit ball $\mathcal U$. Our approach works for the case when $\mathcal X$ and the polar $\mathcal B_*$ of $\mathcal B$ are ellitopes with nonempty interiors: with $T_k$, $\mathcal T$, $R_\ell$, $\mathcal R$ as required in the definition of a basic ellitope. Under the assumptions just introduced, $\|A\|_{\mathcal B,\mathcal X}$ is the maximum of a quadratic form over the basic ellitope $\mathcal Z \times \mathcal W$: In this case relaxation (5) provides an efficiently computable upper bound on $\|A\|_{\mathcal B,\mathcal X}$. An immediate computation taking into account the direct product structure of the ellitope $\mathcal Z \times \mathcal W$ and the bilinearity of the quadratic form we are maximizing over this ellitope shows that this bound is Note that $\operatorname{Opt}(A)$ clearly is a convex function of $A$, and Theorem 2 implies that Our main goal is to demonstrate that the latter bound can be refined.

Theorem 3.
In the case of (6) one has

Remark 4. Results of [34,35] imply that in some cases the tightness factor $\varsigma$ in (8) can be improved to an absolute constant. Specifically,
1. In the case of (6) with diagonal matrices $T_k$ and $R_\ell$, it follows from [35, Theorem 13.2.1] that one can take $\varsigma = \frac{\pi}{4-\pi} \approx 3.660$.
2. When $\|\cdot\|_{\mathcal X} = \|\cdot\|_p$, $\|\cdot\|_{\mathcal B} = \|\cdot\|_r$ with $\infty \geq p \geq 2$, $1 \leq r \leq 2$ (this is a special case of 1)), Nesterov [34,35] proved that the upper bound on $\|A\|_{p\to r} := \max_{\|x\|_p \leq 1} \|Ax\|_r$ (this bound coincides with $\operatorname{Opt}(A)$ when $\mathcal X$ is the ellitope $\{x : \|x\|_p \leq 1\}$ and $\mathcal B_*$ is the ellitope $\{v : \|v\|_{\frac{r}{r-1}} \leq 1\}$) is tight within the (even better than in 1)) factor $\frac{\pi}{2\sqrt{3}-2\pi/3} \approx 2.2936$ in the entire range $p \in [2,\infty]$, $r \in [1,2]$, and within the factor $\sqrt{\pi/2} \approx 1.2533$ when $p = 2$ and $r \in [1,2]$.
(Using the identity $\|A\|_{\mathcal B,\mathcal X} = \|A^T\|_{\mathcal X_*,\mathcal B_*}$, where $\mathcal X_*$ is the polar of $\mathcal X$ (as is immediately seen, this identity is respected by our bounding scheme), we see that $\operatorname{Opt}(A)$ is within the factor $\sqrt{\pi/2}$ from $\|A\|_{p\to r}$ when $p \geq 2$ and $r = 2$.)
Needless to say, when $p = r = 2$ the tightness factor is 1. In addition, it is shown in [41] that in the range $\infty \geq p \geq 2$, $1 \leq r \leq 2$, bound (9) is exactly equal to the corresponding norm of $A$ for entrywise nonnegative matrices.
Note that there is a simple case when $\operatorname{Opt}(A) = \|A\|_{\mathcal B,\mathcal X}$: the one where $A$ is a row vector, $\mathcal B = [-1,1] \subset \mathbf{R}$, and, therefore, Our bounding is intelligent enough to recognize this situation. Indeed, in the case in question, (7) reads To put this immediate observation into a proper perspective, see Section 3.2.
The results just outlined are stronger than what Theorem 3 states in the case in question. This being said, it can be proved that in the full scope of the latter theorem the logarithmic growth of the tightness factor with $K$, $L$ is unavoidable.
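The row-vector case is easy to check numerically: with $\mathcal B = [-1,1]$, the induced norm of a row vector $a^T$ is $\max_{x\in\mathcal X}|a^Tx|$, i.e., the norm conjugate to $\|\cdot\|_{\mathcal X}$, which is available in closed form for standard choices of $\mathcal X$. A small sketch (plain NumPy with hypothetical data, illustration only):

```python
import numpy as np

# With B = [-1, 1], the operator norm of the 1 x n matrix A = a^T is
#   max_{x in X} |a^T x| = the norm of a conjugate to ||.||_X.
rng = np.random.default_rng(2)
a = rng.standard_normal(7)

# X = unit l_2 ball: the maximum is ||a||_2, attained at x = a / ||a||_2
x_star = a / np.linalg.norm(a)
assert np.isclose(abs(a @ x_star), np.linalg.norm(a))

# X = unit box (a basic ellitope): the maximum is ||a||_1, at x = sign(a)
x_box = np.sign(a)
assert np.isclose(abs(a @ x_box), np.linalg.norm(a, 1))

# Monte-Carlo confirmation that no point of the box does better
X = rng.uniform(-1, 1, size=(5000, 7))
assert np.max(np.abs(X @ a)) <= np.linalg.norm(a, 1) + 1e-12
print("row-vector case: induced norm equals the conjugate norm of a")
```

In both cases the bound of (7) collapses to the exact value, in line with the observation above.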

On the scope of Theorem 3
The scope of Theorem 3, i.e., the set of matrix norms to which the theorem applies, is restricted to the case when the norm on the argument space is a simple ellitopic norm, meaning that its unit ball is an ellitope, and the norm on the image space is a simple co-ellitopic norm, meaning that the polar of its unit ball is an ellitope. Clearly, simple co-ellitopic norms (s.co-e.n.'s) are exactly the conjugates of simple ellitopic norms (s.e.n.'s). These classes of norms allow for a certain "calculus" stating that some standard operations with norms preserve their ellitopic/co-ellitopic type.
Basic calculus of simple ellitopic norms is as follows.
E.1. (raw materials) When $p \in [2,\infty]$, $\|\cdot\|_p$ is a s.e.n. on $\mathbf{R}^n$.
E.2. (taking finite maxima) When $\|\cdot\|_{(k)}$, $k \leq K$, are s.e.n.'s on $\mathbf{R}^n$, so is their maximum.
E.3. (restriction to a linear subspace) When $\|\cdot\|$ is a s.e.n. on $\mathbf{R}^n$ and $y \mapsto Ay : \mathbf{R}^m \to \mathbf{R}^n$ is a linear embedding, $\|y\|' := \|Ay\|$ is a s.e.n. on $\mathbf{R}^m$.
E.4. (passing to a factor-norm) When $\|\cdot\|$ is a s.e.n. on $\mathbf{R}^n$ and $x \mapsto Ax : \mathbf{R}^n \to \mathbf{R}^m$ is an onto mapping, the factor-norm $\|y\|' = \min_x\{\|x\| : Ax = y\}$ is a s.e.n. on $\mathbf{R}^m$.
E.5. ("aggregation") Let $\|\cdot\|_{(k)}$ be s.e.n.'s on $\mathbf{R}^{n_k}$, $k \leq K$, and let $\mathcal A$ be a monotone convex compact set with a nonempty interior in $\mathbf{R}^{K}_+$. Then the norm on $\mathbf{R}^{n_1} \times \cdots \times \mathbf{R}^{n_K}$ with the unit ball
The basic calculus of simple co-ellitopic norms reads:
cE.1. [raw materials] When $r \in [1,2]$, $\|\cdot\|_r$ is a s.co-e.n. on $\mathbf{R}^n$ (cf. E.1).
cE.2. [taking sums] When $\|\cdot\|_{(k)}$, $k \leq K$, are s.co-e.n.'s on $\mathbf{R}^n$, so is their sum.
Indeed, the unit ball $\mathcal B$ of the sum of norms with polars $\mathcal B_*^k$ of the unit balls is such that the polar $\mathcal B_*$ of $\mathcal B$ is the image of $\mathcal B_*^1 \times \cdots \times \mathcal B_*^K$ under a linear mapping. When all $\mathcal B_*^k$ are ellitopes, so is their direct product, and therefore so is its linear image $\mathcal B_*$. Thus, the polar of $\mathcal B$ is an ellitope, as claimed.
cE.3. [restriction to a linear subspace] When $\|\cdot\|$ is a s.co-e.n. on $\mathbf{R}^n$ and $y \mapsto Ay : \mathbf{R}^m \to \mathbf{R}^n$ is a linear embedding, $\|y\|' := \|Ay\|$ is a s.co-e.n. on $\mathbf{R}^m$.
Indeed, assuming that the polar $\mathcal B_*$ of the unit ball of $\|\cdot\|$ is an ellitope, we have $\|y\|' = \max_{z \in \mathcal B_*} z^TAy$. That is, the polar of the unit ball of $\|\cdot\|'$ is the linear image $A^T\mathcal B_*$ of $\mathcal B_*$, which is an ellitope along with $\mathcal B_*$.
cE.4. [passing to a factor-norm] When $\|\cdot\|$ is a s.co-e.n. on $\mathbf{R}^n$ and $x \mapsto Ax : \mathbf{R}^n \to \mathbf{R}^m$ is an onto mapping, the factor-norm $\|y\|' = \min_x\{\|x\| : Ax = y\}$ is a s.co-e.n. on $\mathbf{R}^m$.
Indeed, when the polar $\mathcal B_*$ of the unit ball of $\|\cdot\|$ is an ellitope, then, denoting by $A^\dagger$ the pseudoinverse of the onto mapping $A$, one has Thus, the polar of the unit ball of $\|\cdot\|'$ is a linear image of the intersection of the ellitope $\mathcal B_*$ with a linear subspace, and as such is an ellitope.
cE.5. ["aggregation"] Let $\|\cdot\|_{(k)}$ be s.co-e.n.'s on $\mathbf{R}^{n_k}$, $k \leq K$, and let $\mathcal A$ be a monotone convex compact set with a nonempty interior in $\mathbf{R}^K_+$. Then the norm on $\mathbf{R}^{n_1} \times \cdots \times \mathbf{R}^{n_K}$ given by is a s.co-e.n. For instance, when $r_k \in [1,2]$ and $r \in [1,2]$, the norm on $\mathbf{R}^{n_1} \times \cdots \times \mathbf{R}^{n_K}$ given by $\|[x^1;\ldots;x^K]\| = \|[\|x^1\|_{r_1};\ldots;\|x^K\|_{r_K}]\|_r$ is a s.co-e.n.
Indeed, let $\|\cdot\|^*_{(k)}$ be the s.e.n.'s conjugate to $\|\cdot\|_{(k)}$. Setting Hence, as is immediately seen, the polar $\mathcal B_*$ of $\mathcal B$ is that is, $\|\cdot\|^*$ is a s.e.n. by E.5.
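The conjugacy between the s.e.n. $\|\cdot\|_p$, $p \geq 2$, and the s.co-e.n. $\|\cdot\|_{p/(p-1)}$ underlying cE.1 is easy to verify numerically via the Hölder equality case. A small sketch (plain NumPy with hypothetical data, not tied to the paper's code):

```python
import numpy as np

# For p >= 2 and q = p/(p-1), the conjugate of ||.||_p is ||.||_q:
#   max{ y^T x : ||x||_p <= 1 } = ||y||_q,
# with the maximizer x_k = sign(y_k) |y_k|^(q-1) / ||y||_q^(q-1).
rng = np.random.default_rng(3)
p = 3.0
q = p / (p - 1)
y = rng.standard_normal(8)

x_star = np.sign(y) * np.abs(y) ** (q - 1) / np.linalg.norm(y, q) ** (q - 1)
assert np.isclose(np.linalg.norm(x_star, p), 1.0)          # feasible
assert np.isclose(y @ x_star, np.linalg.norm(y, q))        # attains ||y||_q

# Monte-Carlo: random feasible points never exceed ||y||_q
X = rng.standard_normal((4000, 8))
X /= np.linalg.norm(X, p, axis=1)[:, None]                 # scale to ||x||_p = 1
assert np.max(X @ y) <= np.linalg.norm(y, q) + 1e-12
print("the conjugate of ||.||_p is ||.||_q, as used in cE.1")
```

Thus every unit $\ell_r$ ball with $r \in [1,2]$ is indeed the polar of an ellitope (the unit $\ell_{r/(r-1)}$ ball).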

An extension
The above results can be straightforwardly extended from the case when $\mathcal B_*$ and $\mathcal X$ are ellitopes to a more general case. Specifically, assume that
A. $\mathcal X \subset \mathbf{R}^n$ is a set with a nonempty interior represented as the convex hull of a finite union of ellitopes, or, which is the same, where $\mathcal X_i \subset \mathbf{R}^{n_i}$ are basic ellitopes and $\|\cdot\|_{\mathcal X_i}$ are s.e.n.'s on $\mathbf{R}^{n_i}$ with unit balls $\mathcal X_i$.
Under Assumption A, $\mathcal X$ is a convex compact subset of $\mathbf{R}^n$ symmetric w.r.t. the origin with $0 \in \operatorname{int}\mathcal X$; as such, $\mathcal X$ is the unit ball of a norm $\|\cdot\|_{\mathcal X}$. In the sequel we refer to the norms of this structure as ellitopic norms. Clearly, every simple ellitopic norm is ellitopic; e.g., the block $\ell_\infty$ norm on the space $\mathbf{R}^{n_1} \times \cdots \times \mathbf{R}^{n_I}$ is a s.e.n. (by E.1 and E.5). In fact, the family of ellitopic norms is much wider than the family of s.e.n.'s. For example:
E.1. When $\|x^i\|_{(i)}$ are ellitopic norms on $\mathbf{R}^{n_i}$, $i \leq I$, the associated block $\ell_1$ norm (11) on $\mathbf{R}^{n_1} \times \cdots \times \mathbf{R}^{n_I}$ is ellitopic.
Indeed, the unit ball $X_i$ of $\|\cdot\|_{(i)}$ is a convex subset of $\mathbf{R}^{n_i}$ of the form $\operatorname{Conv}\bigcup_{\nu=1}^{I_i} P_{i\nu}\mathcal X_{i\nu}$ with basic ellitopes $\mathcal X_{i\nu}$. Specifying the linear mappings $\bar P_i$ from $\mathbf{R}^{n_i}$ to $\mathbf{R}^{n_1} \times \cdots \times \mathbf{R}^{n_I}$ as the natural embeddings, the unit ball $X$ of norm (11) clearly is $\operatorname{Conv}\bigcup_{i \leq I, \nu_i \leq I_i} \bar P_iP_{i\nu_i}\mathcal X_{i\nu_i}$. Because, in addition, this set has a nonempty interior, (11) is an ellitopic norm.
Note that the property of being ellitopic is inherited when passing to factor-norms (cf. E.4):
E.2. When $\|\cdot\|$ is an ellitopic norm on $\mathbf{R}^n$ and $y \mapsto Ay : \mathbf{R}^n \to \mathbf{R}^m$ is an onto mapping, the factor-norm $\|x\|' = \min_y\{\|y\| : Ay = x\}$ on $\mathbf{R}^m$ induced by $\|\cdot\|$ and $A$ is ellitopic.
Indeed, if the unit ball $X$ of $\|\cdot\|$ is given by (10), then the unit ball $X'$ of $\|\cdot\|'$ is the convex compact set with a nonempty interior given by

E.3.
Let $\|\cdot\|_{(\chi)}$ be an ellitopic norm on $\mathbf{R}^{n_\chi}$, $\chi = 1,2$. Then the norm $\|[x^1;x^2]\| = \max\{\|x^1\|_{(1)}, \|x^2\|_{(2)}\}$ on $\mathbf{R}^{n_1} \times \mathbf{R}^{n_2}$ is ellitopic, since the sets $\mathcal X_{i,1} \times \mathcal X_{i,2}$ are basic ellitopes along with $\mathcal X_{i,1}$, $\mathcal X_{i,2}$. By E.3, if $\|\cdot\|_{(i)}$ are ellitopic norms on $\mathbf{R}^{n_i}$, $i \leq I$, then the norm $\|[x^1;\ldots;x^I]\| = \max_{i \leq I} \|x^i\|_{(i)}$ on $\mathbf{R}^{n_1+\cdots+n_I}$ is ellitopic as well. Note, however, that the number of ellitopes involved in the description of this norm is the product, over $i \leq I$, of the numbers of ellitopes in the descriptions of the norms $\|\cdot\|_{(i)}$, and thus may explode exponentially fast as $I$ grows.
Assume, next, that
B. $\mathcal B \subset \mathbf{R}^m$ is a set with a nonempty interior which is the polar of a set of the structure described in A: where $\mathcal Z_j \subset \mathbf{R}^{m_j}$ are basic ellitopes and $Q_j$, $\mathcal Z_j$ are such that $\mathcal B_*$ has a nonempty interior.
Under Assumption B, $\mathcal B$ is a convex compact subset of $\mathbf{R}^m$ symmetric w.r.t. the origin with $0 \in \operatorname{int}\mathcal B$; as such, $\mathcal B$ is the unit ball of a norm $\|\cdot\|_{\mathcal B}$. In the sequel, we refer to norms of this structure as co-ellitopic. Clearly, the conjugate of an ellitopic norm is co-ellitopic, and vice versa.
Note that in the case of (12) we have where $\mathcal Z^*_j$ is the polar of $\mathcal Z_j$. Of course, every simple co-ellitopic norm is co-ellitopic. In fact, the family of co-ellitopic norms is much wider than the family of simple co-ellitopic norms due to the following observations:
cE.1. The maximum of finitely many co-ellitopic norms is co-ellitopic.
Indeed, if $\|\cdot\|_{(k)}$, $k \leq K$, are co-ellitopic norms on $\mathbf{R}^n$, their conjugates $\|\cdot\|^*_{(k)}$ are ellitopic, implying by E.1 that the norm $\|[y^1;\ldots;y^K]\| = \sum_k \|y^k\|^*_{(k)}$ on $\mathbf{R}^{Kn}$ is ellitopic, which by E.2 implies that the factor-norm is ellitopic. The unit ball of the latter norm is the convex compact set and the polar of this set is Thus, the norm $\max_k \|x\|_{(k)}$ is conjugate to the ellitopic norm $\|\cdot\|^*$ and as such is co-ellitopic.
A closely related statement is that the norm (14) built of co-ellitopic norms $\|\cdot\|_{(k)}$ on $\mathbf{R}^{n_k}$ is co-ellitopic. Indeed, as we have seen when justifying cE.1, if $\|\cdot\|^*_{(k)}$ are the ellitopic norms conjugate to $\|\cdot\|_{(k)}$, the norm is ellitopic; clearly, norm (14) is conjugate to this ellitopic norm.
The second observation is as follows.
cE.3. The restriction of a co-ellitopic norm onto a linear subspace is co-ellitopic.
Indeed, we should verify that if $x \mapsto Ax$ is an embedding of $\mathbf{R}^m$ into $\mathbf{R}^n$ and $\|\cdot\|$ is a co-ellitopic norm on $\mathbf{R}^n$, then the norm $\|x\|' = \|Ax\|$ is co-ellitopic. This is immediate: by the standard properties of norms, under the circumstances, the norm conjugate to $\|x\|'$ is the factor-norm $\min_y\{\|y\|^* : A^Ty = x\}$ induced by the norm $\|\cdot\|^*$ conjugate to $\|\cdot\|$ on $\mathbf{R}^n$. This conjugate is an ellitopic norm on $\mathbf{R}^m$, and it remains to use E.2.
cE.4. The sum of two co-ellitopic norms on $\mathbf{R}^n$ is co-ellitopic.

Simple observation
Let $\|\cdot\|_{\mathcal X}$ and $\|\cdot\|_{\mathcal B}$ be norms with $\mathcal X$ given by (10) and $\mathcal B$ given by (12). Then the operator norm of $A \in \mathbf{R}^{m\times n}$ induced by the norms $\|\cdot\|_{\mathcal X}$ and $\|\cdot\|_{\mathcal B}$ on the argument and image spaces can be computed as follows, see (13): Note that by the same token the same holds for $\|\cdot\|_{\mathcal X_i}$, so that in the case of (10), (12) it holds As we know from Theorem 3, we can upper-bound $\|Q_j^TAP_i\|_{ij}$ by $\Phi_{ij}(Q_j^TAP_i)$ with a convex and efficiently computable function $\Phi_{ij}(\cdot)$, the bound being tight within the factor $\varsigma(K_i, L_j) \leq 3\sqrt{\ln(4K_i)\ln(4L_j)}$, where $K_i$ and $L_j$ are the ellitopic sizes of $\mathcal X_i$ and $\mathcal Z_j$. As a result, the efficiently computable convex function is an upper bound on $\|A\|_{\mathcal B,\mathcal X}$ tight within the factor $3\sqrt{\ln(4\max_i K_i)\ln(4\max_j L_j)}$. In some simple situations the above tightness factor can be improved. For example, when by Nesterov's results (cf. Remark 4), the tightness factor is an absolute constant (e.g., 1 in the trivial case where $p_i = q_j = 2$ for all $i,j$).
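As a toy instance of this block decomposition: the unit $\ell_1$ ball is the convex hull of the union of the segments $[-e_i, e_i]$ (one-dimensional ellitopes), the corresponding blocks $Q_j^TAP_i$ degenerate into the entries of $A$, and the maximum over blocks recovers the well-known identity $\|A\|_{1\to\infty} = \max_{i,j}|A_{ij}|$. A quick NumPy check (illustration only, not the paper's code):

```python
import numpy as np

# ||A||_{1 -> infty} = max_{||x||_1 <= 1} ||Ax||_infty = max_ij |A_ij|:
# the l_1 ball is Conv{ +-e_i }, so the maximum over it of the convex
# function ||Ax||_infty is attained at some vertex +-e_i, where it equals
# the largest absolute value in column i.
rng = np.random.default_rng(4)
A = rng.standard_normal((5, 7))

# maximum over the vertices +-e_i (the "blocks" of the decomposition)
vertex_max = max(np.linalg.norm(s * A[:, i], np.inf)
                 for i in range(A.shape[1]) for s in (1.0, -1.0))
assert np.isclose(vertex_max, np.max(np.abs(A)))

# Monte-Carlo: random points of the l_1 ball never do better
X = rng.standard_normal((4000, 7))
X /= np.abs(X).sum(axis=1)[:, None]           # scale to ||x||_1 = 1
assert np.max(np.abs(A @ X.T)) <= vertex_max + 1e-12
print("||A||_{1->infty} = max |A_ij|, via the block decomposition")
```

In nontrivial cases the per-block norms $\|Q_j^TAP_i\|_{ij}$ are no longer available in closed form, and the bounds $\Phi_{ij}$ take over.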

Least norm projector synthesis
Consider the following projection problem: we are given a linear subspace $F$ of the linear space $E = \mathbf{R}^n$ and a norm $\theta(\cdot)$ on $E$; our goal is to find a linear projector $H$ of $E$ onto $F$ (a linear map $x \mapsto Hx : E \to F$ with $Hx = x$ for all $x \in F$) which deviates the least from the identity mapping $\operatorname{Id}$ in the induced norm Consider the case when the norm in question is the block $\ell_\infty/\ell_2$ norm What makes the projection problem potentially difficult is the block $\ell_\infty$ structure of $\theta$; were $\nu_k = 1$ for all $k$, $\|\cdot\|_{\theta\to\theta}$ would have a polyhedral epigraph, and minimization of $\|\operatorname{Id}-H\|_{\theta\to\theta}$ would be a Linear Programming problem (note that the property of $H$ to project onto $F$ reduces to a system of linear equalities on $H$). In contrast, in the general $\ell_\infty/\ell_2$ case as described above, the problem is NP-hard. At the same time, the problem is within the scope of our machinery: the unit ball of $\theta$ is the ellitope and therefore $\theta$ is a simple ellitopic norm. At the same time, we have As we know, $\|\cdot\|_{\infty/2}$ is co-ellitopic (see cE.2 in Section 3.2), and this property is preserved under restriction of a norm to a linear subspace (cE.3); it remains to recall that $G$ is an embedding. The bottom line is that we can process the projection problem as explained in Section 3.2. It is immediately seen that the corresponding recipe, under the circumstances, boils down to the following: we select a linear basis $\{g_i : i \leq n\}$ in $E$ in such a way that the first $m = \dim F$ of these vectors form a basis of $F$; in the sequel, we identify vectors from $E$ with the collections of their coordinates in this basis, and linear mappings from $E$ to $E$ with their matrices in this basis. Note that the (matrices of) projectors of $E$ onto $F$ are exactly the block matrices $\begin{bmatrix} I_m & P \\ 0 & 0 \end{bmatrix}$ with $m \times (n-m)$ blocks $P$.
Applying Theorem 3, we arrive at an efficiently solvable convex optimization problem which is a safe tractable approximation of the problem of interest: the $P$-component of a feasible solution to the problem specifies a projector of $E$ onto $F$ with the value of $\|\cdot\|_{\theta\to\theta}$ not exceeding the value of the objective at this solution. This approximation is tight within the factor $O(1)\ln(4K)$, meaning that $\operatorname{Opt}$ is at most by this factor greater than the actual optimal value in the projection problem. In addition, when $\nu_k = 1$ for all $k$, the tightness factor is exactly 1.
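For intuition, the $\|\cdot\|_2$-orthogonal projector onto $F$ (the baseline the spline experiment below compares against) is available in closed form; it satisfies the linear constraints defining projectors but in general does not minimize $\|\operatorname{Id}-H\|_{\theta\to\theta}$. A baseline sketch (plain NumPy with hypothetical data; the block structure of $\theta$ is an assumption for illustration):

```python
import numpy as np

# Orthogonal projector onto F = span(columns of B): H = B (B^T B)^{-1} B^T.
# It projects onto F (Hx = x for x in F) but in general it does not
# minimize ||Id - H||_{theta->theta} for a block l_infty/l_2 norm theta.
rng = np.random.default_rng(5)
n, m = 10, 4
B = rng.standard_normal((n, m))               # basis of F
H = B @ np.linalg.solve(B.T @ B, B.T)

x_in_F = B @ rng.standard_normal(m)
assert np.allclose(H @ x_in_F, x_in_F)        # H fixes F pointwise
assert np.allclose(H @ H, H)                  # H is idempotent

# theta = max of block l_2 norms over a partition of {1,...,n} (K = 5 blocks)
blocks = np.array_split(np.arange(n), 5)
theta = lambda x: max(np.linalg.norm(x[b]) for b in blocks)

# Sampled lower bound on ||Id - H||_{theta->theta}; it is at least 1,
# since Id - H acts as the identity on the orthogonal complement of F
R = np.eye(n) - H
z = R @ rng.standard_normal(n)                # a point with Rz = z
lb = max(theta(R @ x) / theta(x)
         for x in np.vstack([rng.standard_normal((2000, n)), z[None, :]]))
assert lb >= 1.0 - 1e-9
print("sampled lower bound on ||Id - H||_{theta->theta} for the orthogonal H")
```

The convex program above searches over all projectors (the $P$-component), and in the experiment below it indeed beats the orthogonal baseline in the $\theta\to\theta$ norm.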

Illustration: projecting splines
Consider a partition of $[0,1]$ into $M$ "large" segments, which are further partitioned into a total of $N$ "small" segments. Let also $\Gamma$ be an equidistant grid on $[0,1]$ with $L$ points. Given nonnegative integers $\mu_L \geq \nu_L$, $\mu_S \geq \mu_L$, and $\nu_S \leq \nu_L$, we define $F$ as the linear space of restrictions onto $\Gamma$ of splines which are polynomials of degree $\leq \mu_L$ on every large segment, with all derivatives of order $\leq \nu_L$ continuous on the entire $[0,1]$. We define $E$ as the linear space of restrictions onto $\Gamma$ of splines which are polynomials of degree $\leq \mu_S$ on every small segment and have continuous on $[0,1]$ derivatives of order $\leq \nu_S$. With the above inequalities between the $\mu$'s and $\nu$'s, $F$ is a subspace of $E$. Now let $\Delta_1,\ldots,\Delta_K$ be a partitioning of $\Gamma$ into $K$ consecutive segments, and let $\theta$ be the $\ell_\infty/\ell_2$ norm on $E$ given by In Figure 1 we present a sample pair of a spline from $E$ and its projection onto $F$. In this experiment, $|\Gamma| = 128$, there are eight identical large and small segments (separated by red/blue vertical lines on the plots), and $K = 16$ (on the plots, the 16 segments $\Delta_k$ are separated from each other by green vertical lines). Splines from $E$ are continuous on $[0,1]$ and are polynomials of degree 3 on large/small segments, and $F$ is cut off $E$ by the additional requirement for the spline to be continuously differentiable on $[0,1]$. Solving (17) yields $H$ with $\|\operatorname{Id}-H\|_{\theta\to\theta} \leq \operatorname{Opt} \approx 1.255$, and $H$ is "essentially different" from the $\|\cdot\|_2$-orthogonal projection $\bar H$ of $E$ onto $F$: the spectral norm of $H - \bar H$ is $\approx 0.69$, and the upper bound on $\|\operatorname{Id}-\bar H\|_{\theta\to\theta}$, as given by our machinery, is $\approx 1.527$. In fact, both upper bounds, $\approx 1.255$ on $\|\operatorname{Id}-H\|_{\theta\to\theta}$ and $\approx 1.527$ on $\|\operatorname{Id}-\bar H\|_{\theta\to\theta}$, happen to coincide within four significant digits with the quantities themselves.

Synthesis of linear controller with peak-to-peak design specifications
The situation we are about to address is as follows. We control a discrete time linear system in which $x_t \in \mathbf{R}^{n_x}$, $u_t \in \mathbf{R}^{n_u}$, $d_t \in \mathbf{R}^{n_d}$, and $y_t \in \mathbf{R}^{n_y}$ are, respectively, states, controls, external disturbances, and observable outputs. When augmented with a non-anticipating affine controller $u_t = g_t + \sum_{\tau=0}^{t} G_{t\tau}y_\tau$, the closed loop system specifies affine mappings With "smart parameterizations" of the controller, passing from $\{g_t, G_{t\tau}, 0 \leq t < N, 0 \leq \tau \leq t\}$ to the parameters of the affine purified-output-based controller, the matrices $X^N_d,\ldots,Y^N$ become affine functions of the vector $\chi$ of controller's parameters; this vector runs through a certain finite-dimensional linear space $\mathcal C$ equipped with a filtration $\mathcal C_0 \subset \mathcal C_1 \subset \cdots \subset \mathcal C_{N-1} = \mathcal C$ by linear subspaces, with $\mathcal C_d$ comprised of "controllers with memory $d$". We refer the reader to [22] for the details of the controller construction.
When designing a controller, one of the natural design specifications (traditionally considered as not so easy to handle, cf., e.g., [1,4,5,10,15] and references therein) is a bound on "peak-to-peak" gains. The disturbance-to-state gain is nothing but the norm of the matrix $X^N_d$ induced by the norm $\max_t\|d_t\|_p$ on the space of sequences $d^N$ of disturbances and the norm $\max_t\|x_t\|_r$ on the space of sequences of states; disturbance-to-control and disturbance-to-output peak-to-peak gains are defined similarly. When $\infty \geq p \geq 2$ and $1 \leq r \leq 2$, we can enforce the desired bound on the peak-to-peak gain (which can be difficult to handle, since the corresponding norm of $X^N_d$ is, in general, difficult to compute) by bounding from above the efficiently computable upper bound, yielded by our machinery, on the gain. As a result, we get an efficiently tractable convex constraint on the parameters of the controller which safely (and tightly within the factor $\pi/2$, see the concluding comments in Section 3.2) approximates the design specification in question.
Note that our machinery remains applicable when · p and · r are replaced with, respectively, an s.e.n. · (d) and a s.co-e.n. · (x) , and also when the design specifications impose bound on the "restricted" peak-topeak gains, e.g., on the peak-to-peak disturbance-to-state gain when the disturbances d N are restricted to reside in a given linear subspace of the "complete disturbance space" R n d N .
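To make the objects concrete: for the uncontrolled system $x_{t+1} = Ax_t + Dd_t$, $x_0 = 0$, the matrix mapping $d^N$ to $x^N$ is block lower triangular with blocks $A^{t-1-\tau}D$, and the triangle inequality gives the elementary (generally loose) bound $\max_t\|x_t\|_2 \leq \big(\max_t \sum_{\tau<t}\|A^{t-1-\tau}D\|_2\big)\max_t\|d_t\|_2$ on the $p = r = 2$ peak-to-peak gain. A sketch of this crude bound (plain NumPy; hypothetical system data, not the Boeing model):

```python
import numpy as np

# Uncontrolled dynamics x_{t+1} = A x_t + D d_t, x_0 = 0, over N steps:
# x_t = sum_{tau < t} A^{t-1-tau} D d_tau, so by the triangle inequality
#   max_t ||x_t||_2 <= (max_t sum_{tau<t} ||A^{t-1-tau} D||_2) * max_t ||d_t||_2.
rng = np.random.default_rng(6)
nx, nd, N = 4, 2, 50
A = rng.standard_normal((nx, nx))
A *= 0.9 / max(abs(np.linalg.eigvals(A)))      # make the plant stable
D = rng.standard_normal((nx, nd))

# triangle-inequality upper bound on the p = r = 2 peak-to-peak gain
powers = [np.linalg.matrix_power(A, k) @ D for k in range(N)]
gain_ub = max(sum(np.linalg.norm(P, 2) for P in powers[:t])
              for t in range(1, N + 1))

# simulate with disturbances of unit l_2 norm and check the bound
d = rng.standard_normal((N, nd))
d /= np.linalg.norm(d, axis=1)[:, None]        # ||d_t||_2 = 1 for all t
x, peak = np.zeros(nx), 0.0
for t in range(N):
    x = A @ x + D @ d[t]
    peak = max(peak, np.linalg.norm(x))
assert peak <= gain_ub + 1e-9
print("observed peak is below the triangle-inequality bound on the gain")
```

The machinery of Section 3 replaces such crude bounds by the tight convex-in-$\chi$ bounds needed for controller synthesis.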
The first numerical experiment we are about to present deals with minimizing the disturbance-to-state $\ell_\infty/\ell_2$ peak-to-peak gain (i.e., $p = r = 2$) when controlling the linearized and discretized in time motion of a Boeing 747; the model we use originates from [9], see also Section 4.3.2 below. We omit the details irrelevant for our purposes (they can be found in [22]); here it suffices to mention that the model is time-invariant (matrices $A_t \equiv A,\ldots,E_t \equiv E$) with $n_x = 4$ and $n_u = n_d = n_y = 2$. Applying our machinery on the time horizon $N = 256$ to build a purified-output-based linear controller with memory depth 16, we end up with a controller with disturbance-to-state peak-to-peak gain $\approx 1.02$. To put this result into proper perspective, note that the matrix $A$ of the model in question is only marginally stable (the corresponding spectral radius is 0.9995). As a result, although the trivial (identically zero) control results in a uniformly bounded in $N$ peak-to-peak gain, this gain ($\approx 12$) is more than 10 times larger than the gain of the computed controller. Sample trajectories of the system with and without control are presented in Figure 2.
In the reported experiments, $\|d_t\|_2 \equiv 1$ for all $t$. The "bad" disturbance is selected to result in a large peak-to-peak gain under vanishing control; in this case $\max_t \|x_t\|_2$ turns out to be $\approx 12$, while with the control yielded by our synthesis the same disturbances result in $\max_t \|x_t\|_2 \approx 0.9$, which is close to the upper bound on the gain ($\approx 1.02$) guaranteed by our synthesis.
Figure 2: In blue, from top to bottom: state ($n_x = 4$), output ($n_y = 2$), and control ($n_u = 2$, on the synthesized control plots) trajectories of the controlled plant. In the left pane: random harmonic oscillation disturbance; in the right pane: "bad" disturbance. In green: $\|\cdot\|_2$-norms of states, outputs, and controls, respectively.

Our second experiment deals with a toy dynamical system (3-dimensional states, scalar controls, disturbances, and outputs) considered for illustrative purposes in [11]; what we are interested in are two scalar "performance outputs" $z^{(1)}_t$, $z^{(2)}_t$, where $a^x_\chi,\ldots,a^d_\chi$ are given, and $x_t$, $y_t$, $u_t$, $d_t$ are, respectively, states, outputs, controls, and disturbances at time $t$ (for a detailed description, see [11, Section 5]). Following [11], we consider two infinite-horizon gains: $\gamma^{(1)}$, the norm of the mapping $Z^{(1)}$, induced by the linear controller in question, from the space of bounded sequences $\{d_t, t \geq 0\}$ equipped with the $\ell_\infty$ norm to the space of sequences $\{z^{(1)}_t, t \geq 0\}$ equipped with the same norm, and $\gamma^{(2)}$, the norm of the mapping $Z^{(2)}$ from the space of square-summable sequences $\{d_t, t \geq 0\}$ to the space of sequences $\{z^{(2)}_t, t \geq 0\}$, both spaces equipped with the $\ell_2$ norm. The goal is to build a linear controller resulting in as small as possible a value of $\gamma^{(1)}$ given an upper bound on $\gamma^{(2)}$. To process the problem, we select a "training horizon" $T < \infty$ (we used $T = 120$) and a memory parameter $m$ ($m = 40$ in our experiments), and then optimize over the parameters of a purified-output-based (POB) controller of memory depth $m$ the "finite-horizon" approximations $\gamma^{(\chi)}_T$ of $\gamma^{(\chi)}$, $\chi = 1,2$, defined as the respective operator norms of the $T \times T$ angular blocks in $Z^{(\chi)}$. After the control is synthesized, we estimate the "actual" values of the $\gamma^{(\chi)}$'s by computing the respective norms $\gamma^{(\chi)}_{T'}$ of the $T' \times T'$ angular submatrices of the resulting matrices $Z^{(\chi)}$ for $T' \gg T$. The authors of [11] report on the performance of 4 linear controllers obtained by 4 different synthesis methods. In order to make the comparison more informative, we used our machinery to design 4 POB controllers with upper bounds on $\gamma^{(2)}$ close to the corresponding values given by the designs in [11]. The benchmark results from [11] and our results are summarized in Table 1. Our tentative conclusion is that, as far as the reported data are concerned, our synthesis compares reasonably well to those discussed in [11].

Footnote: MATLAB code for this experiment utilizing CVX [17] and the MOSEK solver [2] is available at https://github.com/ai1-fr/approximation-of-matrix-norms/tree/main/boeing.

Bounding robust norms of uncertain matrices

Motivation
Consider the following problem which arises, e.g., in Robust Control: given a box-type uncertainty set in the space of $m \times n$ matrices, upper-bound the quantity where $|\cdot|$ stands for the spectral norm of a matrix. This problem can be immediately reduced to the Matrix Cube problem (cf. [7], see also [6, Section 3.4.3.1]): associating with an $m \times n$ matrix $A$ the symmetric $(m+n) \times (m+n)$ matrix $\mathcal L[A]$, the bound in question is equivalent to According to the results of [7], reproduced in [6, Theorem 3.4.7], an efficiently verifiable sufficient condition for the validity of the latter semi-infinite Linear Matrix Inequality (LMI) is the solvability of a parametric system of LMIs in matrix variables $U_s$, and this sufficient condition is tight within a factor $\vartheta(\cdot)$ depending solely on the maximum of the ranks $2\operatorname{rank}(A_s)$ of the "edge matrices" $\mathcal L[A_s]$. Specifically, when setting $\mu = \max_{1\leq s\leq S}\operatorname{rank}(A_s)$, we obtain: The goal of this section is to extend this result to the more general matrix norms considered in Section 3.
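Since the spectral norm is convex in the perturbation vector $\zeta$, the robust norm over the box $\|\zeta\|_\infty \leq 1$ is attained at a vertex $\zeta \in \{-1,1\}^S$, so for small $S$ the quantity approximated by the tractable bounds of this section can be computed by brute force. A brute-force sketch (plain NumPy with illustrative random data, including a nominal matrix $A^{\mathrm{nom}}$ as in the Matrix Cube setting):

```python
import numpy as np
from itertools import product

# Robust spectral norm of the uncertain matrix
#   {A_nom + sum_s zeta_s A_s : ||zeta||_infty <= 1}.
# The spectral norm is convex in zeta, so the maximum over the box is
# attained at one of the 2^S vertices zeta in {-1, 1}^S.
rng = np.random.default_rng(7)
m, n, S = 4, 5, 6
A_nom = rng.standard_normal((m, n))
A_s = rng.standard_normal((S, m, n)) / S       # "edge" perturbation matrices

robust = max(np.linalg.norm(A_nom + np.tensordot(z, A_s, axes=1), 2)
             for z in product([-1.0, 1.0], repeat=S))

# sanity checks: the robust norm dominates the norm at any point of the box
assert robust >= np.linalg.norm(A_nom, 2) - 1e-12          # zeta = 0
for _ in range(200):
    z = rng.uniform(-1, 1, size=S)
    val = np.linalg.norm(A_nom + np.tensordot(z, A_s, axes=1), 2)
    assert val <= robust + 1e-9
print("robust spectral norm over the box computed by vertex enumeration")
```

The $2^S$ vertex enumeration is, of course, exactly what becomes intractable for large $S$, which motivates the efficiently computable bounds below.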

Problem setting and main result
Let ellitopes X ⊂ R^n, B_* ⊂ R^m with nonempty interiors and basic ellitopes W, Z be given by (6), let B be the polar of B_*, and let A_s ∈ R^{m×n}, 1 ≤ s ≤ S. These data define the uncertain matrix with box uncertainty and the quantity which we refer to as the robust ‖·‖_{B,X}-norm of the uncertain matrix A. Note that this norm is difficult to compute already in the case of "general position" symmetric matrices A_s of rank 2. Our goal is to build a computationally efficient upper bound on the robust norm. Let us consider the quantity and the function ϑ of a positive integer argument ; note that ϑ(k) satisfies (19) [7]. Let also

Proposition 5. In the situation of this section, assuming that the ranks of all A_s are ≤ κ, the efficiently computable quantity Opt given by (22) is a reasonably tight upper bound on the robust norm ‖A‖_{B,X} of the uncertain matrix A; specifically, where K and L are given by (6).

Remark 6. Assume that matrices
are affine in some vector χ of control parameters. In this case, the quantity ‖A‖_{B,X} and its efficiently computable upper bound Opt become functions Opt_*(χ) = ‖A‖_{B,X} and Opt(χ) of χ, and it is immediately seen that both functions are convex. As a result, we can handle, to some extent, the problem of minimizing over χ the robust ‖·‖-norm of the uncertain matrix given by (22). Indeed, the optimization problem specifying Opt clearly is solvable; let λ, υ, {G_s, H_s} be its optimal solution. Looking at the problem, we see, first, that Opt > 0 implies λ ≠ 0 and υ ≠ 0, and thus φ_R(υ) > 0 and φ_T(λ) > 0. Furthermore, whenever θ > 0, the collection θ^{-1}λ, θυ, {θG_s, θ^{-1}H_s} is a feasible solution with the value of the objective θφ_R(υ) + θ^{-1}φ_T(λ). Since the solution we started with is optimal, we have θφ_R(υ) + θ^{-1}φ_T(λ) ≥ Opt = φ_R(υ) + φ_T(λ). This inequality holds true for all θ > 0, which, with positive φ_R(υ) and φ_T(λ), is possible if and only if φ_R(υ) = φ_T(λ) = Opt/2. It follows that, setting λ̄ = 2λ/Opt, ῡ = 2υ/Opt, Ḡ_s = 2G_s/Opt, H̄_s = 2H_s/Opt, ρ̄ = 1/Opt, we get a feasible solution to (24) with the value of the objective 1/Opt, implying that the left hand side in (24) is ≤ the right hand side. On the other hand, the optimization problem in (24) clearly is solvable. If ρ, λ, υ, {G_s, H_s} is an optimal solution to (24), then these λ, υ, {G_s, H_s} clearly form a feasible solution to the problem specifying Opt, and the value of the objective of the latter problem at this solution is ≤ 1/ρ. Thus, Opt ≤ 1/ρ, ρ being the optimal value of the optimization problem in (24), so that the left hand side in (24) is ≥ the right hand side.
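The "if and only if" in the last claim is the arithmetic-geometric mean argument spelled out: minimizing the left hand side over θ > 0 gives

```latex
2\sqrt{\phi_{\mathcal{R}}(\upsilon)\,\phi_{\mathcal{T}}(\lambda)}
=\min_{\theta>0}\Big[\theta\,\phi_{\mathcal{R}}(\upsilon)+\theta^{-1}\phi_{\mathcal{T}}(\lambda)\Big]
\;\ge\;\phi_{\mathcal{R}}(\upsilon)+\phi_{\mathcal{T}}(\lambda),
```

while by the AM-GM inequality the reverse inequality always holds, whence (√φ_R(υ) − √φ_T(λ))² ≤ 0, i.e., φ_R(υ) = φ_T(λ); since their sum is Opt, each equals Opt/2.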

An extension
Similarly to what was done in Section 3.2, the above results can be straightforwardly extended to the case when ‖·‖_X is an ellitopic norm and ‖·‖_B is a co-ellitopic norm. Specifically, for an uncertain matrix , the robust norm of A in the case of (10), (12) is , where the X_i and the polars Z_j of the Z_j^* are ellitopes, and we know how to efficiently upper-bound the robust norms ‖{∑_s ε_s Q_j^T A_s P_i : ‖ε‖_∞ ≤ 1}‖_{Z_j^*, X_i} and how tight such bounds are.

Putting things together
So far, we have considered separately computationally efficient bounding of operator norms of matrices and of robust norms of uncertain matrices with box uncertainty. In the applications to follow, we will be interested in a "mixed" setting, where we want to upper-bound the robust norm . The corresponding blend of our preceding results is as follows:

Proposition 8. Let X ⊂ R^n and B, B_* ⊂ R^m be given by (10), (12), with basic ellitopes . Then the quantity given by (26) is an efficiently computable, convex in (A_nom, A_1, ..., A_S) upper bound on ‖U‖_{B,X}. This upper bound is reasonably tight; specifically, setting , we have , where κ is the maximum of the ranks of A_s, 1 ≤ s ≤ S, and ς(K, L), κ(·), and ϑ(·) are as defined in Theorem 3 and Proposition 5.
Note that the "extreme cases" of Proposition 8 (A_s = 0 for all s, on the one hand, and A_nom = 0, on the other) recover Theorem 3 and Proposition 5, and even their "advanced" versions, with simple ellitopic/co-ellitopic norms replaced by general ellitopic/co-ellitopic ones.

4.3 Application to robust signal recovery
Consider the following standard Signal Processing problem. Given noisy observations ω of an unknown signal x known to belong to a given signal set X ⊂ R^n, we want to recover Bx ∈ R^ν. Here A ∈ R^{m×n} and B ∈ R^{ν×n} are given matrices. We consider linear recoveries x̂ = x̂_H(ω) := H^T ω, H ∈ R^{m×ν}, and quantify the performance of a candidate estimate x̂_H by its worst-case risk , where ‖·‖_B is a given norm on R^ν. There is an extensive literature dealing with the design and performance analysis of linear estimates. In particular, it is known [21, Proposition 4.16] that when X is an ellitope of ellitopic size K and the polar B_* of the unit ball B of ‖·‖_B is an ellitope of ellitopic size L, the linear estimate x̂_{H_*} yielded by the optimal solution to an explicit efficiently solvable convex optimization problem is optimal within a factor logarithmic in K, L: ; here Risk_Opt,‖·‖_B[X] is the minimax risk, the infimum of the risks Risk_{‖·‖_B}[x̂|X] over all estimates x̂, linear and nonlinear alike. The result just cited, as well as most results on the performance of linear estimates known to us, deals with the case when the sensing matrix A is known in advance. Here we want to address the case when A is subject to "uncertain-but-bounded" perturbations, specifically, is selected (by nature or by an adversary) from the uncertainty set . This problem can be seen as a "noninterval" extension of the problem of solving systems of equations affected by interval uncertainty, which has received significant attention in the literature, cf., e.g., [14,18,23,31,36,37,38] and references therein. Assuming that, given the perturbation in A and the "true" signal x, the observation noise ξ is N(0, I_m), the worst-case risk of a linear estimate x̂_H becomes
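Assuming the standard observation model ω = Ax + ξ (the model is not displayed in the extracted text, but is consistent with the notation x̂_H(ω) and ξ ∼ N(0, I_m) above), the worst-case risk in question can be written as

```latex
\mathrm{Risk}_{\|\cdot\|_{\mathcal{B}}}[\widehat{x}_H\,|\,\mathcal{X}]
=\sup_{x\in\mathcal{X},\;A\in\mathcal{U}}
\mathbf{E}_{\xi\sim N(0,I_m)}\Big\{\big\|H^T(Ax+\xi)-Bx\big\|_{\mathcal{B}}\Big\},
```

so that the inner supremum over A ∈ U is exactly a robust-norm quantity of the type bounded in this section, applied to the (uncertain) matrices H^T A − B.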

System identification: the problem
Consider the following situation: a linear time-invariant dynamical system with states u_t ∈ R^d and inputs r_t ∈ R^h evolves according to u_{t+1} = X[u_t; r_t], with an unknown parameter matrix X ∈ R^{d×(d+h)}. We are given noisy observations û_t of the states and of the inputs on the time horizon 0 ≤ t ≤ N; we also have at our disposal upper bounds on the magnitudes of the observation errors: |ξ_{tj}| ≤ ξ̄_{tj} with known ξ̄'s. In addition, we have partial a priori knowledge of X, expressed by a system of linear equations on the entries of X. Our goal is to recover the image X_+ of X under a given linear mapping.
Observe that the considered setting is rather different from the "classical" setting of the linear system identification problem, cf. [3,16,27,29,40], in which it is assumed that the states of the system are observed without errors, while the observations of the inputs are corrupted by random zero-mean noise. The situation in which the perturbations in the observations of the state of the system are uncertain-but-bounded (e.g., belong to an ellipsoid) is the subject of a significant literature (see, e.g., [8,12,13,19,24,25,26,28,30,32,39,43] and references therein). The "generic" approach to the problem we develop below, to the best of our knowledge, differs significantly from those proposed so far and, we believe, can be considered a meaningful contribution to this line of research.
Assigning the entries of X serial indices, denoting by ι(i, j) the index of X_{ij}, and setting x_{*ι(i,j)} = X_{ij}, we get an n-dimensional vector x_*, n = d(d + h), known to satisfy the system of linear equations (P ∈ R^{ν×n} has linearly independent rows) expressing our a priori knowledge of the actual entries of X. The dynamic equations read , which we rewrite as a system of linear equations on x_* of the form . Defining Π and x̄ so that Π is the orthoprojector of R^n onto L and x̄ is the orthogonal projection of x_* onto the orthogonal complement of L, we have . Thus, x_* = x̄ + ∆_*, where ∆_* solves, for a properly selected vector ε = ε_* ∈ R^S, ‖ε_*‖_∞ ≤ 1, the system of linear equations . Recall that our goal is to recover from observations the image of X under a given linear mapping; this is the same as recovering y_* = Bx_* for a given ν × n matrix B. Let us quantify the recovery error by a norm ‖·‖_B on R^ν.
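As an illustration of the vectorization step, here is a minimal numpy sketch of rewriting the dynamic equations u_{t+1} = X[u_t; r_t] as linear equations on the vector of entries of X. The row-major ordering used below is an arbitrary placeholder for the index map ι(i, j); any fixed ordering works.

```python
import numpy as np

def dynamics_as_linear_system(U, R):
    """Rewrite u_{t+1} = X [u_t; r_t] as A_mat @ x = b with x = vec(X) (row-major).

    U: (N+1) x d array of states, R: N x h array of inputs.
    Returns A_mat of shape (N*d, d*(d+h)) and b of shape (N*d,).
    """
    N1, d = U.shape
    N = N1 - 1
    h = R.shape[1]
    rows, rhs = [], []
    for t in range(N):
        v = np.concatenate([U[t], R[t]])   # [u_t; r_t], length d + h
        # u_{t+1} = X v  is equivalent to  (I_d kron v^T) vec_row(X) = u_{t+1}
        rows.append(np.kron(np.eye(d), v))
        rhs.append(U[t + 1])
    return np.vstack(rows), np.concatenate(rhs)

# sanity check on synthetic (noiseless) data
rng = np.random.default_rng(0)
d, h, N = 3, 2, 10
X = rng.standard_normal((d, d + h))
U = np.zeros((N + 1, d)); R = rng.standard_normal((N, h))
U[0] = rng.standard_normal(d)
for t in range(N):
    U[t + 1] = X @ np.concatenate([U[t], R[t]])
A_mat, b = dynamics_as_linear_system(U, R)
x = np.linalg.lstsq(A_mat, b, rcond=None)[0]
```

In the noiseless case the system is consistent and x recovers vec(X) exactly (up to rounding); in the setting of the text, the same matrix A_mat enters the perturbed system solved by ∆_*.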

Robust linear recovery
Given an m × n matrix E and an m × ν matrix H, let us recover ∆_* by the vector ∆, x_* by the vector x̄ + ∆, δ_* by the vector δ, and y_* by ȳ + δ.

Performance analysis
Thus, ∆ ∈ L and ∆_* ∈ L, and , where the concluding equality is due to ∆_* = Π∆_* and Π² = Π. Besides this, . Now let X be the unit ball of a norm ‖·‖_X on R^n, and assume that this norm is both ellitopic and co-ellitopic. Let , and let Υ_0[E] and Υ[E] be the efficiently computable, convex in E upper bounds, given by our machinery, on the robust norms . Assume from now on that ‖·‖_B is a co-ellitopic norm, and let . Assume now that E is such that Υ[E] < 1. Then . As a result,

Synthesis of the linear estimate
Recall that the problem of minimizing Υ[E] w.r.t. E is efficiently solvable. If we are lucky to have Υ_* := inf_E Υ[E] < 1, we can optimize, to some extent, our estimate H^T q of y_* = Bx_* in H. To this end, we select E which "nearly minimizes" the quantity Γ over E under the constraint Υ[E] < 1; after E is selected, we specify H by minimizing the resulting right hand side of (39b), that is, ΓΥ[H] + Υ_0[H], in H. "Near-minimization" of Γ over E can be carried out as follows. Let us select β < 1 close to 1 (e.g., β = 0.9 or β = 0.99) and set Υ_i = (1 − β^i) + β^i Υ_*, i = 0, 1, 2, ..., so that β^i/(1 − Υ) ≤ 1/(1 − Υ_*) is equivalent to Υ ≤ Υ_i. We solve one by one the feasible convex optimization problems ; we run this process until the quantities Opt_i start to grow, and specify Γ as the smallest of the Opt_i we have generated.
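The level-selection loop just described can be sketched as follows. Here `solve_P` is a hypothetical oracle returning the optimal value Opt_i of the convex problem (P_i) for a given level Υ_i; it stands in for a call to a convex solver and is not part of the paper's code.

```python
def near_minimize_gamma(solve_P, upsilon_star, beta=0.9, max_iter=100):
    """Near-minimize Gamma over E: run the level sequence
    Upsilon_i = (1 - beta**i) + beta**i * upsilon_star (increasing from
    upsilon_star toward 1), calling solve_P(level) for each level, until the
    optimal values Opt_i start to grow; return the smallest Opt_i seen."""
    assert 0 < beta < 1 and upsilon_star < 1
    best = float("inf")
    prev = None
    for i in range(max_iter):
        level = (1 - beta**i) + beta**i * upsilon_star
        opt_i = solve_P(level)
        if prev is not None and opt_i > prev:  # values started to grow: stop
            break
        best = min(best, opt_i)
        prev = opt_i
    return best
```

The returned value plays the role of Γ in the text; the loop exploits the fact that as the level Υ_i relaxes toward 1, the optimal values first decrease and then deteriorate.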
Let us write explicitly the problem (P_i) in the situation of (40), where B_m is the unit ‖·‖_2-ball in R^m and P_k ∈ R^{n×n_k}. As we know, in this case . Exploiting the fact that in our present situation X_* = {x : ‖P_k^T x‖_2 ≤ 1, k ≤ K} is an ellitope, (P_i) may be rewritten as follows (cf. (26) in Proposition 8): and

Remark 9. The rationale behind restricting ourselves to X as in (40) is as follows. Recall that the norm ‖·‖_X we consider is assumed to be both ellitopic and co-ellitopic. There are only two generic situations known to us in which the corresponding unit ball X is both ellitopic and co-ellitopic at the same time, and (40) is one of them. The other nice situation, "symmetric" to the first, is when ‖·‖_X is the conjugate of the norm just defined, that is, a norm of the form max_{k≤K} ‖P_k^T x‖_2. In our context, this second case reduces to the first due to ‖A‖_{X,X} = ‖A^T‖_{X_*,X_*}.

Numerical illustration
The illustration to follow deals with the recovery of the parameters of the "Boeing 747" model used in Section 3.3.3, which in our present notation reads u_{t+1} = X[u_t; r_t], where u_t ∈ R^4 are the states and r_t ∈ R^4 are the inputs ("in reality," the first two entries of r_t are controls, and the last two are external disturbances). We observe the u_t's for 0 ≤ t ≤ N = 12 and the r_t's for 0 ≤ t < N; in the resulting identification problem, m = 48, n = 32, S = 100, and L = R^n (whence Π = I_n and x̄ = 0). The observations of the states and inputs are corrupted by "relative ε-noises," so that an observable real r and its observation r̂ satisfy |r̂ − r| ≤ ε max[|r|, 1]. In an experiment, we select a noise level ε ∈ [0.001, 0.01], generate a sample trajectory of the system by selecting the initial state and the inputs at random, add random ε-errors to the states and the inputs, and apply to the resulting observations the above robust linear recovery, with B = I_n and with B = X being the unit ‖·‖_2-ball in R^n, to recover the parameters of the system. We compare this recovery with the simplest Least Squares estimation. The results of a series of 10 experiments are presented in Table 2. In Figure 3, we present the trajectories of the actual and the recovered (experiment # 10, ε = 0.01) systems on the time horizon 1 ≤ t ≤ 49 for a random initial state and inputs (different from those used in the experiment).
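The Least Squares baseline referred to above can be sketched as follows. This is a minimal numpy illustration on synthetic data: the relative ε-noise model |r̂ − r| ≤ ε max[|r|, 1] is the one from the text, while the system matrix below is a random placeholder, not the actual "Boeing 747" model.

```python
import numpy as np

rng = np.random.default_rng(1)
d, h, N, eps = 4, 4, 12, 0.01

# random placeholder dynamics u_{t+1} = X [u_t; r_t]
X = 0.3 * rng.standard_normal((d, d + h))
U = np.zeros((N + 1, d)); R = rng.standard_normal((N, h))
U[0] = rng.standard_normal(d)
for t in range(N):
    U[t + 1] = X @ np.concatenate([U[t], R[t]])

# relative eps-noise: |observed - true| <= eps * max(|true|, 1)
def noisy(a):
    return a + eps * np.maximum(np.abs(a), 1.0) * rng.uniform(-1, 1, a.shape)

Uo, Ro = noisy(U), noisy(R)

# Least Squares: minimize sum_t || u^_{t+1} - X [u^_t; r^_t] ||_2^2 over X
V = np.hstack([Uo[:-1], Ro])    # N x (d+h), row t is [u^_t; r^_t]
X_ls = np.linalg.lstsq(V, Uo[1:], rcond=None)[0].T
print("relative error:", np.linalg.norm(X_ls - X) / np.linalg.norm(X))
```

Note that Least Squares treats the noisy regressors [û_t; r̂_t] as exact, which is precisely the errors-in-variables effect the robust recovery of this section is designed to account for.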

A.1 Proof of Theorem 3
The proof below follows that of Theorem 2 as given in [21, Section 4.8.2], utilizing at some point the bilinearity of the quadratic form we want to upper-bound on Z × W. Let q, p be the dimensions of the embedding spaces of Z and W, respectively, and assume w.l.o.g. that q ≤ p.

1°. Let
T = cl{[t; τ] : τ > 0, t/τ ∈ T} and R = cl{[r; θ] : θ > 0, r/θ ∈ R} be the closed conic hulls of T and R, so that T and R are regular (closed, pointed, convex, with nonempty interior) cones such that . As is immediately seen, the cones dual to T, R are . In view of these observations, (7) is nothing but the conic problem . It is easily seen that this problem is strictly feasible and bounded. By Conic Duality, , where σ_i(·), i ≤ q, are the singular values of a q × p matrix (recall that q ≤ p). At the last two steps of the above derivation, we have used the following well-known facts: , and that the maximum of the Frobenius inner products of a given matrix with matrices of spectral norm not exceeding 1 is the nuclear norm of the matrix, that is, the sum of its singular values.
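The nuclear-norm fact invoked at the last step is easy to verify numerically: for M with SVD M = U diag(s) W^T, the maximizer over {V : |V| ≤ 1} of the Frobenius inner product ⟨M, V⟩ is V = U W^T, and the optimal value is the sum of the singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 5))
U_, s, Wt = np.linalg.svd(M, full_matrices=False)

V = U_ @ Wt                                # |V| = 1 (spectral norm)
# <M, V>_F = trace(U diag(s) Wt Wt^T U^T) = sum of singular values
assert np.isclose(np.sum(M * V), s.sum())

# and no contraction does better (spot check with random |W| <= 1)
for _ in range(100):
    W = rng.standard_normal((3, 5))
    W /= max(np.linalg.norm(W, 2), 1.0)    # normalize spectral norm to <= 1
    assert np.sum(M * W) <= s.sum() + 1e-9
```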

2°.
The concluding optimization problem in the above chain clearly is solvable; let U, V, r, t be an optimal solution, and let . Let ε_1, ..., ε_p be independent random variables taking values ±1 with probabilities 1/2, and let . Then, in view of (42), it holds, identically in ε_i = ±1, 1 ≤ i ≤ p: . On the other hand, setting E = [e_1, ..., e_q], we get an orthonormal q × q matrix such that ξ = Eε̄, where ε̄ = [ε̄_1; ...; ε̄_q] is a Rademacher vector (i.e., a random vector with independent entries taking values ±1 with probabilities 1/2), and . By Lemma 10, whenever r_ℓ > 0 we have . As a result, for every ℓ such that r_ℓ > 0 we have . The latter relation holds true for those ℓ for which r_ℓ = 0 as well, since for these ℓ one has U^{1/2} R_ℓ U^{1/2} = 0 because the trace of this positive semidefinite matrix is ≤ r_ℓ. Similar reasoning, with η̄ = [η̄_1; ...; η̄_p] in the role of ε̄ and T_k, t_k in the roles of R_ℓ, r_ℓ, demonstrates that for every k we have . Consequently, invoking (43), we conclude that there exists a realization (ξ̄, η̄) of (ξ, η) such that . Setting v = QU^{1/2}ξ̄, x = PV^{1/2}η̄ and invoking (6), we get ‖x‖_X ≤ √(3 ln(4K)), ‖v‖_{B*} ≤ √(3 ln(4L)), resulting in , that is, , as claimed.
3°. It remains to consider the case K = L = 1. By an evident scaling argument, the situation reduces to that where X = P{w : w^T T w ≤ 1} and B_* = Q{z : z^T S z ≤ 1}. In this case, . On the other hand,

A.2 Proof of Proposition 5
1°. Let R, T, R_* and T_* be as defined in item 1° of the proof of Theorem 3. Observe that Opt = min_{λ,υ,{G_s,H_s},α,β} {α + β : ...}, where ... is the singular spectrum of A; the last equality in the chain follows from the two simple observations (cf. the proof of Theorem 3): the LMI [P, Q; Q^T, R] ⪰ 0, with a p × p matrix P and an r × r matrix R, takes place if and only if P ⪰ 0, R ⪰ 0, and ; with L[B] = , where λ(A) is the vector of eigenvalues of a symmetric matrix A.
Note that Opt as defined in (44) clearly is a convex function of [A 1 , . . . , A S ].
Observe that ‖A‖_{B,X} ≤ Opt. Indeed, the problem specifying Opt clearly is solvable, and if λ ≥ 0, υ ≥ 0, {G_s, H_s} is its optimal solution, then we have, for all z ∈ Z, w ∈ W, and all ε_s = ±1, , implying that ‖A‖_{B,X} ≤ Opt (recall that PW = X and QZ = B_*).
where ϑ(k) is defined in (21). It follows that for [η; ξ] ∼ N(0, Diag{Y, X}), . Now, let π(·) be the norm on R^p with the unit ball W, and ρ(·) be the norm on R^q with the unit ball Z.
Taking into account that X = PW and B_* = QZ, we conclude that , thus arriving at (45).

3°. It remains to invoke

Lemma 11. Let and ω ∼ N(0, W). Denoting by ρ(·) the norm on R^d with the unit ball V, we have , where κ(·) is as in (22).
The statement of the proposition now follows from (45) by applying Lemma 11 to V = W, W = X, and to V = Z, W = Y .

4°.
To complete the proof, it remains to prove Lemma 11. Let us start with the case J = 1. Setting r̄ = max{r : r ∈ R} and R̄ = R_1/r̄, we have Tr(W R̄) ≤ 1 and ρ(u) = ‖R̄^{1/2} u‖_2. Setting W̄ = R̄^{1/2} W R̄^{1/2} and ω̄ = R̄^{1/2} ω, we get ω̄ ∼ N(0, W̄), Tr(W̄) ≤ 1, and . As a result, . Under the premise of the lemma, let W ⪰ 0 be such that Tr(W R_j) ≤ r_j for all j. For every j such that r_j > 0, setting Θ_j = W^{1/2} R_j W^{1/2}/r_j, we get Θ_j ⪰ 0, Tr(Θ_j) ≤ 1, so that, by the above, for all s > 0 and 0 ≤ t < 1/2 . The resulting inequality clearly holds true for j with r_j = 0 as well. Now, when ω and s > 0 are such that ω^T R_j ω ≤ s² r_j for all j, we have ρ(ω) ≤ s. Combining our observations, we get , implying that . Optimizing w.r.t. t, we arrive at
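The step invoked "for all s > 0 and 0 ≤ t < 1/2" is, we believe, the standard Gaussian quadratic-form bound: if ζ ∼ N(0, Q) with Q ⪰ 0 and Tr(Q) ≤ 1, then for 0 ≤ t < 1/2

```latex
\mathbf{E}\big\{e^{t\,\zeta^T\zeta}\big\}=\det(I-2tQ)^{-1/2}\le (1-2t)^{-1/2},
\qquad\text{whence}\qquad
\mathrm{Prob}\{\zeta^T\zeta\ge s^2\}\le (1-2t)^{-1/2}e^{-ts^2}
```

by the Markov inequality. Here the bound det(I − 2tQ)^{-1/2} ≤ (1 − 2t)^{-1/2} follows from −ln(1 − 2tq) ≤ −q ln(1 − 2t) for q ∈ [0, 1], applied to the eigenvalues q_i of Q, whose sum is ≤ 1.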

A.3 Proof of Proposition 8
0°. The equalities in (27) follow from (15), (16). Consequently, all we need is to prove that for all i, j it holds .

1°. Let us fix i ≤ I, j ≤ J. Comparing (22), with Z_j^* in the role of B and X_i in the role of X, with the definition of Opt_ij in (26), we see that Opt_ij is nothing but the upper bound, as given by Proposition 5, on ‖U_ij‖_{Z_j^*, X_i}, implying the left inequality in (46).

2°. Observe that the upper bound on α := ‖A_0‖_{Z_j^*, X_i} as given by Theorem 3 is nothing but , and by this Theorem, . Next, the upper bound on β :

B.1.1 Spectratopes

An example of a "genuine" basic spectratope is the unit |·|-ball, |·| being the spectral norm on R^{p×q}: Same as ellitopes, spectratopes admit a fully algorithmic "calculus", and their family is closed with respect to the basic operations preserving convexity and symmetry w.r.t. the origin, such as taking finite intersections, linear images, inverse images under linear embeddings, direct products, and arithmetic summation (see [21, Section 4.6] for details); what is missing is taking convex hulls of finite unions.

B.1.2 Bounding maximum of quadratic form over a spectratope
Given a linear mapping , and . Finally, same as above, for a convex compact set T, φ_T(λ) = max_{t∈T} λ^T t is the support function of T. Given a spectratope , an efficiently computable upper bound Opt(C) on the quantity can be built as follows. Assume that Λ = {Λ_k ∈ S^{d_k}_+, k ≤ K} is such that . When x ∈ X, there exist w ∈ R^q and t ∈ T such that (see (48)) . On the other hand, by (49) we have , so that , due to (51) and (52). As a result, the efficiently computable convex function is an upper bound on Opt(C). It is known ([21, Proposition 4.8]) that this bound is reasonably tight:

B.2Bounding operator norms, spectratopic case
Similarly to the ellitopic case, our current problem of interest is tight computationally efficient upper-bounding of the norm ‖A‖_{B,X} in the case when X and B_* are spectratopes: In this case, the efficiently computable upper bound on ‖A‖_{B,X} and its tightness are given by the following result (which is an improvement of the just cited result from [21]):

Theorem 12. In the case of (53), the efficiently computable convex function of A given by (54) is a reasonably tight upper bound on ‖A‖_{B,X}:

Proof. 1°. The left inequality in (55) is evident; let us prove the right one. Let q, p be the dimensions of the embedding spaces of Z and W, and assume q ≤ p, which is w.l.o.g. for the same reasons as in the ellitopic case. Same as in the latter case, (54) is nothing but the conic problem , with the same cones T, R and their duals T_*, R_* as in the ellitopic case. Same as in that case, the latter problem is strictly feasible and bounded, and by Conic Duality one has (cf. item 1° in the "ellitopic proof").

B.3 Bounding robust norms of uncertain matrices, spectratopic case
Let spectratopes X ⊂ R n , B * ⊂ R m with nonempty interiors and the polar B of B * be given by (53). Our goal is to conceive a computationally efficient upper-bounding of the robust norm

B.3.1 Processing the problem
Acting exactly as in the ellitopic case, with the results of Section B.1.2 in the role of their "ellitopic counterparts" from Section 2.2, we conclude that the efficiently computable quantity Opt := min_{Λ,Υ,{G_s,H_s}} {...}, the "spectratopic analog" of (22), is an upper bound on ‖A‖_{B,X} such that for properly selected matrices X ∈ S^p_+, Y ∈ S^q_+ and r ∈ R, t ∈ T one has R^+_ℓ[Y] ⪯ r_ℓ I_{g_ℓ}, ℓ ≤ L, and T^+_k[X] ⪯ t_k I_{d_k}, k ≤ K, and, for the norms π(·) and ρ(·) with unit balls W and Z, respectively, and [η; ξ] ∼ N(0, Diag{Y, X}), , where κ is the maximum of the ranks of A_s and ϑ(·) is given by (21) (cf. (45)).
We have the following spectratopic analog of Lemma 11.
11 The noncommutative Khintchine inequality, due to Lust-Piquard, Pisier, and Buchholz (see [42, Theorem 4.6.1]), states that if Q_i ∈ S^n, 1 ≤ i ≤ I, and ξ_i, i = 1, ..., I, are independent Rademacher or N(0, 1) random variables, then for all t ≥ 0 one has Prob{|∑_i ξ_i Q_i| ≥ t |∑_i Q_i²|^{1/2}} ≤ 2n exp{−t²/2}, where |·| is the spectral norm. Applying the lemma to V = W, W = X, and to V = Z, W = Y, we get from (59) the following analog of Proposition 5:

Proposition 14. In the situation described at the beginning of this section, assuming that the ranks of all A_s are ≤ κ, the efficiently computable quantity Opt as given by (58) is a reasonably tight upper bound on the robust norm ‖A‖_{B,X} of the uncertain matrix A; specifically, , where κ(·) is given by (60) and ϑ(·), as given by (21), satisfies

B.3.2 Putting things together
Results of Proposition 14 (and, as a byproduct, of Theorem 12) can be extended, in exactly the same fashion as in the ellitopic case, to the situation where X and the polar B_* of B are convex hulls of finite unions of spectratopes rather than plain spectratopes, and the uncertain matrix in question is not centered, resulting in the following spectratopic analog of Proposition 8:

Theorem 15. Let U = {A_nom + ∑_{s=1}^S ε_s A_s : ‖ε‖_∞ ≤ 1} be an uncertain m × n matrix, and let X ⊂ R^n, B, B_* ⊂ R^m be given by , with basic spectratopes . Then the quantity is an efficiently computable, convex in (A_nom, A_1, ..., A_S) upper bound on ‖U‖_{B,X}. This upper bound is reasonably tight; specifically, setting , we have , where κ is the maximum of the ranks of A_s, 1 ≤ s ≤ S, ς(·) and κ(·) are as defined in (55) and (60), and ϑ(·) is defined by (21) and satisfies (19).