Refining Ky Fan’s Majorization Relation with Linear Programming

Let \(A_1, A_2\) be self-adjoint operators on \(\mathbb{C}^d\) and fix \(k \in [d]\). Our aim is to find upper bounds on \(s_k(A_1 + A_2)\) that are generally tighter than \(s_k(A_1) + s_k(A_2)\), the upper bound from Fan’s majorization relation Eq. 1.8.

For a \(k\)-dimensional subspace \(W \le \mathbb{C}^d\) and \(i \in [d]\), we denote the overlaps

$$\begin{aligned} x^{(1)}_{W,i} := \mathrm{tr}(P_W \Pi_i(A_1)), \quad x^{(2)}_{W,i} := \mathrm{tr}(P_W \Pi_i(A_2)), \end{aligned}$$

(3.1)

and define the real vectors \(x_W^{(1)} := (x^{(1)}_{W,i})_{i=1}^d\), \(x_W^{(2)} := (x^{(2)}_{W,i})_{i=1}^d\), and \(x_W := x_W^{(1)} \oplus x_W^{(2)}\). We rewrite \(\mathrm{tr}(P_W (A_1 + A_2))\) in terms of these vectors.

$$\begin{aligned} \mathrm{tr}(P_W (A_1 + A_2)) &= \sum_{i=1}^d \lambda_i (A_1) x^{(1)}_{W,i} + \sum_{i=1}^d \lambda_i (A_2) x^{(2)}_{W,i} \\ &= (\lambda(A_1) \oplus \lambda(A_2))^T x_W. \end{aligned}$$

(3.2)

It can be verified by inspection that \(x_W\) satisfies the linear constraints:

$$\begin{aligned} &x^{(1)}_{W,i}, \, x^{(2)}_{W,i} \ge 0, \quad \forall\, i \in [d],\\ &x^{(1)}_{W,i}, \, x^{(2)}_{W,i} \le 1, \quad \forall\, i \in [d], \\ &\sum_{i=1}^d x^{(1)}_{W,i} = \sum_{i=1}^d x^{(2)}_{W,i} = k. \end{aligned}$$

(3.3)

We refer to these constraints as the basic constraints. The set of points in \(\mathbb{R}^{2d}\) that satisfy the basic constraints is a bounded polyhedron that we denote by \(\mathcal{P}_0^{k,d}\). We can use this polyhedron to prove the inequality \(s_k(A_1+A_2) \le s_k(A_1) + s_k(A_2)\) in a rather fancy way. Observe that

$$\begin{aligned} s_k (A_1 + A_2) &= \max_{W \le \mathbb{C}^d, \, \dim W = k} \mathrm{tr}(P_W (A_1 + A_2)) \end{aligned}$$

(3.4)

$$\begin{aligned} &= \max_{W \le \mathbb{C}^d, \, \dim W = k} (\lambda(A_1) \oplus \lambda(A_2))^T x_W \le (\lambda(A_1) \oplus \lambda(A_2))^T \bar{x}, \end{aligned}$$

(3.5)

where \(\bar{x} \in \mathbb{R}^{2d}\) is an optimal point for the linear program

$$\begin{aligned} &\max \, (\lambda(A_1) \oplus \lambda(A_2))^T x\\ &\, \text{s.t.} \, x \in \mathcal{P}_0^{k,d}. \end{aligned}$$

(3.6)

A routine calculation shows that \(\texttt{1}_{[k]} \oplus \texttt{1}_{[k]}\) is an optimal point, where \(\texttt{1}_{[k]}\) denotes the indicator vector of the first \(k\) coordinates.
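This routine calculation can also be replayed numerically. Because the extreme points of \(\mathcal{P}_0^{k,d}\) are the \(0/1\) vectors with exactly \(k\) ones per block, the linear program reduces to picking two \(k\)-subsets of \([d]\). A minimal numpy sketch (real symmetric test matrices; all names are our own illustration):

```python
import itertools
import numpy as np

def s_k(A, k):
    """Sum of the k largest eigenvalues of a symmetric matrix A."""
    return np.sort(np.linalg.eigvalsh(A))[::-1][:k].sum()

def lp_over_basic_polytope(lam1, lam2, k):
    """Maximize (lam1 (+) lam2)^T x over the extreme points of P_0^{k,d}:
    each block of x is the indicator vector of a k-subset of [d]."""
    d = len(lam1)
    return max(
        lam1[list(S1)].sum() + lam2[list(S2)].sum()
        for S1 in itertools.combinations(range(d), k)
        for S2 in itertools.combinations(range(d), k))

rng = np.random.default_rng(0)
A1 = rng.standard_normal((4, 4)); A1 = (A1 + A1.T) / 2
A2 = rng.standard_normal((4, 4)); A2 = (A2 + A2.T) / 2
lam1 = np.sort(np.linalg.eigvalsh(A1))[::-1]   # lambda(A1), descending
lam2 = np.sort(np.linalg.eigvalsh(A2))[::-1]
k = 2
# The maximizer 1_[k] (+) 1_[k] picks the k largest entries of each block.
assert np.isclose(lp_over_basic_polytope(lam1, lam2, k), s_k(A1, k) + s_k(A2, k))
```

The enumeration is exponential in \(d\) and is meant only as a sanity check for small instances.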

To obtain sharper estimates, we need to consider more constraints. To that end, we introduce alignment terms.

Definition 3.1

For each pair \((\ell _1, \ell _2) \in [d] \times [d]\), we define the alignment term

$$\begin{aligned} \alpha_{\ell_1, \ell_2}^{(k)} (V_\bullet (A_1), V_\bullet (A_2)) :&= \max_{W \le \mathbb{C}^d, \, \dim W = k} \mathrm{tr}(P_W (P_{\ell_1} (A_1) + P_{\ell_2} (A_2))) \end{aligned}$$

(3.7)

$$\begin{aligned} &= s_k (P_{\ell_1} (A_1) + P_{\ell_2} (A_2)). \end{aligned}$$

(3.8)

These \(d^2\) positive numbers are meant to capture the overlaps between the largest eigenvalue spaces of \(A_1\) and the largest eigenvalue spaces of \(A_2\). The flag pair argument \((V_\bullet (A_1), V_\bullet (A_2))\) is included to emphasize the dependence of alignment terms on the assumed spectral decompositions of \(A_1\) and \(A_2\). When it is clear from context, we omit it and write \(\alpha_{\ell_1, \ell_2}^{(k)}\).

For any \(k\)-dimensional subspace \(W \le \mathbb{C}^d\), \(x_W\) satisfies, by definition, the linear constraint

$$\begin{aligned} (\texttt{1}_{[\ell_1]} \oplus \texttt{1}_{[\ell_2]})^T x_W \le \alpha_{\ell_1, \ell_2}^{(k)} (V_\bullet (A_1), V_\bullet (A_2)). \end{aligned}$$

(3.9)

We call this an alignment constraint. We denote the polyhedron of points in \(\mathcal{P}_0^{k,d}\) that satisfy all alignment constraints by \(\mathcal{P}_1^{k,d} (V_\bullet (A_1), V_\bullet (A_2))\) or simply \(\mathcal{P}_1^{k,d}\).
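Computing an alignment term is a single eigenvalue problem: form the spectral projectors onto the top-\(\ell_1\) and top-\(\ell_2\) eigenspaces and sum the \(k\) largest eigenvalues of their sum, per Eq. 3.8. A hedged numpy sketch (generic spectra assumed, so the flags are unambiguous; the function names are ours):

```python
import numpy as np

def top_projector(A, ell):
    """P_ell(A): projector onto the span of the eigenvectors of the ell
    largest eigenvalues of the symmetric matrix A."""
    _, V = np.linalg.eigh(A)        # eigenvalues ascending, eigenvectors in columns
    U = V[:, ::-1][:, :ell]         # top-ell eigenvectors
    return U @ U.T

def alignment_term(A1, A2, k, ell1, ell2):
    """alpha^(k)_{ell1,ell2} = s_k(P_{ell1}(A1) + P_{ell2}(A2)), Eq. 3.8."""
    S = top_projector(A1, ell1) + top_projector(A2, ell2)
    return np.sort(np.linalg.eigvalsh(S))[::-1][:k].sum()

rng = np.random.default_rng(1)
A1 = rng.standard_normal((5, 5)); A1 = (A1 + A1.T) / 2
A2 = rng.standard_normal((5, 5)); A2 = (A2 + A2.T) / 2
d, k, ell1, ell2 = 5, 2, 3, 2
a = alignment_term(A1, A2, k, ell1, ell2)
# Sandwich bounds of Eqs. 3.33-3.34 (discussed in Sec. 3.2).
assert a <= min(ell1, k) + min(ell2, k) + 1e-9
assert a >= min(k, max(0, ell1 + ell2 - d)) + min(k, ell1 + ell2) - 1e-9
```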

Theorem 3.2

Let \(u_k(A_1, A_2)\) denote the optimal value of the linear program

$$\begin{aligned} &\max \, (\lambda(A_1) \oplus \lambda(A_2))^T x\\ &\, \text{s.t.} \, x \in \mathcal{P}_1^{k,d} (V_\bullet (A_1), V_\bullet (A_2)). \end{aligned}$$

(3.10)

Then \(s_k (A_1 + A_2) \le u_k(A_1, A_2)\).

Proof

The bound follows from Fan’s maximum principle Eq. 2.2, the fact that \(x_W \in \mathcal{P}_1^{k,d} (V_\bullet (A_1), V_\bullet (A_2))\) for every \(k\)-dimensional subspace \(W\), and the equalities in Eq. 3.2. \(\square\)

Several remarks are in order.

1.

Although \(\alpha_{\ell_1, \ell_2}^{(k)}\) is given by a maximization over \(k\)-dimensional subspaces, we find it is more amenable to analysis than \(s_k(A_1 + A_2)\). After all, the spectrum of a sum of two projectors is closely related to the relative position of their supports (see, for example, Ref. [11]). This is a crucial ingredient in our proof of Prop. 4.3.

2.

If \(\mathcal{P} \subseteq \mathbb{R}^{2d}\) is a polyhedron that satisfies \(\mathcal{P}_1^{k,d} (V_\bullet (A_1), V_\bullet (A_2)) \subseteq \mathcal{P} \subseteq \mathcal{P}_0^{k,d}\), then the optimal value of the linear program

$$\begin{aligned} &\max \, (\lambda(A_1) \oplus \lambda(A_2))^T x\\ &\, \text{s.t.} \, x \in \mathcal{P}. \end{aligned}$$

(3.11)

is an upper bound on \(s_k (A_1 + A_2)\) that is at most \(s_k(A_1) + s_k(A_2)\). For example, \(\mathcal{P}\) could be the polyhedron of points that satisfy the basic constraints and a subset of the alignment constraints. Or \(\mathcal{P}\) could be equal to \(\mathcal{P}_1^{k,d} (V_\bullet (\tilde{A}_1), V_\bullet (\tilde{A}_2))\) where \(\tilde{A}_1, \tilde{A}_2\) are self-adjoint operators on \(\mathbb{C}^d\) such that

$$\begin{aligned} \alpha_{\ell_1, \ell_2}^{(k)} (V_\bullet (A_1), V_\bullet (A_2)) \le \alpha_{\ell_1, \ell_2}^{(k)} (V_\bullet (\tilde{A}_1), V_\bullet (\tilde{A}_2)) \end{aligned}$$

(3.12)

holds for each pair \((\ell _1, \ell _2) \in [d] \times [d]\).

3.

The upper bound \(u_k (A_1, A_2)\) is in fact independent of the choice of spectral decompositions of \(A_1\) and \(A_2\). To see this, let us suppose that for some \(t \in [d-1]\), \(\lambda_t(A_1) = \lambda_{t+1}(A_1)\). Let \(U\) be a unitary operator on \(\mathbb{C}^d\) that acts as the identity on \(\langle \{ v : A_1 v = \lambda_t (A_1) v \} \rangle^{\perp}\) but is otherwise arbitrary. Let \(V'_\bullet (A_1)\) denote the flag

$$\begin{aligned} (0 \le U V_1 (A_1) \le U V_2 (A_1) \le \cdots \le U V_d (A_1) = \mathbb{C}^d). \end{aligned}$$

(3.13)

Notice that for all \(\ell \ne t\), \(U V_\ell (A_1) = V_\ell (A_1)\). This implies that for all \(\ell_1 \ne t\) and \(\ell_2\), it holds that

$$\begin{aligned} \alpha_{\ell_1, \ell_2}^{(k)} (V'_\bullet (A_1), V_\bullet (A_2)) = \alpha_{\ell_1, \ell_2}^{(k)} (V_\bullet (A_1), V_\bullet (A_2)). \end{aligned}$$

(3.14)

Now, suppose that \(x^{(1)} \oplus x^{(2)} \in \mathcal{P}_1^{k,d} (V_\bullet (A_1), V_\bullet (A_2))\). Let \(x'^{(1)}\) denote the vector with coordinates

$$\begin{aligned} x'^{(1)}_t &:= x^{(1)}_t + x^{(1)}_{t+1} - \min (1, x^{(1)}_t + x^{(1)}_{t+1}) \end{aligned}$$

(3.15)

$$\begin{aligned} x'^{(1)}_{t+1} &:= \min (1, x^{(1)}_t + x^{(1)}_{t+1}) \end{aligned}$$

(3.16)

and otherwise \(x'^{(1)}_i := x^{(1)}_i\). The objective value at \(x'^{(1)} \oplus x^{(2)}\) equals the objective value at \(x^{(1)} \oplus x^{(2)}\), since \(\lambda_t(A_1) = \lambda_{t+1}(A_1)\) and \(x'^{(1)}_t + x'^{(1)}_{t+1} = x^{(1)}_t + x^{(1)}_{t+1}\). It can be verified by inspection that \(x'^{(1)} \oplus x^{(2)}\) satisfies the basic constraints. For \(\ell_1 \ne t\) and every \(\ell_2\),

$$\begin{aligned} (\texttt{1}_{[\ell_1]} \oplus \texttt{1}_{[\ell_2]})^T (x'^{(1)} \oplus x^{(2)}) &= (\texttt{1}_{[\ell_1]} \oplus \texttt{1}_{[\ell_2]})^T (x^{(1)} \oplus x^{(2)}) \end{aligned}$$

(3.17)

$$\begin{aligned} &\le \alpha_{\ell_1, \ell_2}^{(k)} (V_\bullet (A_1), V_\bullet (A_2)) \end{aligned}$$

(3.18)

$$\begin{aligned} &= \alpha_{\ell_1, \ell_2}^{(k)} (V'_\bullet (A_1), V_\bullet (A_2)). \end{aligned}$$

(3.19)

As for the alignment constraints associated with pairs of the form \((t, \ell_2)\), we argue in two cases. First, if \(\min (1, x^{(1)}_t + x^{(1)}_{t+1}) = x^{(1)}_t + x^{(1)}_{t+1}\), then

$$\begin{aligned} (\texttt{1}_{[t]} \oplus \texttt{1}_{[\ell_2]})^T (x'^{(1)} \oplus x^{(2)}) = (\texttt{1}_{[t-1]} \oplus \texttt{1}_{[\ell_2]})^T (x^{(1)} \oplus x^{(2)}). \end{aligned}$$

(3.20)

If \(t = 1\), then

$$\begin{aligned} (\texttt{1}_{[0]} \oplus \texttt{1}_{[\ell_2]})^T (x^{(1)} \oplus x^{(2)}) &= \texttt{1}_{[\ell_2]}^T x^{(2)} \end{aligned}$$

(3.21)

$$\begin{aligned} &\le \min (k, \ell_2) \end{aligned}$$

(3.22)

$$\begin{aligned} &\le \alpha_{1, \ell_2}^{(k)} (V'_\bullet (A_1), V_\bullet (A_2)). \end{aligned}$$

(3.23)

Otherwise,

$$\begin{aligned} (\texttt{1}_{[t]} \oplus \texttt{1}_{[\ell_2]})^T (x'^{(1)} \oplus x^{(2)}) &= (\texttt{1}_{[t-1]} \oplus \texttt{1}_{[\ell_2]})^T (x^{(1)} \oplus x^{(2)}) \end{aligned}$$

(3.24)

$$\begin{aligned} &\le \alpha_{t-1, \ell_2}^{(k)} (V_\bullet (A_1), V_\bullet (A_2)) \end{aligned}$$

(3.25)

$$\begin{aligned} &= \alpha_{t-1, \ell_2}^{(k)} (V'_\bullet (A_1), V_\bullet (A_2)) \end{aligned}$$

(3.26)

$$\begin{aligned} &\le \alpha_{t, \ell_2}^{(k)} (V'_\bullet (A_1), V_\bullet (A_2)). \end{aligned}$$

(3.27)

Second, if \(\min (1, x^{(1)}_t + x^{(1)}_{t+1}) = 1\), then

$$\begin{aligned} (\texttt{1}_{[t]} \oplus \texttt{1}_{[\ell_2]})^T (x'^{(1)} \oplus x^{(2)}) &= (\texttt{1}_{[t+1]} \oplus \texttt{1}_{[\ell_2]})^T (x^{(1)} \oplus x^{(2)}) - 1 \end{aligned}$$

(3.28)

$$\begin{aligned} &\le \alpha_{t+1, \ell_2}^{(k)} (V_\bullet (A_1), V_\bullet (A_2)) - 1 \end{aligned}$$

(3.29)

$$\begin{aligned} &= \alpha_{t+1, \ell_2}^{(k)} (V'_\bullet (A_1), V_\bullet (A_2)) - 1 \end{aligned}$$

(3.30)

$$\begin{aligned} &\le \alpha_{t, \ell_2}^{(k)} (V'_\bullet (A_1), V_\bullet (A_2)). \end{aligned}$$

(3.31)

Hence, \(x'^{(1)} \oplus x^{(2)} \in \mathcal{P}_1^{k,d} (V'_\bullet (A_1), V_\bullet (A_2))\). These considerations extend to cases where there are other degeneracies in either \(A_1\) or \(A_2\).
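The two-coordinate reshuffle of Eqs. 3.15 and 3.16 is easy to check mechanically: it preserves the total mass of coordinates \(t\) and \(t+1\) (hence the basic constraints and, when \(\lambda_t(A_1) = \lambda_{t+1}(A_1)\), the objective), while prefix sums ending at \(t\) can only decrease. A small pure-Python sketch on hypothetical fractional coordinates:

```python
def reshuffle(x, t):
    """Apply Eqs. 3.15-3.16 at 0-indexed position t: shift mass from
    coordinate t to coordinate t+1, capping the latter at 1."""
    y = list(x)
    total = x[t] + x[t + 1]
    y[t + 1] = min(1.0, total)
    y[t] = total - y[t + 1]
    return y

x = [0.9, 0.7, 0.8, 0.6]       # a block of P_0^{3,4}: entries in [0,1], sum 3
y = reshuffle(x, 1)
assert abs(sum(y) - sum(x)) < 1e-12            # mass preserved
assert all(0.0 <= v <= 1.0 for v in y)         # basic constraints still hold
assert sum(y[:2]) <= sum(x[:2])                # the [t]-prefix sum only decreases
assert abs(sum(y[:3]) - sum(x[:3])) < 1e-12    # every other prefix sum unchanged
```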

3.1 Total Unimodularity of the Constraints

A useful observation about the alignment constraints is that vectors of the form \(\texttt{1}_{[\ell_1]} \oplus \texttt{1}_{[\ell_2]}\) can be made to have the consecutive ones property by a single basis permutation. Let \(\sigma : [d] \rightarrow [d]\) denote the mirror permutation with the action \(\ell \mapsto d - \ell + 1\) for each \(\ell \in [d]\). Applying this permutation to the first \(d\) elements of the basis of \(\mathbb{R}^{2d}\), corresponding to the first summand, yields

$$\begin{aligned} \texttt{1}_{[\ell_1]} \oplus \texttt{1}_{[\ell_2]} \mapsto \texttt{1}_{\sigma([\ell_1])} \oplus \texttt{1}_{[\ell_2]} = (0, \ldots, 0, \underbrace{1, \ldots, 1}_{\ell_1 \text{ ones}} \mid \underbrace{1, \ldots, 1}_{\ell_2 \text{ ones}}, 0, \ldots, 0). \end{aligned}$$

(3.32)

The basic constraints are also given by vectors that have the consecutive ones property.

This is significant because a matrix with the consecutive ones property, also known as an interval matrix, is totally unimodular. That means the determinant of every one of its square submatrices equals \(+1, 0,\) or \(-1\). This implies that if the alignment terms are integers, then the extremal points of \(\mathcal{P}_1^{k,d}\) are integral, meaning their coordinates are integers (see Ch. 19 of Ref. [21]). From the basic constraints, this means that their coordinates are equal to \(0\) or \(1\). Since a submatrix of a totally unimodular matrix is totally unimodular, the same is true for the extremal points of any polyhedron in \(\mathcal{P}_0^{k,d}\) that is defined by a subset of the alignment constraints.
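On toy instances, total unimodularity can be confirmed directly: stack the mirrored alignment rows and the two block-sum rows into a matrix and check every square submatrix determinant by brute force. A numpy sketch (\(d = 3\) keeps the enumeration small; the function name is ours):

```python
import itertools
import numpy as np

def is_totally_unimodular(M):
    """Brute force: every square submatrix determinant must be -1, 0, or +1."""
    nrows, ncols = M.shape
    for r in range(1, min(nrows, ncols) + 1):
        for rows in itertools.combinations(range(nrows), r):
            for cols in itertools.combinations(range(ncols), r):
                if round(np.linalg.det(M[np.ix_(rows, cols)])) not in (-1, 0, 1):
                    return False
    return True

d = 3
rows = []
for l1 in range(1, d + 1):
    for l2 in range(1, d + 1):
        mirrored = [0] * (d - l1) + [1] * l1   # sigma moves the ones of 1_[l1] to the end
        plain = [1] * l2 + [0] * (d - l2)      # 1_[l2] keeps its ones at the start
        rows.append(mirrored + plain)          # one contiguous run of ones per row
rows.append([1] * d + [0] * d)                 # block-sum constraint, first summand
rows.append([0] * d + [1] * d)                 # block-sum constraint, second summand
M = np.array(rows, dtype=float)
assert is_totally_unimodular(M)                # interval matrices are TU
```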

3.2 The Meaning of Alignment Terms

We move on to discussing the meaning of alignment terms. To start, we observe that Fan’s maximum principle Eq. 2.2 implies the upper bound

$$\begin{aligned} \alpha_{\ell_1, \ell_2}^{(k)} \le \min (\ell_1, k) + \min (\ell_2, k), \end{aligned}$$

(3.33)

with equality if \(P_{\ell_1} (A_1)\) and \(P_{\ell_2} (A_2)\) are perfectly aligned. We observe that Lidskii’s majorization relation (see Sec. III.4 of Ref. [4]) implies the lower bound

$$\begin{aligned} \alpha_{\ell_1, \ell_2}^{(k)} \ge \min (k, \max (0, \ell_1 + \ell_2 - d)) + \min (k, \ell_1 + \ell_2). \end{aligned}$$

(3.34)

When \(\ell_1 + \ell_2 \le k\), the upper bound and lower bound coincide, implying the identity \(\alpha_{\ell_1, \ell_2}^{(k)} = \ell_1 + \ell_2\). This says that the cases where some information about the operators \(A_1\) and \(A_2\) is retained are the ones where \(\ell_1 + \ell_2 > k\).

If the upper bound Eq. 3.33 is satisfied with equality, then the alignment constraint corresponding to the pair \((\ell_1, \ell_2)\) is redundant within \(\mathcal{P}_0^{k,d}\). On the other hand, the slackness statement \(\lceil \alpha_{\ell_1, \ell_2}^{(k)} \rceil < \min (\ell_1, k) + \min (\ell_2, k)\) implies an upper bound on \(s_k (A_1 + A_2)\) with a combinatorial meaning. In words, \(\lceil \alpha_{\ell_1, \ell_2}^{(k)} \rceil - k\) is the maximum number of eigenvalue pairs whose indices are in \([\ell_1]\) and \([\ell_2]\), respectively, that can appear (with coefficient \(1\)) together in the objective \((\lambda (A_1) \oplus \lambda (A_2))^T x\). The remaining eigenvalue pairs have to be staggered. Notice that, per the discussion in the previous paragraph, the slackness statement implies \(\ell_1 + \ell_2 > k\).

Proposition 3.3

Let \(m := \lceil \alpha_{\ell_1, \ell_2}^{(k)} \rceil - k\) and suppose that it satisfies \(m < \min (\ell_1, k) + \min (\ell_2, k) - k\). Denote the tuple of staggered eigenvalue sums

$$\begin{aligned} \mu := (\lambda_i (A_1) + \lambda_{\ell_2 + i - m} (A_2))_{i=m+1}^k \oplus (\lambda_{\ell_1 + i - m} (A_1) + \lambda_i (A_2))_{i=m+1}^k. \end{aligned}$$

(3.35)

Then \(s_k (A_1 + A_2) \le s_m (A_1) + s_m (A_2) + s_{k-m} (\mu)\).

Proof

By Thm. 3.2, it suffices to show that the optimal value of the linear program

$$\begin{aligned} &\max \, (\lambda(A_1) \oplus \lambda(A_2))^T x\\ &\, \text{s.t.} \, x \in \mathcal{P}_0^{k,d}, \, (\texttt{1}_{[\ell_1]} \oplus \texttt{1}_{[\ell_2]})^T x \le m + k \end{aligned}$$

(3.36)

is \(s_m (A_1) + s_m (A_2) + s_{k-m} (\mu)\). By total unimodularity, the extremal points of the feasible set of this program are integral (see Sec. 3.1). Hence, it suffices to maximize \((\lambda (A_1) \oplus \lambda (A_2))^T (\texttt{1}_{S_1} \oplus \texttt{1}_{S_2})\) over subset pairs \((S_1, S_2)\) that satisfy \(S_1, S_2 \subseteq [d]\), \(|S_1| = |S_2| = k\), and \(|S_1 \cap [\ell_1]| + |S_2 \cap [\ell_2]| \le m + k\).

Given such a \(k\)-set pair \((S_1, S_2)\), we may assume without loss of generality that \(S_1 \cap [\ell_1] = [r_1]\) and \(S_2 \cap [\ell_2] = [r_2]\), where \(r_1 := |S_1 \cap [\ell_1]|\) and \(r_2 := |S_2 \cap [\ell_2]|\). Otherwise, the intersections \(S_1 \cap [\ell_1]\) and \(S_2 \cap [\ell_2]\) may be adjusted to improve the objective value. By the same token, we may assume that \(S_1 \cap [\ell_1]^c = \{\ell_1 + 1, \ldots, \ell_1 + k - r_1\}\) and \(S_2 \cap [\ell_2]^c = \{\ell_2 + 1, \ldots, \ell_2 + k - r_2\}\). Lastly, we argue that the alignment constraint may be assumed to be satisfied with equality. If \(|S_1 \cap [\ell_1]| + |S_2 \cap [\ell_2]| < m + k\), then \(S_1\) may be adjusted to improve the objective value by replacing elements in \(S_1 \cap [\ell_1]^c\) with elements in \([\ell_1]\) until the constraint is satisfied with equality or \(S_1 \cap [\ell_1]^c\) becomes empty. If the latter happens before the former, \(S_2\) may be adjusted to improve the objective value by replacing elements in \(S_2 \cap [\ell_2]^c\) with elements in \([\ell_2]\) until the constraint is satisfied with equality. Hence, without loss of generality, we assume that \(r_1 + r_2 = m + k\). From the fact that \(\max (r_1, r_2) \le k\), we conclude that \(m \le \min (r_1, r_2)\).

With these assumptions on \((S_1, S_2)\), the objective function is

$$\begin{aligned} (\lambda (A_1) \oplus \lambda (A_2))^T (\texttt{1}_{S_1} \oplus \texttt{1}_{S_2}) &= s_m (A_1) + s_m (A_2) \\ &\quad + \sum_{i=m+1}^{r_1} \lambda_i (A_1) + \sum_{i=m+1}^{r_1} \lambda_{\ell_2 + i - m} (A_2) \\ &\quad + \sum_{i=m+1}^{r_2} \lambda_{\ell_1 + i - m} (A_1) + \sum_{i=m+1}^{r_2} \lambda_i (A_2). \end{aligned}$$

(3.37)

We rewrite the sums in the second and third lines.

$$\begin{aligned} &\sum_{i=m+1}^{r_1} \lambda_i (A_1) + \sum_{i=m+1}^{r_1} \lambda_{\ell_2 + i - m} (A_2) = \sum_{i=m+1}^{r_1} \big( \lambda_i (A_1) + \lambda_{\ell_2 + i - m} (A_2) \big), \\ &\sum_{i=m+1}^{r_2} \lambda_{\ell_1 + i - m} (A_1) + \sum_{i=m+1}^{r_2} \lambda_i (A_2) = \sum_{i=m+1}^{r_2} \big( \lambda_{\ell_1 + i - m} (A_1) + \lambda_i (A_2) \big). \end{aligned}$$

(3.38)

Then, by the definition of \(\mu \), it holds that

$$\begin{aligned} \sum_{i=m+1}^{r_1} \big( \lambda_i (A_1) + \lambda_{\ell_2 + i - m} (A_2) \big) + \sum_{i=m+1}^{r_2} \big( \lambda_{\ell_1 + i - m} (A_1) + \lambda_i (A_2) \big) \le s_{k-m} (\mu). \end{aligned}$$

(3.39)

Equality is realized by maximizing over \(r_1\). \(\square \)
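Prop. 3.3 can be stress-tested numerically: for each pair \((\ell_1, \ell_2)\) with slack, compute \(m = \lceil \alpha \rceil - k\), form \(\mu\) as in Eq. 3.35, and compare. A numpy sketch (real symmetric inputs; pairs failing the slackness condition, or for which the staggered indices would leave \([d]\), are skipped):

```python
import numpy as np

def desc_eigs(A):
    """Eigenvalues of a symmetric matrix in descending order."""
    return np.sort(np.linalg.eigvalsh(A))[::-1]

def s(v, k):
    """s_k of a spectrum/tuple: sum of its k largest entries (s_0 = 0)."""
    return np.sort(np.asarray(v))[::-1][:k].sum() if k > 0 else 0.0

def alignment(A1, A2, k, l1, l2):
    def proj(A, l):
        _, V = np.linalg.eigh(A)
        U = V[:, ::-1][:, :l]
        return U @ U.T
    return s(np.linalg.eigvalsh(proj(A1, l1) + proj(A2, l2)), k)

rng = np.random.default_rng(2)
d, k = 6, 3
A1 = rng.standard_normal((d, d)); A1 = (A1 + A1.T) / 2
A2 = rng.standard_normal((d, d)); A2 = (A2 + A2.T) / 2
lam1, lam2 = desc_eigs(A1), desc_eigs(A2)
target = s(np.linalg.eigvalsh(A1 + A2), k)     # s_k(A1 + A2)

for l1 in range(1, d + 1):
    for l2 in range(1, d + 1):
        m = int(np.ceil(alignment(A1, A2, k, l1, l2) - 1e-9)) - k
        if not 0 <= m < min(l1, k) + min(l2, k) - k:
            continue                           # slackness condition fails
        if l1 + k - m > d or l2 + k - m > d:
            continue                           # staggered indices would leave [d]
        mu = np.concatenate([lam1[m:k] + lam2[l2:l2 + k - m],
                             lam1[l1:l1 + k - m] + lam2[m:k]])
        assert target <= s(lam1, m) + s(lam2, m) + s(mu, k - m) + 1e-8
```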

3.3 Simultaneously Diagonalizable Summands

In this subsection, we outline a strategy for proving that the linear programming upper bound given by Thm. 3.2 is tight in the case of simultaneously diagonalizable summands. We complete the proof in App. A.

Let \(D_1\) and \(D_2\) be two simultaneously diagonalizable self-adjoint operators on \(\mathbb{C}^d\). Without loss of generality, we assume that \(D_1\) and \(D_2\) are diagonal in the standard basis \(\{q_i\}_{i=1}^d\). Furthermore, we take \(V_\bullet (D_1)\) and \(V_\bullet (D_2)\) to be complete flags corresponding to diagonal spectral decompositions of \(D_1\) and \(D_2\), respectively. For each \(i \in [d]\), we denote \(\tilde{\lambda}_i (D_1) := q_i^* D_1 q_i\) and \(\tilde{\lambda}_i (D_2) := q_i^* D_2 q_i\). To capture the order of the eigenvalues, we define the sets

$$\begin{aligned} \Omega^1_\ell := \{ i \in [d] : P_\ell (D_1) q_i = q_i \}, \quad \Omega^2_\ell := \{ i \in [d] : P_\ell (D_2) q_i = q_i \} \end{aligned}$$

(3.40)

for each \(\ell \in [d]\). These are the sets of coordinates of the largest \(\ell\) values in \(\tilde{\lambda} (D_1) := (\tilde{\lambda}_i (D_1))_{i=1}^d\) and \(\tilde{\lambda} (D_2) := (\tilde{\lambda}_i (D_2))_{i=1}^d\), respectively. We use the two subset chains \(\{\Omega^1_\ell\}_{\ell=1}^d\) and \(\{\Omega^2_\ell\}_{\ell=1}^d\) to define two total orders, \(\ge^1\) and \(\ge^2\), on \([d]\). For \(x, x' \in [d]\), we write \(x \ge^1 x'\) if

$$\begin{aligned} \min \{ \ell : x \in \Omega^1_\ell \} \ge \min \{ \ell : x' \in \Omega^1_\ell \}. \end{aligned}$$

(3.41)

Similarly, we write \(x \ge ^2 x'\) if

$$\begin{aligned} \min \{ \ell : x \in \Omega^2_\ell \} \ge \min \{ \ell : x' \in \Omega^2_\ell \}. \end{aligned}$$

(3.42)

Observe that \(x \ge^1 x'\) implies \(\tilde{\lambda}_x (D_1) \le \tilde{\lambda}_{x'} (D_1)\) and \(x \ge^2 x'\) implies \(\tilde{\lambda}_x (D_2) \le \tilde{\lambda}_{x'} (D_2)\).

Now, we compute the alignment terms of \(D_1\) and \(D_2\).

Lemma 3.4

For each \( (\ell _1, \ell _2) \in [d] \times [d]\), it holds that

$$\begin{aligned} \alpha_{\ell_1, \ell_2}^{(k)} (V_\bullet (D_1), V_\bullet (D_2)) = \min (k, | \Omega^1_{\ell_1} \cap \Omega^2_{\ell_2} |) + \min (k, |\Omega^1_{\ell_1} \cup \Omega^2_{\ell_2}|). \end{aligned}$$

(3.43)

Proof

Recall that

$$\begin{aligned} \alpha_{\ell_1, \ell_2}^{(k)} (V_\bullet (D_1), V_\bullet (D_2)) &= \max_{W \le \mathbb{C}^d, \, \dim W = k} \mathrm{tr}(P_W ( P_{\ell_1} (D_1) + P_{\ell_2} (D_2))) \end{aligned}$$

(3.44)

$$\begin{aligned} &= s_k ( P_{\ell_1} (D_1) + P_{\ell_2} (D_2)) \end{aligned}$$

(3.45)

$$\begin{aligned} &= s_k \Big( \sum_{i \in \Omega^1_{\ell_1}} q_i q_i^* + \sum_{i \in \Omega^2_{\ell_2}} q_i q_i^* \Big). \end{aligned}$$

(3.46)

If \(k \le |\Omega^1_{\ell_1} \cap \Omega^2_{\ell_2}|\), then it equals \(2k\). If \(|\Omega^1_{\ell_1} \cap \Omega^2_{\ell_2}| < k < |\Omega^1_{\ell_1} \cup \Omega^2_{\ell_2}|\), then it equals \(2 |\Omega^1_{\ell_1} \cap \Omega^2_{\ell_2}| + k - |\Omega^1_{\ell_1} \cap \Omega^2_{\ell_2}| = |\Omega^1_{\ell_1} \cap \Omega^2_{\ell_2}| + k\). Lastly, if \(k \ge |\Omega^1_{\ell_1} \cup \Omega^2_{\ell_2}|\), then it equals \(|\Omega^1_{\ell_1}| + |\Omega^2_{\ell_2}|\), which equals \(|\Omega^1_{\ell_1} \cap \Omega^2_{\ell_2}| + |\Omega^1_{\ell_1} \cup \Omega^2_{\ell_2}|\). \(\square\)
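The case analysis is easy to replay numerically: for diagonal projectors, the operator in Eq. 3.46 is diagonal with entries \(2\) on the intersection, \(1\) on the symmetric difference, and \(0\) elsewhere. A short pure-Python check on explicit toy sets:

```python
def s_k_diag(diag, k):
    """s_k of a diagonal operator: sum of its k largest diagonal entries."""
    return sum(sorted(diag, reverse=True)[:k])

d = 5
Omega1 = {0, 1, 2}     # a stand-in for Omega^1_{l1} (toy choice)
Omega2 = {2, 3}        # a stand-in for Omega^2_{l2}
diag = [(i in Omega1) + (i in Omega2) for i in range(d)]   # entries 2, 1, or 0

for k in range(1, d + 1):     # hits all three cases of Lemma 3.4 as k grows
    assert s_k_diag(diag, k) == \
        min(k, len(Omega1 & Omega2)) + min(k, len(Omega1 | Omega2))
```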

Since the alignment terms are integers, the coordinates of the extremal points of \(\mathcal{P}_1^{k,d} (V_\bullet (D_1), V_\bullet (D_2))\) are in \(\{0, 1\}\). Therefore, the linear programming upper bound \(u_k (D_1, D_2)\) equals the optimal value of the following combinatorial optimization problem:

$$\begin{aligned} \max \; (\tilde{\lambda} (D_1) \oplus \tilde{\lambda} (D_2))^T (\texttt{1}_{S_1} \oplus \texttt{1}_{S_2}) \end{aligned}$$

(3.47)

over subset pairs \((S_1, S_2)\) satisfying \(S_1, S_2 \subseteq [d]\), \(|S_1| = |S_2| = k\), and for all \((\ell_1, \ell_2) \in [d] \times [d]\),

$$\begin{aligned} |S_1 \cap \Omega^1_{\ell_1}| + |S_2 \cap \Omega^2_{\ell_2}| \le \min (k, | \Omega^1_{\ell_1} \cap \Omega^2_{\ell_2} |) + \min (k, |\Omega^1_{\ell_1} \cup \Omega^2_{\ell_2}|). \end{aligned}$$

(3.48)

We call a subset pair \((S_1, S_2)\) that satisfies all of these constraints a feasible pair. Notice that the objective here differs from the one in Thm. 3.2 by a relabeling of the coordinates of the linear form.

Next, we observe that \(s_k(D_1 + D_2)\) equals

$$\begin{aligned} \max_{S \subseteq [d], |S| = k} \sum_{i \in S} q_i^* (D_1 + D_2) q_i = \max_{S \subseteq [d], |S| = k} \sum_{i \in S} \big( \tilde{\lambda}_i (D_1) + \tilde{\lambda}_i (D_2) \big), \end{aligned}$$

(3.49)

which can be rewritten as follows

$$\begin{aligned} \max_{S \subseteq [d], |S| = k} \; (\tilde{\lambda} (D_1) \oplus \tilde{\lambda} (D_2))^T (\texttt{1}_S \oplus \texttt{1}_S). \end{aligned}$$

(3.50)

We call set pairs of the form \((S, S)\) symmetric. Therefore, to prove that \(u_k (D_1, D_2) = s_k (D_1 + D_2)\), it suffices to show that for an arbitrary feasible pair \((S_1, S_2)\), there exists a feasible symmetric pair \((S, S)\) such that the objective Eq. 3.47 at \((S,S)\) is greater than or equal to the objective at \((S_1, S_2)\).
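For small diagonal instances, this reduction can be carried out exhaustively: enumerate all pairs \((S_1, S_2)\) satisfying Eq. 3.48 and compare the best feasible value with \(s_k(D_1 + D_2)\). A numpy/itertools sketch (toy spectra with distinct entries; all names are ours):

```python
import itertools
import numpy as np

t1 = np.array([4.0, 1.0, 3.0, 2.0])    # diagonal of D1, i.e. tilde-lambda(D1)
t2 = np.array([1.0, 2.0, 4.0, 3.0])    # diagonal of D2
d, k = 4, 2

# Omega^j_l = coordinates of the l largest diagonal entries, l = 1..d.
Om1 = [set(np.argsort(-t1)[:l]) for l in range(1, d + 1)]
Om2 = [set(np.argsort(-t2)[:l]) for l in range(1, d + 1)]

def feasible(S1, S2):
    """All alignment constraints of Eq. 3.48, with Lemma 3.4 on the right."""
    return all(
        len(S1 & Om1[a]) + len(S2 & Om2[b])
        <= min(k, len(Om1[a] & Om2[b])) + min(k, len(Om1[a] | Om2[b]))
        for a in range(d) for b in range(d))

subsets = [set(S) for S in itertools.combinations(range(d), k)]
u = max(t1[sorted(S1)].sum() + t2[sorted(S2)].sum()
        for S1 in subsets for S2 in subsets if feasible(S1, S2))
s = np.sort(t1 + t2)[::-1][:k].sum()   # s_k(D1 + D2): best symmetric pair
assert np.isclose(u, s)                # the bound is tight here, cf. Thm. 3.5
```

Note that symmetric pairs are always feasible, since they arise from actual coordinate subspaces \(W\), so the maximum above is well defined.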

Theorem 3.5

It holds that

$$\begin{aligned} u_k (D_1, D_2) = s_k (D_1 + D_2). \end{aligned}$$

(3.51)

Proof

Deferred to App. A. \(\square\)
