proof

To prove [math]\displaystyle{ \lambda_{\max}(\sum_{j \in I_k \cup \{i\}} a_ja_j^T) \geq \lambda_{\max}(\sum_{j \in I_k} a_ja_j^T) + (x_k^Ta_i)^2 }[/math] we begin by finding the subgradient of the function [math]\displaystyle{ \lambda_{\max}(\cdot) }[/math] at the matrix [math]\displaystyle{ X \in S_n }[/math]. To do this we must first define the subgradient.

Definition: A matrix [math]\displaystyle{ V \in R^{n \times n} }[/math] is a subgradient of a convex function [math]\displaystyle{ f: R^{n\times n} \rightarrow R }[/math] at [math]\displaystyle{ X }[/math] if

[math]\displaystyle{ f(Y) \geq f(X) + \left\langle V, Y-X \right\rangle \; \forall \; Y \in \textrm{dom}(f) }[/math]

where [math]\displaystyle{ \left\langle \cdot, \cdot \right\rangle }[/math] is the inner product.

The function we are interested in is [math]\displaystyle{ \lambda_{\max}(\cdot) }[/math]. In order to define the subgradient of this function we must first verify that it is convex. If [math]\displaystyle{ X \in S_n }[/math] then

[math]\displaystyle{ \lambda_{\max}(X) = \max_{Y\succeq 0, \textrm{trace}(Y) = 1} \textrm{trace}(YX). }[/math]

Thus [math]\displaystyle{ \,\lambda_{\max}(X) }[/math] is the maximum of a collection of linear functions, which implies that [math]\displaystyle{ \lambda_{\max}(\cdot) }[/math] is a convex function.
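
This characterization can be spot-checked numerically. The sketch below is not part of the original proof; it assumes numpy is available, and all variable names are illustrative. It draws a random symmetric [math]\displaystyle{ X }[/math], confirms that [math]\displaystyle{ \textrm{trace}(YX) }[/math] never exceeds [math]\displaystyle{ \lambda_{\max}(X) }[/math] over random feasible [math]\displaystyle{ Y }[/math], and confirms that the maximum is attained at [math]\displaystyle{ Y = qq^T }[/math] for a unit top eigenvector [math]\displaystyle{ q }[/math].

<pre>
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
X = (A + A.T) / 2                              # random symmetric matrix

lam_max = np.linalg.eigvalsh(X)[-1]            # largest eigenvalue of X

# trace(YX) <= lambda_max(X) for random feasible Y (Y PSD, trace(Y) = 1)
for _ in range(1000):
    B = rng.standard_normal((n, n))
    Y = B @ B.T                                # positive semidefinite
    Y /= np.trace(Y)                           # normalize so trace(Y) = 1
    assert np.trace(Y @ X) <= lam_max + 1e-9

# the maximum is attained at Y = q q^T, with q a unit top eigenvector of X
q = np.linalg.eigh(X)[1][:, -1]
print(np.isclose(np.trace(np.outer(q, q) @ X), lam_max))   # True
</pre>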

So, according to our definition of subgradient, we are looking for a matrix [math]\displaystyle{ \,V }[/math] such that

[math]\displaystyle{ \lambda_{\max}(Y) \geq \lambda_{\max}(X) + \left\langle V, Y-X \right\rangle. }[/math]

Since the trace defines an inner product on the space of symmetric matrices we can rewrite this problem as

[math]\displaystyle{ \lambda_{\max}(Y) \geq \lambda_{\max}(X) + \textrm{trace}(V(Y-X)). }[/math]

Suppose [math]\displaystyle{ \,q \in R^n }[/math] is an eigenvector of [math]\displaystyle{ \,X }[/math] associated with the largest eigenvalue and [math]\displaystyle{ \,\|q\|_2 = 1 }[/math]. Since [math]\displaystyle{ \,qq^T \succeq 0 }[/math] and [math]\displaystyle{ \textrm{trace}(qq^T) = 1 }[/math], the characterization above gives [math]\displaystyle{ \lambda_{\max}(Y) \geq \textrm{trace}(Yqq^T) }[/math], and hence

[math]\displaystyle{ \begin{align} \lambda_{\max}(Y) & \geq \textrm{trace}(Yqq^T) = \textrm{trace}(Xqq^T) + \textrm{trace}((Y-X)qq^T) \\ &= \lambda_{\max}(X) + \textrm{trace}({(Y-X)qq^T})\\ &= \lambda_{\max}(X) + \textrm{trace}(q^T(Y-X)q) \end{align} }[/math]

Thus [math]\displaystyle{ \,V = qq^T }[/math] is a subgradient of [math]\displaystyle{ \lambda_{\max}(\cdot) }[/math] at [math]\displaystyle{ \,X }[/math], where [math]\displaystyle{ q \in R^n }[/math] is an eigenvector of [math]\displaystyle{ \,X }[/math] associated with the largest eigenvalue and [math]\displaystyle{ \|q\|_2 = 1 }[/math].
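
As a sanity check, the subgradient inequality with [math]\displaystyle{ \,V = qq^T }[/math] can be verified numerically on random symmetric matrices. This is only an illustrative sketch, assuming numpy; it is not part of the proof.

<pre>
import numpy as np

rng = np.random.default_rng(1)
n = 6

def lam_max(M):
    return np.linalg.eigvalsh(M)[-1]           # largest eigenvalue (M symmetric)

for _ in range(1000):
    X = rng.standard_normal((n, n)); X = (X + X.T) / 2
    Y = rng.standard_normal((n, n)); Y = (Y + Y.T) / 2
    q = np.linalg.eigh(X)[1][:, -1]            # unit top eigenvector of X
    V = np.outer(q, q)                         # candidate subgradient V = q q^T
    # lambda_max(Y) >= lambda_max(X) + trace(V(Y - X))
    assert lam_max(Y) >= lam_max(X) + np.trace(V @ (Y - X)) - 1e-9
print("subgradient inequality held on all trials")
</pre>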

Now let [math]\displaystyle{ X = \sum_{j \in I_k} a_ja_j^T }[/math], [math]\displaystyle{ Y = \sum_{j \in I_k \cup \{i\}} a_ja_j^T }[/math], and let [math]\displaystyle{ \,x_k }[/math] be a unit eigenvector of [math]\displaystyle{ \sum_{j \in I_k} a_ja_j^T }[/math] associated with the largest eigenvalue. Then the subgradient inequality gives

[math]\displaystyle{ \begin{align} \lambda_{\max}(\sum_{j \in I_k \cup \{i\}} a_ja_j^T) & \geq \lambda_{\max}(\sum_{j \in I_k} a_ja_j^T) + \textrm{trace}(x_k^T(\sum_{j \in I_k \cup \{i\}} a_ja_j^T-\sum_{j \in I_k} a_ja_j^T)x_k) \\ & = \lambda_{\max}(\sum_{j \in I_k} a_ja_j^T) + \textrm{trace}(x_k^T(a_ia_i^T)x_k) \\ & = \lambda_{\max}(\sum_{j \in I_k} a_ja_j^T) + (x_k^Ta_i)^2 \end{align} }[/math]

Thus, [math]\displaystyle{ \lambda_{\max}(\sum_{j \in I_k \cup \{i\}} a_ja_j^T) \geq \lambda_{\max}(\sum_{j \in I_k} a_ja_j^T) + (x_k^Ta_i)^2 }[/math].
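
The final inequality can also be spot-checked numerically with random vectors [math]\displaystyle{ \,a_j }[/math] and random index sets. The sketch below is illustrative only; it assumes numpy, and the sizes and names are arbitrary choices rather than anything from the original.

<pre>
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 8
A = rng.standard_normal((m, n))                # rows play the role of a_1, ..., a_m

for _ in range(1000):
    I_k = rng.choice(m, size=4, replace=False)             # current index set I_k
    i = rng.choice([j for j in range(m) if j not in I_k])  # new index i not in I_k
    S_k = sum(np.outer(A[j], A[j]) for j in I_k)           # sum over I_k
    S_ki = S_k + np.outer(A[i], A[i])                      # sum over I_k union {i}
    x_k = np.linalg.eigh(S_k)[1][:, -1]                    # unit top eigenvector of S_k
    lhs = np.linalg.eigvalsh(S_ki)[-1]
    rhs = np.linalg.eigvalsh(S_k)[-1] + (x_k @ A[i]) ** 2
    assert lhs >= rhs - 1e-9
print("inequality held on all trials")
</pre>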