proof of Lemma 1

We seek [math]\displaystyle{ \textbf{u} }[/math] that minimizes

[math]\displaystyle{ -\textbf{u}^T\textbf{a} \; \textrm{ subject } \; \textrm{ to } \; \|\textbf{u}\|^2_2 \leq 1, \; \|\textbf{u}\|_1 \leq c_1 }[/math]

First we rewrite the criterion using Lagrange multipliers

[math]\displaystyle{ -\textbf{u}^T\textbf{a} + \lambda\|\textbf{u}\|^2_2 + \Delta\|\textbf{u}\|_1, }[/math]

and we differentiate, set the derivative to 0 and solve for [math]\displaystyle{ \textbf{u} }[/math]:

[math]\displaystyle{ 0 = -\textbf{a} + 2\lambda\textbf{u} + \Delta\Gamma, }[/math]

where [math]\displaystyle{ \ \Gamma_i = \textrm{sgn}(u_i) }[/math] if [math]\displaystyle{ u_i \neq 0 }[/math]; otherwise [math]\displaystyle{ \Gamma_i \in [-1, 1] }[/math]. The Karush-Kuhn-Tucker conditions for optimality consist of the previous equation, along with [math]\displaystyle{ \lambda(\|\textbf{u}\|^2_2 - 1) = 0 }[/math] and [math]\displaystyle{ \Delta(\|\textbf{u}\|_1 - c_1) = 0 }[/math]. Now if [math]\displaystyle{ \ \lambda \gt 0 }[/math], we have

[math]\displaystyle{ \textbf{u} = \frac{S(\textbf{a},\Delta)}{2\lambda} }[/math]

In general, we have either [math]\displaystyle{ \ \lambda = 0 }[/math] (if this results in a feasible solution) or [math]\displaystyle{ \ \lambda }[/math] must be chosen such that [math]\displaystyle{ \|\textbf{u}\|^2_2 = 1 }[/math]. So we see that

[math]\displaystyle{ \textbf{u} = \frac{S(\textbf{a},\Delta)}{\|S(\textbf{a},\Delta) \|_2} }[/math]

Again by the Karush-Kuhn-Tucker conditions, either [math]\displaystyle{ \ \Delta = 0 }[/math] (if this results in a feasible solution) or [math]\displaystyle{ \ \Delta }[/math] must be chosen such that [math]\displaystyle{ \|\textbf{u}\|_1 = c_1 }[/math]. So, [math]\displaystyle{ \ \Delta = 0 }[/math] if this results in [math]\displaystyle{ \|\textbf{u}\|_1 \leq c_1 }[/math]; otherwise we choose [math]\displaystyle{ \ \Delta }[/math] such that [math]\displaystyle{ \|\textbf{u}\|_1 = c_1 }[/math]. This completes the proof of the Lemma.

The above proof is proof of Lemma 2.2 in <ref name="WTH2009">Daniela M. Witten, Robert Tibshirani, and Trevor Hastie. (2009) "A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis". Biostatistics, 10(3):515–534.</ref>

References

proof of Lemma 1

References

Navigation menu

Search