Proof of Lemma 1
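We seek [math]\displaystyle{ \textbf{u} }[/math] that minimizes
[math]\displaystyle{ -\textbf{u}^T\textbf{a} \quad \textrm{subject to} \quad \|\textbf{u}\|^2_2 \leq 1, \ \|\textbf{u}\|_1 \leq c_1, }[/math]
where [math]\displaystyle{ \textbf{a} }[/math] is the fixed vector in the objective of the lemma (for the penalized matrix decomposition, [math]\displaystyle{ \textbf{a} = \textbf{X}\textbf{v} }[/math] with [math]\displaystyle{ \textbf{v} }[/math] held fixed).
First we rewrite the criterion using Lagrange multipliers [math]\displaystyle{ \ \lambda \geq 0 }[/math] and [math]\displaystyle{ \ \Delta \geq 0 }[/math]:
[math]\displaystyle{ -\textbf{u}^T\textbf{a} + \lambda\|\textbf{u}\|^2_2 + \Delta\|\textbf{u}\|_1, }[/math]
and we differentiate, set the derivative to 0 and solve for [math]\displaystyle{ \textbf{u} }[/math]:
[math]\displaystyle{ 0 = -\textbf{a} + 2\lambda\textbf{u} + \Delta\Gamma, }[/math]
where [math]\displaystyle{ \ \Gamma_i = \textrm{sgn}(u_i) }[/math] if [math]\displaystyle{ u_i \neq 0 }[/math]; otherwise [math]\displaystyle{ \Gamma_i \in [-1, 1] }[/math]. The Karush-Kuhn-Tucker conditions for optimality consist of the previous equation, along with [math]\displaystyle{ \lambda(\|\textbf{u}\|^2_2 - 1) = 0 }[/math] and [math]\displaystyle{ \Delta(\|\textbf{u}\|_1 - c_1) = 0 }[/math]. Now if [math]\displaystyle{ \ \lambda \gt 0 }[/math], we have
[math]\displaystyle{ \textbf{u} = \frac{S(\textbf{a}, \Delta)}{2\lambda}, }[/math]
where [math]\displaystyle{ S(a, \Delta) = \textrm{sgn}(a)(|a| - \Delta)_+ }[/math] is the soft-thresholding operator, applied componentwise.
In general, we have either [math]\displaystyle{ \ \lambda = 0 }[/math] (if this results in a feasible solution) or [math]\displaystyle{ \ \lambda }[/math] must be chosen such that [math]\displaystyle{ \|\textbf{u}\|^2_2 = 1 }[/math]. So we see that
[math]\displaystyle{ \textbf{u} = \frac{S(\textbf{a}, \Delta)}{\|S(\textbf{a}, \Delta)\|_2}. }[/math]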
Again by the Karush-Kuhn-Tucker conditions, either [math]\displaystyle{ \ \Delta = 0 }[/math] (if this results in a feasible solution) or [math]\displaystyle{ \ \Delta }[/math] must be chosen such that [math]\displaystyle{ \|\textbf{u}\|_1 = c_1 }[/math]. So, [math]\displaystyle{ \ \Delta = 0 }[/math] if this results in [math]\displaystyle{ \|\textbf{u}\|_1 \leq c_1 }[/math]; otherwise we choose [math]\displaystyle{ \ \Delta }[/math] such that [math]\displaystyle{ \|\textbf{u}\|_1 = c_1 }[/math]. This completes the proof of the Lemma.
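The case analysis on [math]\displaystyle{ \ \Delta }[/math] can be illustrated numerically. The following is a minimal sketch, not code from the cited paper's software. It assumes the objective vector is given as a plain list <code>a</code>, that [math]\displaystyle{ c_1 \geq 1 }[/math], and that the largest entry of <code>a</code> in absolute value is unique. The names <code>soft_threshold</code>, <code>l2_normalize</code> and <code>solve_u</code> are hypothetical helpers introduced for illustration. It first tries [math]\displaystyle{ \ \Delta = 0 }[/math], and otherwise bisects on [math]\displaystyle{ \ \Delta }[/math] until [math]\displaystyle{ \|\textbf{u}\|_1 = c_1 }[/math] holds up to a tolerance.

```python
import math


def soft_threshold(a, delta):
    """Componentwise S(a, delta) = sgn(a) * max(|a| - delta, 0)."""
    return [math.copysign(max(abs(x) - delta, 0.0), x) for x in a]


def l2_normalize(s):
    """Scale s to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in s))
    if norm == 0.0:
        return list(s)
    return [x / norm for x in s]


def l1_norm(u):
    return sum(abs(x) for x in u)


def solve_u(a, c1, tol=1e-10, max_iter=200):
    """Return (u, delta) with u = S(a, delta) / ||S(a, delta)||_2.

    Assumptions of this sketch: c1 >= 1 and the largest |a_i| is unique,
    so that a feasible delta exists below max |a_i|.
    """
    # Case 1: delta = 0 already gives a feasible solution.
    u = l2_normalize(a)
    if l1_norm(u) <= c1:
        return u, 0.0

    # Case 2: bisect on delta so that ||u||_1 = c1 (up to tol).
    lo, hi = 0.0, max(abs(x) for x in a)  # lo infeasible, hi feasible
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        u = l2_normalize(soft_threshold(a, mid))
        if l1_norm(u) > c1:
            lo = mid  # threshold too small, L1 norm still too large
        else:
            hi = mid  # feasible, try a smaller threshold
        if hi - lo < tol:
            break
    delta = hi
    return l2_normalize(soft_threshold(a, delta)), delta


if __name__ == "__main__":
    u, delta = solve_u([3.0, 1.0, 0.5], c1=1.2)
    print(u, delta)
```

Bisection is used here because the L1 norm of the normalized soft-thresholded vector does not increase as [math]\displaystyle{ \ \Delta }[/math] grows; any other one-dimensional root finder would serve the same purpose.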
The proof above is the proof of Lemma 2.2 in Witten, Tibshirani, and Hastie (2009).<ref name="WTH2009">Daniela M. Witten, Robert Tibshirani, and Trevor Hastie. (2009) "A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis". Biostatistics, 10(3):515–534.</ref>
References
<references />