A Rank Minimization Heuristic with Application to Minimum Order System Approximation

Latest revision as of 09:45, 30 August 2017

Introduction

The Rank Minimization Problem (RMP) has applications in a variety of areas including control theory, system identification, statistics and signal processing. Except in some special cases the RMP is known to be computationally hard.

[math] \begin{array}{ l l } \mbox{minimize} & \mbox{Rank } X \\ \mbox{subject to: } & X \in C, \end{array} [/math]

where [math]\, X \in \mathbb{R}^{m \times n}[/math] is the decision variable and [math]\, C [/math] is a convex set.

If the matrix is symmetric and positive semidefinite, trace minimization is a very effective heuristic for solving the rank minimization problem. Trace minimization results in a semidefinite programming problem which can be easily solved.

[math] \begin{array}{ l l } \mbox{minimize} & \mbox{Tr } X \\ \mbox{subject to: } & X \in C \end{array} [/math]

This paper<ref name="self">[http://faculty.washington.edu/mfazel/nucnorm_acc_final.pdf A rank minimization heuristic with application to minimum order system approximation, M. Fazel, H. Hindi, and S. Boyd]</ref> focuses on the following problems:

  1. Describing a generalization of the trace heuristic for general non-square matrices.
  2. Showing that the new heuristic can be reduced to an SDP, and hence effectively solved.
  3. Applying the method to the minimum order system approximation problem.

Nuclear Norm Heuristic: A Generalization Of The Trace Heuristic

The trace heuristic cannot be applied to the rank minimization problem when the matrix under consideration is non-symmetric or non-square. For this case, the authors develop a new heuristic. This heuristic minimizes the sum of the singular values of the matrix [math]X \in \real^{m\times n}[/math], which is known as the nuclear norm or Ky-Fan n-norm of [math]\,X[/math] and is denoted by [math]\|X\|_*[/math].

[math] \begin{array}{ l l } \mbox{minimize} & \|X\|_* \\ \mbox{subject to: } & X \in C \end{array} [/math]

According to the definition of the nuclear norm we have [math]\|X\|_*=\sum_{i=1}^{\min\{m,n\} }\sigma_i(X)[/math], where [math] \sigma_i(X) = \sqrt{\lambda_i (X^TX)}[/math] are the singular values of [math]\,X[/math].
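The definition can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `nuclear_norm` and `nuclear_norm_from_eigs` are hypothetical helpers introduced here for illustration and do not come from the paper.

```python
import numpy as np

def nuclear_norm(X):
    """Sum of the singular values of X."""
    return np.linalg.svd(X, compute_uv=False).sum()

def nuclear_norm_from_eigs(X):
    """Same quantity via sigma_i(X) = sqrt(lambda_i(X^T X))."""
    eigs = np.linalg.eigvalsh(X.T @ X)   # eigenvalues of the symmetric matrix X^T X
    eigs = np.clip(eigs, 0.0, None)      # guard against tiny negative round-off
    return np.sqrt(eigs).sum()

# A 2 x 3 example whose singular values are 4 and 3.
X = np.array([[3.0, 0.0, 0.0],
              [0.0, 4.0, 0.0]])

print(nuclear_norm(X))
print(nuclear_norm_from_eigs(X))
print(np.linalg.norm(X, "nuc"))          # NumPy's built-in nuclear norm
```

Note that [math]X^TX[/math] has [math]n[/math] eigenvalues; when [math]n \gt m[/math] the extra ones are zero, so summing over all of them gives the same value as summing the [math]\min\{m,n\}[/math] singular values.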

The nuclear norm is the dual of the spectral norm: [math]\|X\|_* =\sup \{ \mbox{Tr } Y^T X | \|Y\| \leq 1 \},[/math] where [math]\|Y\|[/math] denotes the spectral norm of [math]\,Y[/math], defined as the maximum singular value of [math]\,Y[/math]. Since every norm is a convex function, the relaxed version of the rank minimization problem is a convex optimization problem. Thus, although the original problem is in general a difficult optimization problem, the nuclear norm minimization problem is convex and therefore (at least in principle) easily solved.
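The dual characterization can be illustrated numerically. The sketch below assumes NumPy and uses the standard fact that, for a thin SVD [math]X = U \Sigma V^T[/math], the choice [math]Y = UV^T[/math] has spectral norm one and attains the supremum. The variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))

# Thin SVD: X = U diag(s) Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)
nuc = s.sum()

# Candidate maximizer Y* = U V^T, whose singular values are all equal to 1.
Y_star = U @ Vt
attained = np.trace(Y_star.T @ X)

# Random feasible Y (spectral norm exactly 1) should never beat the nuclear norm.
worst_gap = np.inf
for _ in range(200):
    Y = rng.standard_normal((4, 3))
    Y /= np.linalg.norm(Y, 2)            # normalize the spectral norm to 1
    worst_gap = min(worst_gap, nuc - np.trace(Y.T @ X))

print(nuc, attained, worst_gap)
```

Here `attained` should agree with `nuc` up to round-off, and `worst_gap` should be nonnegative, which is consistent with the supremum being attained at [math]Y = UV^T[/math].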

When the matrix variable [math]\,X[/math] is symmetric and positive semidefinite, its singular values are the same as its eigenvalues. The nuclear norm therefore reduces to [math]\,\mbox{Tr } X[/math], and the nuclear norm heuristic reduces to the trace minimization heuristic.
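A small numerical check of this reduction, again assuming NumPy, is sketched below. The second example is added only for contrast: it shows that the equality relies on positive semidefiniteness, not just symmetry.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
X_psd = A @ A.T                          # symmetric positive semidefinite, rank at most 3

nuc_psd = np.linalg.svd(X_psd, compute_uv=False).sum()
tr_psd = np.trace(X_psd)
print(nuc_psd, tr_psd)                   # the two values agree up to round-off

# Symmetric but indefinite: eigenvalues are 1 and -1, singular values are 1 and 1.
X_indef = np.diag([1.0, -1.0])
nuc_indef = np.linalg.svd(X_indef, compute_uv=False).sum()
tr_indef = np.trace(X_indef)
print(nuc_indef, tr_indef)               # nuclear norm 2, trace 0
```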

The authors suggest using the nuclear norm as a heuristic for solving the rank minimization problem because the nuclear norm is the convex envelope of the rank function on the set of matrices with spectral norm at most one. The nuclear norm minimization problem can thus be seen as a relaxation of the rank minimization problem.

Figure: convex envelope of a function, borrowed from <ref>[http://faculty.washington.edu/mfazel/acc04-tutorial.pdf Rank Minimization and Applications in System Theory, M. Fazel, H. Hindi, and S. Boyd]</ref>

Definition: Let [math]f:C \rightarrow\real[/math] where [math]C\subseteq \real^n[/math]. The convex envelope of [math]\,f[/math] (on [math]\,C[/math]) is defined as the largest convex function [math]\,g[/math] such that [math]g(x)\leq f(x)[/math] for all [math]x\in C[/math].

Theorem 1 The convex envelope of the function [math]\,\phi(X)=\mbox{Rank }(X)[/math], on [math]C=\{X\in \real^{m\times n} | \|X\|\leq 1\} [/math] is [math]\phi_{\textrm{env}}(X) = \|X\|_*[/math].

Suppose every [math]X\in C[/math] is bounded by [math]\,M[/math], that is [math]\|X\|\leq M[/math]. Then the convex envelope of [math]\,\mbox{Rank }X[/math] on [math]\{X | \|X\|\leq M\}[/math] is given by [math]\frac{1}{M}\|X\|_*[/math].

Note that this implies [math]\mbox{Rank } X \geq \frac{1}{M} \|X\|_*[/math] for every such [math]\,X[/math]. In particular, if [math]\,p_{\mbox{rank}}[/math] and [math]\,p_{*}[/math] are the optimal values of the rank minimization problem and the nuclear norm minimization problem, then [math]p_{\mbox{rank}}\geq \frac{1}{M} p_{*}[/math]. By solving the nuclear norm minimization problem we therefore obtain a lower bound on the optimal value of the rank minimization problem.
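The bound [math]\mbox{Rank } X \geq \frac{1}{M}\|X\|_*[/math] can be checked on a small example. This is a minimal sketch assuming NumPy; `rank_lower_bound` is a hypothetical helper name, not something defined in the paper.

```python
import numpy as np

def rank_lower_bound(X, M):
    """Return ||X||_* / M, a lower bound on Rank X whenever ||X|| <= M."""
    if np.linalg.norm(X, 2) > M * (1 + 1e-12):
        raise ValueError("the bound requires the spectral norm of X to be at most M")
    return np.linalg.norm(X, "nuc") / M

# Singular values 2, 1, 0: nuclear norm 3, spectral norm 2, rank 2.
X = np.diag([2.0, 1.0, 0.0])
M = 2.0
bound = rank_lower_bound(X, M)           # (2 + 1) / 2 = 1.5
rank = np.linalg.matrix_rank(X)
print(bound, rank)

# The same inequality on random matrices, with M set to each matrix's spectral norm.
rng = np.random.default_rng(3)
holds = all(
    rank_lower_bound(B, np.linalg.norm(B, 2)) <= np.linalg.matrix_rank(B) + 1e-9
    for B in (rng.standard_normal((4, 6)) for _ in range(50))
)
print(holds)
```

Because the rank is an integer, rounding the bound up (here from 1.5 to 2) also gives a valid lower bound.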

Expressing as an SDP Problem

To solve the nuclear norm minimization problem we can convert it into an SDP. We begin by transforming the problem into the following form:

[math] \begin{array}{ l l } \mbox{minimize} & t \\ \mbox{subject to: } & \|X\|_*\leq t\\ & X \in C \end{array} [/math]

We can then use the following lemma to transform the constraints into linear matrix inequalities (LMIs).

Lemma 1 For [math]X\in \real^{m\times n}[/math] and [math]t\in \real[/math], we have [math]\|X\|_*\leq t[/math] if and only if there exist matrices [math]Y \in \real^{m\times m}[/math] and [math]Z\in \real^{n\times n}[/math] such that

[math] \left[\begin{array}{cc} Y & X\\ X^T & Z \end{array}\right]\succeq 0, \quad \mbox{Tr}(Y) + \mbox{Tr}(Z) \leq 2t [/math]
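One direction of the lemma can be illustrated with a standard construction, which is assumed here and not necessarily the argument used in the paper: given the thin SVD [math]X = U\Sigma V^T[/math], take [math]Y = U\Sigma U^T[/math] and [math]Z = V\Sigma V^T[/math]. The block matrix then equals [math]\left[\begin{array}{c} U \\ V \end{array}\right] \Sigma \left[\begin{array}{c} U \\ V \end{array}\right]^T[/math], which is positive semidefinite, and [math]\mbox{Tr}(Y)+\mbox{Tr}(Z) = 2\|X\|_*[/math]. A minimal NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 5))          # m = 3, n = 5

# Thin SVD: X = U diag(s) Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)

Y = U @ np.diag(s) @ U.T                 # m x m, symmetric
Z = Vt.T @ np.diag(s) @ Vt               # n x n, symmetric

block = np.block([[Y, X],
                  [X.T, Z]])

min_eig = np.linalg.eigvalsh(block).min()    # should be >= 0 up to round-off
trace_sum = np.trace(Y) + np.trace(Z)        # should equal 2 * ||X||_*
print(min_eig, trace_sum, 2 * s.sum())
```

With this choice the trace constraint holds with equality at [math]t = \|X\|_*[/math], which is the smallest admissible [math]t[/math].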

The nuclear norm minimization problem can now be expressed as

[math] \begin{array}{ l l } \mbox{minimize} & \mbox{Tr}(Y) + \mbox{Tr}(Z) \\ \mbox{subject to: } & \left[\begin{array}{cc} Y & X\\ X^T & Z \end{array}\right] \succeq 0 \\ & X \in C \end{array} [/math]

where [math]\,Y=Y^T, Z=Z^T[/math] are new variables. If the constraint set [math]\,C[/math] can be expressed as linear matrix inequalities then the problem is an SDP, and can be solved using available SDP solvers.

Minimum Order System Approximation

The rank minimization heuristic can be used in the minimum order system approximation problem. In system theory, the effect of a system can be modeled using a rational matrix [math]\,H(s)[/math]: [math] H(s) = R_0 +\sum_{i=1}^N \frac{R_i}{s-p_i}[/math] where [math]R_i \in \Complex^{m\times n}[/math] and [math]\,p_i[/math] are the complex poles of the system with the property that for each complex [math]\,p_i[/math] its complex conjugate is also a pole, and whenever [math]p_i = \bar{p_j}[/math] we have [math]R_i=\bar{R_j}[/math].

We want to describe the system as simply as possible. That is to say, we are looking for [math]\,H[/math] such that [math] \deg(H) = \sum_{i=1}^N \mbox{Rank}(R_i) [/math] is minimized.
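The model and its degree can be sketched in a few lines. The code below assumes NumPy; the function names `transfer_matrix` and `degree`, and the example pole and residue values, are illustrative assumptions and not data from the paper.

```python
import numpy as np

def transfer_matrix(s, R0, residues, poles):
    """Evaluate H(s) = R0 + sum_i R_i / (s - p_i) at a complex point s."""
    H = R0.astype(complex)
    for R, p in zip(residues, poles):
        H = H + R / (s - p)
    return H

def degree(residues):
    """deg(H) = sum_i Rank(R_i)."""
    return sum(int(np.linalg.matrix_rank(R)) for R in residues)

# One complex-conjugate pole pair with conjugate residues (each of rank 1).
p = -1.0 + 2.0j
poles = [p, np.conj(p)]
R1 = np.array([[1.0 + 1.0j, 0.0],
               [0.0, 0.0]])
residues = [R1, np.conj(R1)]
R0 = np.zeros((2, 2))

H_pos = transfer_matrix(1j * 1.0, R0, residues, poles)
H_neg = transfer_matrix(-1j * 1.0, R0, residues, poles)

print(degree(residues))
print(np.allclose(H_neg, np.conj(H_pos)))
```

The last check reflects the conjugate-pairing property of the poles and residues: with a real [math]R_0[/math], it gives [math]H(-j\omega) = \overline{H(j\omega)}[/math].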

The transfer matrix is measured at a few frequencies [math]\omega_1,\dots,\omega_K\in \real[/math] with accuracy [math]\,\epsilon > 0[/math]. The matrices [math]\,G_k[/math] are the measured approximations of [math]\,H(j\omega_k)[/math]. Note that, for fixed poles [math]\,p_i[/math], [math]H(j\omega_k)[/math] is a linear function of the variables [math]R_i[/math]. Therefore we have the following minimization problem:

[math] \begin{array}{ l l } \mbox{minimize} &\sum_{i=1}^N \mbox{Rank}(R_i) \\ \mbox{subject to: } & \|H(j\omega_k)-G_k\| \leq \epsilon, \quad k=1,\dots,K \end{array} [/math]
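The ingredients of this problem can be evaluated directly for a candidate set of residues. The sketch below assumes NumPy, fixed real poles, and noise-free synthetic measurements; all names and numbers are hypothetical. It only checks feasibility and evaluates the rank objective alongside its nuclear norm surrogate, and does not solve the optimization problem.

```python
import numpy as np

def transfer_matrix(s, R0, residues, poles):
    """Evaluate H(s) = R0 + sum_i R_i / (s - p_i)."""
    H = R0.astype(complex)
    for R, p in zip(residues, poles):
        H = H + R / (s - p)
    return H

def is_feasible(R0, residues, poles, omegas, G, eps):
    """Check ||H(j w_k) - G_k|| <= eps (spectral norm) for every k."""
    return all(
        np.linalg.norm(transfer_matrix(1j * w, R0, residues, poles) - Gk, 2) <= eps
        for w, Gk in zip(omegas, G)
    )

def rank_objective(residues):
    return sum(int(np.linalg.matrix_rank(R)) for R in residues)

def nuclear_objective(residues):
    return sum(np.linalg.norm(R, "nuc") for R in residues)

poles = [-1.0, -2.0]                                  # fixed real poles, for simplicity
true_residues = [np.array([[1.0, 2.0], [2.0, 4.0]]),  # rank 1, singular value 5
                 np.array([[0.5, 0.0], [0.0, 0.0]])]  # rank 1, singular value 0.5
R0 = np.zeros((2, 2))
omegas = [0.1, 1.0, 10.0]
G = [transfer_matrix(1j * w, R0, true_residues, poles) for w in omegas]
eps = 0.05

zero_residues = [np.zeros((2, 2)), np.zeros((2, 2))]

print(is_feasible(R0, true_residues, poles, omegas, G, eps))
print(is_feasible(R0, zero_residues, poles, omegas, G, eps))
print(rank_objective(true_residues), nuclear_objective(true_residues))
```

The all-zero residues have the smallest possible rank objective but violate the measurement constraints, which is why the constraints are needed to make the problem meaningful.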

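Applying the nuclear norm heuristic, that is, replacing each [math]\mbox{Rank}(R_i)[/math] with [math]\|R_i\|_*[/math], gives a convex problem that can be expressed as an SDP as discussed in the previous section.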

Numerical Example

The top plot shows the Rank degree obtained by the heuristic. The bottom plot shows the corresponding values of the heuristic objective function. A modified image borrowed from <ref name="self"></ref>

Consider a MIMO system (i.e. multiple inputs and outputs); in particular, we will analyze one with 2 inputs and 2 outputs. We have a normalized transfer matrix [math]F[/math] (i.e. [math]\|F\|_{\infty} = 1[/math]). Additionally, assume this matrix is of order 8 so that we have poles [math]p_1,\dots,p_8[/math] which come in pairs (i.e. complex conjugates). The goal is to reduce the order while minimizing the information lost.

Solving the SDP representation of the problem discussed in the previous section ([math]\epsilon = 0.05[/math]) gives an approximation of order 6. Additionally, the 8th order and 6th order representations are similar and are the same in many cases.

References

<references />