User talk:Dsevern: Difference between revisions
Revision as of 22:27, 11 October 2011
Newton-Raphson Method
Previously we derived the log-likelihood function for the logistic regression model.
[math]\displaystyle{ \begin{align} {\ell(\mathbf{\beta\,})} & {} = \sum_{i=1}^n \left(y_i {\mathbf{\beta\,}^T \mathbf{x_i}} - \ln({1+e^{\mathbf{\beta\,}^T \mathbf{x_i}}})\right) \end{align} }[/math]
Our goal is to find the [math]\displaystyle{ \beta\, }[/math] that maximizes [math]\displaystyle{ {\ell(\mathbf{\beta\,})} }[/math]. We do this by setting the derivative to zero, i.e., solving [math]\displaystyle{ {\frac{\partial \ell}{\partial \mathbf{\beta\,}}}=0 }[/math]. This equation is nonlinear in [math]\displaystyle{ \beta\, }[/math] and has no closed-form solution, so we solve it numerically with the Newton-Raphson method.
[math]\displaystyle{ \begin{align} {\frac{\partial \ell}{\partial \mathbf{\beta\,}}}&{} = \sum_{i=1}^n \left(y_i \mathbf{x_i} - \frac{e^{\mathbf{\beta\,}^T \mathbf{x_i}}}{1+e^{\mathbf{\beta\,}^T \mathbf{x_i}}} \mathbf{x_i} \right) \\[8pt] \end{align} }[/math]
The first derivative is typically called the score vector.
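As a rough illustration, the log-likelihood and score vector above might be computed with NumPy as follows. The function names, the small dataset, and the finite-difference check are assumptions made for this sketch and are not part of the original notes.

```python
import numpy as np

def log_likelihood(beta, X, y):
    """l(beta) = sum_i [ y_i * beta^T x_i - ln(1 + exp(beta^T x_i)) ]."""
    z = X @ beta
    # np.logaddexp(0, z) evaluates ln(1 + e^z) without overflow for large z
    return float(np.sum(y * z - np.logaddexp(0.0, z)))

def score(beta, X, y):
    """Score vector: sum_i (y_i - p_i) x_i, where p_i = e^z_i / (1 + e^z_i)."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (y - p)

# Hypothetical data: the first column of X is an intercept term.
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]])
y = np.array([1.0, 0.0, 1.0, 0.0])
beta = np.array([0.1, -0.3])

# Central finite differences of the log-likelihood, one coordinate at a time,
# as a sanity check on the analytic score.
eps = 1e-6
numeric = np.array([
    (log_likelihood(beta + eps * e, X, y) - log_likelihood(beta - eps * e, X, y)) / (2 * eps)
    for e in np.eye(len(beta))
])
analytic = score(beta, X, y)
```

Comparing `analytic` with `numeric` is a quick way to catch sign or indexing mistakes in a hand-derived gradient.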
[math]\displaystyle{ \begin{align} {\frac{\partial^2 \ell}{\partial \mathbf{\beta\,} \partial \mathbf{\beta\,}^T}}&{} = -\sum_{i=1}^n \left(\mathbf{x_i}\mathbf{x_i}^T \left(\frac{e^{\mathbf{\beta\,}^T \mathbf{x_i}}}{1+e^{\mathbf{\beta\,}^T \mathbf{x_i}}}\right)\left(\frac{1}{1+e^{\mathbf{\beta\,}^T \mathbf{x_i}}}\right) \right) \\[8pt] \end{align} }[/math]
The negative of the second derivative is typically called the information matrix.
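With the score vector and the information matrix in hand, the standard Newton-Raphson update for this problem is
[math]\displaystyle{ \mathbf{\beta\,}^{new} = \mathbf{\beta\,}^{old} - \left(\frac{\partial^2 \ell}{\partial \mathbf{\beta\,} \partial \mathbf{\beta\,}^T}\right)^{-1} \frac{\partial \ell}{\partial \mathbf{\beta\,}} }[/math]
with both derivatives evaluated at [math]\displaystyle{ \mathbf{\beta\,}^{old} }[/math]. A minimal sketch of the iteration in Python is given below. The function name, the zero starting point, the stopping rule, and the toy data are all assumptions chosen for illustration, not something fixed by the notes.

```python
import numpy as np

def newton_raphson_logistic(X, y, max_iter=25, tol=1e-8):
    """Maximize the logistic log-likelihood by Newton-Raphson (illustrative sketch)."""
    beta = np.zeros(X.shape[1])              # assumed starting point
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))  # p_i = e^z_i / (1 + e^z_i)
        grad = X.T @ (y - p)                   # score vector
        w = p * (1.0 - p)                      # p_i * (1 - p_i)
        info = X.T @ (X * w[:, None])          # information matrix = -(second derivative)
        # beta_new = beta_old - H^{-1} grad = beta_old + info^{-1} grad
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:         # assumed convergence criterion
            break
    return beta

# Hypothetical, non-separable toy data; the first column of X is an intercept.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

beta_hat = newton_raphson_logistic(X, y)
p_hat = 1.0 / (1.0 + np.exp(-(X @ beta_hat)))
final_score = X.T @ (y - p_hat)              # should be close to the zero vector
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly. If the classes are perfectly separable, the maximum likelihood estimate does not exist and the iterates grow without bound, so the sketch uses data where the two classes overlap.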