User talk:Dsevern: Difference between revisions

From statwiki
'''Welcome to ''Wiki Course Notes''!'''
We hope you will contribute much and well.
You will probably want to read the [[Help:Contents|help pages]].
Again, welcome and have fun! [[User:WikiSysop|WikiSysop]] 23:12, 26 September 2011 (MDT)

Revision as of 22:27, 11 October 2011

Newton-Raphson Method

Previously we derived the log-likelihood function for the logistic regression model.

[math]\displaystyle{ \begin{align} {\ell(\mathbf{\beta\,})} & {} = \sum_{i=1}^n \left(y_i {\mathbf{\beta\,}^T \mathbf{x_i}} - \ln({1+e^{\mathbf{\beta\,}^T \mathbf{x_i}}})\right) \end{align} }[/math]
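
As a rough illustration, the log-likelihood above can be evaluated numerically. The sketch below assumes NumPy; the function name `log_likelihood`, the design matrix `X` (one row per [math]\displaystyle{ \mathbf{x_i} }[/math], with a leading column of ones) and the toy data are hypothetical and are not part of the original notes.

```python
import numpy as np

def log_likelihood(beta, X, y):
    """l(beta) = sum_i [ y_i * beta^T x_i - ln(1 + exp(beta^T x_i)) ]."""
    z = X @ beta  # linear predictors beta^T x_i, shape (n,)
    # np.logaddexp(0, z) computes ln(1 + e^z) without overflow for large z
    return float(np.sum(y * z - np.logaddexp(0.0, z)))

# Hypothetical toy data: three observations, intercept plus one feature
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]])
y = np.array([1.0, 0.0, 1.0])

# At beta = 0 every term reduces to -ln(2)
ll_zero = log_likelihood(np.zeros(2), X, y)
```

At [math]\displaystyle{ \beta = 0 }[/math] each observation contributes [math]\displaystyle{ -\ln 2 }[/math], which gives a quick sanity check on an implementation.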

Our goal is to find the [math]\displaystyle{ \beta\, }[/math] that maximizes [math]\displaystyle{ {\ell(\mathbf{\beta\,})} }[/math]. Using calculus, we set the gradient to zero, i.e. we solve [math]\displaystyle{ {\frac{\partial \ell}{\partial \mathbf{\beta\,}}}=0 }[/math]. This equation is nonlinear in [math]\displaystyle{ \beta\, }[/math] and has no closed-form solution, so we solve it numerically with the Newton-Raphson method.

[math]\displaystyle{ \begin{align} {\frac{\partial \ell}{\partial \mathbf{\beta\,}}}&{} = \sum_{i=1}^n \left(y_i \mathbf{x_i} - \frac{e^{\mathbf{\beta\,}^T \mathbf{x_i}}}{1+e^{\mathbf{\beta\,}^T \mathbf{x_i}}} \mathbf{x_i} \right) \\[8pt] \end{align} }[/math]

The first derivative is typically called the score vector.
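
A minimal sketch of the score vector, assuming NumPy and the same hypothetical conventions (rows of `X` are the [math]\displaystyle{ \mathbf{x_i} }[/math], `y` holds the 0/1 responses). Writing [math]\displaystyle{ p_i = \frac{e^{\mathbf{\beta\,}^T \mathbf{x_i}}}{1+e^{\mathbf{\beta\,}^T \mathbf{x_i}}} }[/math], the sum collapses to the matrix form [math]\displaystyle{ X^T(\mathbf{y} - \mathbf{p}) }[/math].

```python
import numpy as np

def score(beta, X, y):
    """Gradient of the log-likelihood: sum_i (y_i - p_i) x_i = X^T (y - p)."""
    z = X @ beta
    p = 1.0 / (1.0 + np.exp(-z))  # equals e^z / (1 + e^z)
    return X.T @ (y - p)

# Hypothetical toy data: three observations, intercept plus one feature
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]])
y = np.array([1.0, 0.0, 1.0])

# At beta = 0 every p_i is 0.5, so the score is X^T (y - 0.5)
s0 = score(np.zeros(2), X, y)
```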

[math]\displaystyle{ \begin{align} {\frac{\partial^2 \ell}{\partial \mathbf{\beta\,} \partial \mathbf{\beta\,}^T}}&{} = -\sum_{i=1}^n \left(\mathbf{x_i}\mathbf{x_i}^T \left(\frac{e^{\mathbf{\beta\,}^T \mathbf{x_i}}}{1+e^{\mathbf{\beta\,}^T \mathbf{x_i}}}\right)\left(\frac{1}{1+e^{\mathbf{\beta\,}^T \mathbf{x_i}}}\right) \right) \\[8pt] \end{align} }[/math]

The negative of the second derivative (the Hessian matrix) is typically called the information matrix.
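
The notes name Newton-Raphson but do not show the iteration here. In its standard form, the method repeatedly updates

[math]\displaystyle{ \mathbf{\beta\,}^{new} = \mathbf{\beta\,}^{old} - \left(\frac{\partial^2 \ell}{\partial \mathbf{\beta\,} \partial \mathbf{\beta\,}^T}\right)^{-1} \frac{\partial \ell}{\partial \mathbf{\beta\,}} }[/math]

with both derivatives evaluated at [math]\displaystyle{ \mathbf{\beta\,}^{old} }[/math]. The following is a minimal sketch under stated assumptions, not a definitive implementation: it uses NumPy, starts from [math]\displaystyle{ \beta = 0 }[/math], and stops when the step becomes small. The function name `newton_raphson`, the parameters `max_iter` and `tol`, and the toy data are all hypothetical.

```python
import numpy as np

def newton_raphson(X, y, max_iter=50, tol=1e-10):
    """Maximize the logistic log-likelihood by Newton-Raphson iteration."""
    beta = np.zeros(X.shape[1])  # assumed starting point
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))  # fitted probabilities p_i
        score = X.T @ (y - p)                  # first derivative (score vector)
        # Information matrix: sum_i x_i x_i^T p_i (1 - p_i), the negative Hessian
        info = X.T @ (X * (p * (1.0 - p))[:, None])
        # beta_new = beta_old - H^{-1} score = beta_old + info^{-1} score
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.linalg.norm(step) < tol:
            break
    return beta

# Hypothetical toy data; the classes overlap, so a finite maximizer exists
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])  # intercept column plus the feature

beta_hat = newton_raphson(X, y)

# At the maximizer the score vector should be numerically zero
p_hat = 1.0 / (1.0 + np.exp(-(X @ beta_hat)))
final_score = X.T @ (y - p_hat)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of the information matrix explicitly. If the two classes are perfectly separable, the likelihood has no finite maximizer and the iteration will not settle, which is why the toy data above is chosen with overlapping classes.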