stat841: Difference between revisions

Revision as of 17:10, 30 September 2009

Scribe sign up

Course Note for Sept. 30th (Classification_by Liang Jiaxi)

1.


2. Classification
Classification is the task of predicting a discrete random variable [math]\displaystyle{ \,Y }[/math] from a random vector [math]\displaystyle{ \,X }[/math]. We are given n pairs of data [math]\displaystyle{ \,(X_{1},Y_{1}), (X_{2},Y_{2}), \dots , (X_{n},Y_{n}) }[/math], where [math]\displaystyle{ \,X_{i}= (X_{i1}, X_{i2}, \dots , X_{id}) \in \mathcal{X} \subset \Re^{d} }[/math]
is a d-dimensional vector and [math]\displaystyle{ \,Y_{i} }[/math] takes values in a finite set [math]\displaystyle{ \, \mathcal{Y} }[/math]. A classification rule (classifier) is a function [math]\displaystyle{ \,h: \mathcal{X} \rightarrow \mathcal{Y} }[/math].
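As a minimal sketch of this setup, the Python below represents each input as a tuple of d real numbers and each label as a member of a small finite set. The label set `LABELS`, the rule `h`, and the three data pairs are all hypothetical, chosen only to illustrate a map from d-dimensional inputs to a finite set of labels.

```python
from typing import Sequence

# Hypothetical finite label set (the set of values Y can take).
LABELS = (0, 1)


def h(x: Sequence[float]) -> int:
    """Hypothetical rule: predict 1 when the coordinates of x sum to a positive number."""
    return 1 if sum(x) > 0 else 0


# Made-up data pairs (X_i, Y_i) with d = 2.
data = [
    ((1.0, 2.0), 1),
    ((-1.0, -0.5), 0),
    ((0.5, -2.0), 1),
]

# Apply the rule to every input; each prediction lies in LABELS.
predictions = [h(x) for x, _ in data]
print(predictions)
```

Any function from the input space to the label set counts as a rule in this sense; the threshold on the coordinate sum is just one arbitrary choice.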


3. Error data
Definition:
The true error rate of a classifier [math]\displaystyle{ \,h }[/math] is the probability that the label [math]\displaystyle{ \overline{Y} = h(X) }[/math] predicted from [math]\displaystyle{ \,X }[/math] by [math]\displaystyle{ \,h }[/math] does not equal [math]\displaystyle{ \,Y }[/math], namely, [math]\displaystyle{ \, L(h)=P(h(X) \neq Y) }[/math].
The empirical error rate (training error rate) of a classifier [math]\displaystyle{ \,h }[/math] is the fraction of the n training pairs for which the predicted label [math]\displaystyle{ \,h(X_{i}) }[/math] does not equal [math]\displaystyle{ \,Y_{i} }[/math]:
[math]\displaystyle{ \, L_{h}= \frac{1}{n} \sum_{i=1}^{n} I(h(X_{i}) \neq Y_{i}) }[/math], where [math]\displaystyle{ \,I }[/math] is the indicator function: [math]\displaystyle{ \, I(h(X_{i}) \neq Y_{i}) = 1 }[/math] if [math]\displaystyle{ \,h(X_{i}) \neq Y_{i} }[/math] and [math]\displaystyle{ \,0 }[/math] otherwise.
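The empirical error rate can be computed directly by counting the pairs on which the prediction and the label disagree and dividing by n. The sketch below assumes a hypothetical rule `threshold_rule` and a made-up sample of four pairs; neither comes from the course notes.

```python
from typing import Callable, Sequence, Tuple

Pair = Tuple[Sequence[float], int]


def empirical_error_rate(h: Callable[[Sequence[float]], int],
                         data: Sequence[Pair]) -> float:
    """Return (1/n) * sum of I(h(X_i) != Y_i) over the n pairs in data."""
    if not data:
        raise ValueError("data must contain at least one (X_i, Y_i) pair")
    # Each mismatch contributes 1 to the sum, each match contributes 0.
    mistakes = sum(1 for x, y in data if h(x) != y)
    return mistakes / len(data)


def threshold_rule(x: Sequence[float]) -> int:
    """Hypothetical classifier: predict 1 when the coordinates sum to a positive number."""
    return 1 if sum(x) > 0 else 0


# Made-up sample of n = 4 pairs (X_i, Y_i).
sample = [
    ((1.0, 2.0), 1),    # predicted 1, correct
    ((-1.0, -0.5), 0),  # predicted 0, correct
    ((0.5, -2.0), 1),   # predicted 0, mistake
    ((2.0, 0.1), 0),    # predicted 1, mistake
]

rate = empirical_error_rate(threshold_rule, sample)
print(rate)  # 2 mistakes out of 4 pairs -> 0.5
```

The true error rate is a probability over the joint distribution of X and Y, so it cannot be computed from a finite sample in this way; the empirical rate only estimates it.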


4. Bayes Classifier