stat441w18/A New Method of Region Embedding for Text Classification

Revision as of 21:10, 7 March 2018

Method

This paper focuses on representing small text regions in a way that preserves local internal structural information for text classification. It defines [math] region\left ( i,c\right ) [/math] as the region of length [math]2c+1[/math] centred on the middle word [math] \omega_i [/math], the i-th word of the document. Word embeddings and local context units are then combined to produce the region embedding. In the following, we first introduce the local context unit, then two architectures that generate the region embedding, and finally how the text is classified.
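As an illustrative sketch (not code from the paper), the region [math] region\left ( i,c\right ) [/math] is simply the window of [math]2c+1[/math] words centred on the i-th word; the function name and the omission of boundary padding are our simplifying assumptions:

```python
def region(words, i, c):
    """Return region(i, c): the 2c+1 words centred on the i-th word.

    Assumes c <= i < len(words) - c; padding at document
    boundaries is omitted in this sketch.
    """
    return words[i - c : i + c + 1]

words = ["the", "food", "is", "not", "very", "good"]
print(region(words, 3, 2))  # ['food', 'is', 'not', 'very', 'good']
```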

Local context unit

The vocabulary is represented by a matrix [math] \mathbf{E} \in \mathbb{R}^{h \times v} [/math] with a lookup layer; the embedding of word [math] \omega [/math] is denoted by [math] \mathbf{e}_\omega [/math]. The i-th column of [math] \mathbf{E} [/math] represents the embedding of [math] \omega_i [/math], denoted by [math] \mathbf{e}_{\omega_i} [/math].

For each word [math] \omega_i [/math], we define the local context unit [math] \mathbf{K}_{\omega_i} \in \mathbb{R}^{h\times\left (2c+1\right )} [/math]. Let [math] \mathbf{K}_{\omega_i,t} [/math] be the (c+t)-th column of [math] \mathbf{K}_{\omega_i} [/math] for [math] t \in \left [ -c,c\right ] [/math], representing a distinctive linear projection function on [math] \mathbf{e}_{\omega_{i+t}} [/math] in the local context [math] region\left (i,c\right ) [/math]. Thus, we can utilize local ordered word information for each word.

Define [math] \mathbf{p}_{\omega_{i+t}}^i [/math] as the projected word embedding of [math] \omega_{i+t} [/math] in the i-th word's view, computed by [math] \mathbf{p}_{\omega_{i+t}}^i = \mathbf{K}_{\omega_i,t} \odot \mathbf{e}_{\omega_{i+t}} [/math], where [math] \odot [/math] denotes element-wise multiplication.
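A minimal NumPy sketch of this projection follows. The shapes [math]h=4[/math], [math]v=10[/math], [math]c=2[/math] are toy values, and the random tensors stand in for the learned parameters [math] \mathbf{E} [/math] and [math] \mathbf{K} [/math]; the function name is our own:

```python
import numpy as np

h, v, c = 4, 10, 2  # embedding size, vocabulary size, context radius (toy values)
rng = np.random.default_rng(0)

E = rng.standard_normal((h, v))             # embedding matrix: one column per word
K = rng.standard_normal((v, h, 2 * c + 1))  # one h x (2c+1) local context unit per word

def projected_embedding(word_ids, i, t):
    """p^i_{w_{i+t}} = K_{w_i, t} (element-wise *) e_{w_{i+t}}."""
    column = K[word_ids[i], :, c + t]    # (c+t)-th column of K_{w_i}, t in [-c, c]
    return column * E[:, word_ids[i + t]]

word_ids = [3, 1, 4, 1, 5]
p = projected_embedding(word_ids, i=2, t=-1)  # word at position 1, seen from position 2
print(p.shape)  # (4,)
```

The element-wise product lets each word reweight, dimension by dimension, the embeddings of its neighbours, which is how the unit encodes position-dependent influence.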

Note that the local context units and the word embeddings are learned as model parameters. The local context units can be trained to capture the semantic and syntactic influence of each word on its context.