stat441w18/Convolutional Neural Networks for Sentence Classification: Difference between revisions
Revision as of 17:05, 4 March 2018
Presented by
1. Ben Schwarz
2. Cameron Miller
3. Hamza Mirza
4. Pavle Mihajlovic
5. Terry Shi
6. Yitian Wu
7. Zekai Shao
Introduction
Model
Model Settings
Consider a sentence of length [math]\displaystyle{ n }[/math], represented by [math]\displaystyle{ \boldsymbol{x}_{1:n} }[/math]. Let [math]\displaystyle{ \boldsymbol{x}_i \in \mathbb{R}^k }[/math] be the [math]\displaystyle{ k }[/math]-dimensional word vector for the [math]\displaystyle{ i }[/math]-th word in the sentence, and let [math]\displaystyle{ \oplus }[/math] be the concatenation operator, so that [math]\displaystyle{ \boldsymbol{x}_{1:n} = \boldsymbol{x}_{1} \oplus \boldsymbol{x}_2 \oplus \dots \oplus \boldsymbol{x}_n }[/math]. In general, let [math]\displaystyle{ \boldsymbol{x}_{i:i+j} }[/math] represent the concatenation of the word vectors [math]\displaystyle{ \boldsymbol{x}_{i}, \boldsymbol{x}_{i+1}, \dots, \boldsymbol{x}_{i+j} }[/math].
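We also consider a filter [math]\displaystyle{ w \in \mathbb{R}^{hk} }[/math], which has the same dimension as a window of [math]\displaystyle{ h }[/math] consecutive words, [math]\displaystyle{ \boldsymbol{x}_{i:i+h-1} }[/math].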
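The notation above can be checked with a small NumPy sketch. It is only an illustration: the sizes `n`, `k`, `h`, the random stand-in word vectors, and the helper `concat_window` are hypothetical and not part of the presentation. Taking the inner product of the filter with a window is also an assumption about how the filter is used, since the text above only states its dimension.

```python
import numpy as np

def concat_window(X, i, j):
    """Return x_{i:i+j}: word vectors i..i+j (1-indexed, inclusive) concatenated."""
    # X has shape (n, k): one k-dimensional word vector per row.
    return X[i - 1 : i + j].reshape(-1)

# Illustrative sizes (assumptions, not taken from the source).
n, k, h = 7, 4, 3

rng = np.random.default_rng(0)
X = rng.standard_normal((n, k))   # stand-in word vectors x_1, ..., x_n
w = rng.standard_normal(h * k)    # filter w in R^{hk}

x_full = concat_window(X, 1, n - 1)   # x_{1:n}, a vector of length n*k
window = concat_window(X, 2, h - 1)   # x_{2:2+h-1}, a vector of length h*k

# Assumed use of the filter: inner product with one window of h words.
score = float(w @ window)

print(x_full.shape, window.shape, w.shape)
```

The point of the sketch is the shape bookkeeping: a window of `h` words concatenates to a vector of length `h * k`, which is exactly the dimension of the filter.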
Model Regularization
Datasets and Experimental Setup
Hyperparameters and Training
MR:
SST-1:
SST-2:
Subj:
TREC:
CR:
MPQA:
Pre-trained Word Vectors
Model Variations
CNN-rand:
CNN-static:
CNN-non-static:
CNN-multichannel: