stat441w18/Convolutional Neural Networks for Sentence Classification

Revision as of 17:05, 4 March 2018

Presented by

1. Ben Schwarz

2. Cameron Miller

3. Hamza Mirza

4. Pavle Mihajlovic

5. Terry Shi

6. Yitian Wu

7. Zekai Shao

Introduction

Model

Model Settings

Consider a sentence of length [math]\displaystyle{ n }[/math], represented by [math]\displaystyle{ \boldsymbol{x}_{1:n} }[/math]. Let [math]\displaystyle{ \boldsymbol{x}_i \in \mathbb{R}^k }[/math] be the [math]\displaystyle{ k }[/math]-dimensional word vector of the [math]\displaystyle{ i }[/math]-th word in the sentence, and let [math]\displaystyle{ \oplus }[/math] be the concatenation operator, so that [math]\displaystyle{ \boldsymbol{x}_{1:n} = \boldsymbol{x}_{1} \oplus \boldsymbol{x}_2 \oplus \dots \oplus \boldsymbol{x}_n }[/math]. More generally, let [math]\displaystyle{ \boldsymbol{x}_{i:i+j} }[/math] denote the concatenation of the word vectors [math]\displaystyle{ \boldsymbol{x}_{i}, \boldsymbol{x}_{i+1}, \dots, \boldsymbol{x}_{i+j} }[/math].
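
The concatenation notation can be made concrete with a small NumPy sketch. It assumes the sentence is stored as an array of shape (n, k) with one word vector per row; the helper name concat_words and the toy values are hypothetical, chosen only for illustration.

```python
import numpy as np

def concat_words(x, i, j):
    """Return x_{i:i+j}: word vectors i through i+j (1-indexed, inclusive), concatenated."""
    # x has shape (n, k); rows i-1 .. i+j-1 are flattened into one vector of length (j+1)*k.
    return x[i - 1 : i + j].reshape(-1)

n, k = 5, 3
# Toy word vectors (assumption: placeholder values, not real embeddings).
x = np.arange(n * k, dtype=float).reshape(n, k)

sentence = concat_words(x, 1, n - 1)  # x_{1:n}, a vector of length n*k
window = concat_words(x, 2, 2)        # x_{2:4}, a vector of length 3*k
```

Here the whole sentence is a single vector in [math]\displaystyle{ \mathbb{R}^{nk} }[/math], and any window of consecutive words is a contiguous slice of it.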

We also consider a filter [math]\displaystyle{ w \in \mathbb{R}^{hk} }[/math], whose dimension matches a window of [math]\displaystyle{ h }[/math] consecutive word vectors.
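
As a rough illustration of how a filter of this size could act on a sentence, the sketch below slides it over every window of h consecutive words. The section does not yet state the operation, so the specifics are assumptions: each feature is taken to be a dot product between the filter and the window, plus a bias b, passed through tanh. The function name feature_map is hypothetical.

```python
import numpy as np

def feature_map(x, w, b=0.0):
    """Apply a filter w (length h*k) to every window of h words in x (shape (n, k))."""
    n, k = x.shape
    h = w.size // k  # window length implied by the filter dimension
    # Row i of `windows` is the concatenation of h consecutive word vectors.
    windows = np.stack([x[i : i + h].reshape(-1) for i in range(n - h + 1)])
    # Assumed form: one feature per window, tanh(w . window + b).
    return np.tanh(windows @ w + b)

rng = np.random.default_rng(0)
n, k, h = 7, 4, 3
x = rng.normal(size=(n, k))   # toy word vectors (assumption)
w = rng.normal(size=h * k)    # toy filter weights (assumption)
c = feature_map(x, w, b=0.1)  # one feature per window, n - h + 1 in total
```

Under these assumptions a sentence of length n yields n - h + 1 features for each filter.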

Model Regularization

Datasets and Experimental Setup

Hyperparameters and Training

MR:

SST-1:

SST-2:

Subj:

TREC:

CR:

MPQA:

Pre-trained Word Vectors

Model Variations

CNN-rand:

CNN-static:

CNN-non-static:

CNN-multichannel:

Training and Results

Criticisms

More Formulations/New Concepts

Conclusion

Source