stat441w18/Convolutional Neural Networks for Sentence Classification

Presented by

1. Ben Schwarz

2. Cameron Miller

3. Hamza Mirza

4. Pavle Mihajlovic

5. Terry Shi

6. Yitian Wu

7. Zekai Shao

Introduction

Model

Model Settings

Consider a sentence of length [math]\displaystyle{ n }[/math], represented by [math]\displaystyle{ \boldsymbol{x}_{1:n} }[/math]. Let [math]\displaystyle{ \boldsymbol{x}_i \in \mathbb{R}^k }[/math] be the [math]\displaystyle{ k }[/math]-dimensional word vector for the [math]\displaystyle{ i }[/math]-th word in the sentence, and let [math]\displaystyle{ \oplus }[/math] be the concatenation operator, so that [math]\displaystyle{ \boldsymbol{x}_{1:n} = \boldsymbol{x}_{1} \oplus \boldsymbol{x}_2 \oplus \dots \oplus \boldsymbol{x}_n }[/math]. In general, let [math]\displaystyle{ \boldsymbol{x}_{i:i+j} }[/math] represent the concatenation of the word vectors [math]\displaystyle{ \boldsymbol{x}_{i}, \boldsymbol{x}_{i+1}, \dots, \boldsymbol{x}_{i+j} }[/math].

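We also consider a filter [math]\displaystyle{ w \in \mathbb{R}^{hk} }[/math], which is applied to a window of [math]\displaystyle{ h }[/math] consecutive words (a concatenated vector with [math]\displaystyle{ hk }[/math] entries) to produce a single feature.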
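The notation above can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions: the helper names `concat_window` and `apply_filter`, the bias `b`, the `tanh` non-linearity, and the random inputs are not taken from this page, and the code uses 0-based indices where the text uses 1-based ones.

```python
import numpy as np

def concat_window(x, i, j):
    """Return x_{i:i+j}: word vectors i..i+j concatenated into one vector.

    x is an (n, k) array with one k-dimensional word vector per row.
    Indices are 0-based here (hypothetical helper, not from the source).
    """
    return x[i:i + j + 1].reshape(-1)

def apply_filter(x, w, h, b=0.0):
    """Apply a filter w in R^{hk} to every window of h consecutive words.

    Assumption: each feature is tanh(w . x_{i:i+h-1} + b); the bias and the
    choice of non-linearity are illustrative, not stated on this page.
    """
    n, k = x.shape
    assert w.shape == (h * k,)
    return np.array([np.tanh(w @ concat_window(x, i, h - 1) + b)
                     for i in range(n - h + 1)])

rng = np.random.default_rng(0)
n, k, h = 7, 4, 3                      # sentence length, embedding size, window size
x = rng.standard_normal((n, k))        # stand-in for the word vectors x_1, ..., x_n
w = rng.standard_normal(h * k)         # stand-in for the filter w

sentence = concat_window(x, 0, n - 1)  # x_{1:n}, a vector with n*k entries
features = apply_filter(x, w, h)       # one feature per window, n-h+1 in total
print(sentence.shape, features.shape)
```

With a sentence of [math]\displaystyle{ n }[/math] words and a window of [math]\displaystyle{ h }[/math] words there are [math]\displaystyle{ n-h+1 }[/math] possible windows, so the sketch returns that many features.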

Model Regularization

Datasets and Experimental Setup

Hyperparameters and Training

MR:

SST-1:

SST-2:

Subj:

TREC:

CR:

MPQA:

Pre-trained Word Vectors

Model Variations

CNN-rand:

CNN-static:

CNN-non-static:

CNN-multichannel:

Training and Results

Criticisms

More Formulations/New Concepts

Conclusion

Source