stat441w18/Convolutional Neural Networks for Sentence Classification
Revision as of 18:03, 4 March 2018
Presented by
1. Ben Schwarz
2. Cameron Miller
3. Hamza Mirza
4. Pavle Mihajlovic
5. Terry Shi
6. Yitian Wu
7. Zekai Shao
Introduction
Model
Theory of Convolutional Neural Networks
Let [math]\displaystyle{ \boldsymbol{x}_i \in \mathbb{R}^k }[/math] denote the k-dimensional word vector of the [math]\displaystyle{ i }[/math]-th word in a sentence, and let [math]\displaystyle{ \boldsymbol{x}_{i:i+j} }[/math] be the concatenation of the word vectors [math]\displaystyle{ \boldsymbol{x}_i, \boldsymbol{x}_{i+1}, \dots, \boldsymbol{x}_{i+j} }[/math]. A sentence of length [math]\displaystyle{ n }[/math] is then represented as [math]\displaystyle{ \boldsymbol{x}_{1:n} = \boldsymbol{x}_1 \oplus \boldsymbol{x}_2 \oplus \dots \oplus \boldsymbol{x}_n }[/math], where [math]\displaystyle{ \oplus }[/math] is the concatenation operator.
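The notation above can be illustrated with a minimal sketch. It assumes word vectors are stored as plain Python lists; the helper name concat and the numeric values are hypothetical and chosen only for illustration.

```python
# Sketch only: word vectors are plain lists, and the values are made up.
def concat(words, i, j):
    """Return x_{i:i+j}, the concatenation of words i..i+j (1-indexed, inclusive)."""
    window = words[i - 1 : i + j]  # j + 1 consecutive word vectors
    return [value for word in window for value in word]

# A sentence of n = 3 words, each a k = 2 dimensional vector.
sentence = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# x_{1:n} = x_1 (+) x_2 (+) ... (+) x_n, a single vector of length n * k.
x_1_n = concat(sentence, 1, len(sentence) - 1)
```

Under these assumptions, x_1_n has length n·k = 6, and concat(sentence, 2, 1) gives the two-word window starting at the second word.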
A Convolutional Neural Network (CNN) is a nonlinear function [math]\displaystyle{ \boldsymbol{f}: \mathbb{R}^{hk} \to \mathbb{R} }[/math] applied to each window of [math]\displaystyle{ h }[/math] consecutive words [math]\displaystyle{ \boldsymbol{x}_i, \boldsymbol{x}_{i+1}, \dots, \boldsymbol{x}_{i+h-1} }[/math], whose concatenation is [math]\displaystyle{ \boldsymbol{x}_{i:i+h-1} }[/math]. Each window yields one output [math]\displaystyle{ c_i }[/math], so a sentence of length [math]\displaystyle{ n }[/math] produces the series [math]\displaystyle{ c_1, c_2, \dots, c_{n-h+1} }[/math].
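The sliding-window computation can be sketched as follows. This section does not specify the form of f, so the sketch assumes a common choice: a dot product with a weight vector w of length h·k, plus a bias b, passed through tanh. The names feature_map, w and b and all numeric values are hypothetical.

```python
import math

def feature_map(sentence, w, b, h):
    """Apply f to every window of h consecutive words.

    Assumption (not stated in the text): f(x) = tanh(w . x + b).
    Returns [c_1, ..., c_{n-h+1}].
    """
    n = len(sentence)
    features = []
    for i in range(n - h + 1):
        # x_{i:i+h-1}: concatenation of h word vectors, length h * k
        window = [v for word in sentence[i : i + h] for v in word]
        pre_activation = sum(wi * xi for wi, xi in zip(w, window)) + b
        features.append(math.tanh(pre_activation))
    return features

# n = 4 words, k = 2 dimensions, window size h = 2 (illustrative values).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
w = [0.5, -0.5, 0.25, 0.25]  # h * k = 4 weights
b = 0.0
c = feature_map(sentence, w, b, h=2)
```

With n = 4 and h = 2, the sketch returns n − h + 1 = 3 outputs, one per window.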
Model Regularization
Datasets and Experimental Setup
Hyperparameters and Training
MR:
SST-1:
SST-2:
Subj:
TREC:
CR:
MPQA:
Pre-trained Word Vectors
Model Variations
CNN-rand:
CNN-static:
CNN-non-static:
CNN-multichannel: