'''CNN Pipeline:'''
The goal of the CNN pipeline is to learn the relative importance of words in an input sequence with respect to different aspects. The pipeline proceeds in the following steps:
<ol>
<li> Given a sequence of words, each word is converted into a word vector using the word2vec algorithm, yielding a matrix X.
</li>
<li> The word vectors are then convolved along the temporal dimension with filters of various sizes (i.e., different values of K) with learnable weights, capturing numerical K-gram representations. These K-gram representations are stored in a matrix C.
<ul>
<li> The convolution makes this step capture local, position-invariant features: local because the K words are contiguous, and position-invariant because K contiguous words are detected at any position in the sequence.
</li>
<li> Temporal-dimension example: convolve words 1 to K, then words 2 to K+1, and so on.
</li>
</ul>
</li>
<li> Since not all K-gram representations are equally meaningful, a learnable matrix W takes linear combinations of the K-gram representations, weighing more heavily those most important for the classification task.
</li>
<li> Each linear combination of the K-gram representations gives the relative importance of each word under the aspect that the combination encodes.
</li>
<li> Stacking the relative word importances across aspects gives an interpretable attention matrix A, where each entry is the relative importance of a specific word for a specific aspect.
</li>
</ol>
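The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's implementation: the sizes, the random (untrained) filter and W weights, the zero padding, and the ReLU nonlinearity are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_attention(X, kernel_sizes=(2, 3), n_filters=4, n_aspects=3):
    """X: (seq_len, embed_dim) matrix of word vectors (step 1).
    Returns the attention matrix A of shape (seq_len, n_aspects)."""
    seq_len, embed_dim = X.shape
    kgram_maps = []
    for K in kernel_sizes:
        # Learnable filters for this K (random stand-ins here)
        filt = rng.standard_normal((n_filters, K, embed_dim)) * 0.1
        # Zero-pad so every word position gets a K-gram feature
        Xpad = np.vstack([X, np.zeros((K - 1, embed_dim))])
        # Temporal convolution: words 1..K, then 2..K+1, and so on
        C_k = np.stack([np.einsum('fkd,kd->f', filt, Xpad[t:t + K])
                        for t in range(seq_len)])       # (seq_len, n_filters)
        kgram_maps.append(np.maximum(C_k, 0))           # ReLU (assumed)
    C = np.concatenate(kgram_maps, axis=1)              # matrix C of K-gram features
    # Learnable W: linear combinations of K-gram representations per aspect
    W = rng.standard_normal((C.shape[1], n_aspects)) * 0.1
    scores = C @ W
    # Softmax over words: each column of A is one aspect's word importances
    e = np.exp(scores - scores.max(axis=0))
    return e / e.sum(axis=0)
```

Each column of the returned A sums to one, so it reads directly as the relative word importances for one aspect.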
== Merging RNN & CNN Pipeline Outputs ==
Revision as of 19:02, 8 November 2020
Combine Convolution with Recurrent Networks for Text Classification
'''Team Members:''' Bushra Haque, Hayden Jones, Michael Leung, Cristian Mustatea

'''Date:''' Week of Nov 23