LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Introduction

The study of natural language processing has been around for more than fifty years. It began in the 1950s, when the field of natural language processing (NLP) was still embedded within linguistics (Hirschberg & Manning, 2015). After the emergence of strong computational power, computational linguistics began to evolve and gradually branch out into various applications of NLP, such as text classification, speech recognition and question answering (https://machinelearningmastery.com/applications-of-deep-learning-for-natural-language-processing/). Computational linguistics, or natural language processing, is usually defined as the "subfield of computer science concerned with using computational techniques to learn, understand, and produce human language content" (Hirschberg & Manning, 2015, p. 261).

Motivation

With the development of deep neural networks, one type of network, the recurrent neural network (RNN), has performed significantly well on many natural language processing tasks. The reason such networks work better than others is that an RNN takes into account the past inputs as well as the current input, without suffering severely from vanishing or exploding gradients. More detail on how an RNN works is given in the section on recurrent neural networks. However, one limitation of RNNs used in NLP is the enormous size of the input vocabulary: both the input embedding table and the output softmax layer grow linearly with the vocabulary, which results in a very complex RNN model with too many parameters to train and makes the training process both time- and memory-consuming. This is the major motivation for the paper's authors to develop a new technique for RNNs, called LightRNN, that is particularly efficient at processing large vocabularies in many NLP tasks.
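To make the size argument concrete, the following back-of-the-envelope sketch (plain Python; the 10-million-word vocabulary is an illustrative assumption, not a figure from the paper's experiments) compares the number of word-representation vectors a conventional RNN language model stores with the roughly 2·√|V| vectors needed by LightRNN's 2-Component shared embedding:

<pre>
import math

def embedding_vectors(vocab_size):
    """Number of word-representation vectors needed on the input side.

    A conventional RNN language model keeps one embedding vector per word,
    while LightRNN places the words into a (sqrt|V| x sqrt|V|) table and
    only stores one vector per table row plus one per table column.
    """
    table_side = math.ceil(math.sqrt(vocab_size))
    conventional = vocab_size      # one vector per word
    lightrnn = 2 * table_side      # one per row + one per column
    return conventional, lightrnn

# Illustrative vocabulary of 10 million words (an assumed figure).
conv, light = embedding_vectors(10_000_000)
print(conv, light)   # 10000000 vs 6326: orders of magnitude fewer vectors
</pre>

The same reduction applies to the output softmax layer, which is why the memory savings of LightRNN grow with the vocabulary size.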

Natural Language Processing (NLP) Using Recurrent Neural Network (RNN)

LightRNN Structure

Part I: 2-Component Shared Embedding

Based on the LightRNN structure diagram from the original paper, the following formulas are used:

Let [math]\displaystyle{ n }[/math] be the dimension of a row input vector and [math]\displaystyle{ m }[/math] the dimension of a column input vector:

row vector [math]\displaystyle{ x^{r}_{t-1} \in \mathbb{R}^n }[/math]
column vector [math]\displaystyle{ x^{c}_{t-1} \in \mathbb{R}^m }[/math]
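As a concrete illustration of these lookups, here is a minimal NumPy sketch (the toy vocabulary, dimensions, and word-to-table allocation are assumptions made for illustration, not the authors' code) that allocates a small vocabulary into a word table and retrieves the two shared components of a word:

<pre>
import numpy as np

# Toy vocabulary allocated into a 2 x 2 word table in row-major order;
# in LightRNN the allocation itself is learned by the bootstrap procedure
# described in Part III.
vocab = ["the", "cat", "sat", "down"]
table_side = 2                      # ceil(sqrt(|V|))
n, m = 4, 4                         # row / column embedding dimensions

rng = np.random.default_rng(0)
row_embed = rng.normal(size=(table_side, n))   # one shared vector per row
col_embed = rng.normal(size=(table_side, m))   # one shared vector per column

def two_component_lookup(word):
    """Return the shared row vector x^r and column vector x^c of a word."""
    idx = vocab.index(word)
    r, c = divmod(idx, table_side)  # position (r, c) of the word in the table
    return row_embed[r], col_embed[c]

x_r, x_c = two_component_lookup("sat")   # "sat" sits at row 1, column 0
print(x_r.shape, x_c.shape)              # (4,) (4,)
</pre>

In the full model, the row and column vectors of consecutive words are fed into the recurrent unit in an alternating fashion, as described in Part II.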

Part II: How 2C Shared Embedding is Used in LightRNN

Part III: Bootstrap for Word Allocation

  1. Randomly allocate the words of the vocabulary into the word table (the initial bootstrap allocation).
  2. Train the LightRNN model, i.e. the shared row and column embedding vectors, based on the current allocation until a stopping criterion (e.g., convergence of the training loss) is met.
  3. Fix the trained embeddings and refine the allocation of words in the table so as to minimize the training loss; in the paper this reallocation is formulated as a minimum-cost maximum-flow (MCMF) problem. Repeat from step 2 until the perplexity stops improving (a simplified sketch of this reallocation step follows the list).
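The sketch below illustrates the reallocation in step 3 with a deliberately simplified greedy assignment in place of the minimum-cost maximum-flow solver used in the paper; the loss matrices and all names are assumptions made up for illustration:

<pre>
import numpy as np

def greedy_reallocate(row_loss, col_loss):
    """Greedily re-assign words to table slots.

    row_loss[w, r] / col_loss[w, c] hold the loss incurred if word w is
    placed in row r / column c, computed with the embeddings held fixed.
    The paper solves this assignment exactly as a minimum-cost maximum-flow
    problem; here slots are filled in order of increasing cost instead.
    """
    n_words, side = row_loss.shape
    # Cost of putting word w into slot (r, c) = row_loss[w, r] + col_loss[w, c].
    costs = row_loss[:, :, None] + col_loss[:, None, :]       # (V, side, side)
    order = np.dstack(np.unravel_index(np.argsort(costs, axis=None),
                                       costs.shape))[0]       # cheapest first
    allocation = {}                  # word index -> (row, col)
    taken = set()                    # slots already occupied
    for w, r, c in order:
        if w in allocation or (r, c) in taken:
            continue
        allocation[w] = (r, c)
        taken.add((r, c))
    return allocation

# Toy example: 4 words, 2 x 2 table, random per-slot losses (illustrative only).
rng = np.random.default_rng(0)
alloc = greedy_reallocate(rng.random((4, 2)), rng.random((4, 2)))
print(alloc)
</pre>

In the paper the assignment is solved exactly via MCMF; the greedy version here is only meant to show the shape of the computation, since it can be far from optimal.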


LightRNN Example

Remarks