STAT946F17/ Dance Dance Convolution

=Introduction=


==Background Knowledge==
*NMT
'''Neural Machine Translation (NMT)''', which is based on deep neural networks and provides an end-to-end solution to machine translation, uses an '''RNN-based encoder-decoder architecture''' to model the entire translation process. Specifically, an NMT system first reads the source sentence using an encoder to build a "thought" vector, a sequence of numbers that represents the sentence meaning; a decoder then processes this "thought" vector to emit a translation (Figure 1)<sup>[[#References|[1]]]</sup>.
[[File:VNFigure1.png|thumb|600px|center|Figure 1: Encoder-decoder architecture – example of a general approach for NMT.]]
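To make the encoder-decoder pipeline concrete, below is a minimal sketch in PyTorch (the reference implementation in [1] is in TensorFlow); the class names, layer sizes, and vocabulary sizes are illustrative assumptions, not code from the reference.
<pre>
# Minimal PyTorch sketch of the RNN-based encoder-decoder in Figure 1.
# All sizes (embedding, hidden, vocabularies) are illustrative assumptions.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids; the final hidden state acts
        # as the "thought" vector summarizing the source sentence
        _, thought = self.rnn(self.embed(src))
        return thought

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt, thought):
        # condition generation on the thought vector and score every
        # target-vocabulary word at each time step
        outputs, _ = self.rnn(self.embed(tgt), thought)
        return self.out(outputs)

# Usage: encode two source sentences, then score shifted target inputs.
enc, dec = Encoder(vocab_size=10000), Decoder(vocab_size=8000)
src = torch.randint(0, 10000, (2, 7))
tgt = torch.randint(0, 8000, (2, 5))
logits = dec(tgt, enc(src))   # shape: (2, 5, 8000)
</pre>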
*Beam Search
Decoding process:
[[File:VNFigure2.png|thumb|600px|center|Figure 2]]
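As a sketch of the greedy strategy that Figure 2 illustrates, the loop below commits to the single best word at each time step t. Here <code>step_scores</code> is a hypothetical next-word scorer (not from [1]): it takes a partial translation and returns a dict of word → log-probability from the decoder.
<pre>
# Hedged sketch of greedy decoding: always take the locally best word.
def greedy_decode(step_scores, max_len=20, eos="EOS"):
    prefix = []
    for _ in range(max_len):
        scores = step_scores(prefix)
        word = max(scores, key=scores.get)  # locally best word only
        prefix.append(word)
        if word == eos:
            break
    return prefix
</pre>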
Problem: choosing the word with the highest score at each time step t does not necessarily produce the sentence with the highest overall probability (Figure 2). Beam search solves this problem (Figure 3): with beam size m, at each time step t it keeps the top m proposals and continues decoding each of them. In the end, this yields the sentence with the highest probability at the sentence level rather than the word level.
[[File:VNFigure3.png|thumb|600px|center|Figure 3]]
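Below is a sketch of beam search under the same assumptions, reusing the hypothetical <code>step_scores</code> scorer from the greedy example. With m = 1 it reduces to greedy decoding; length normalization of finished hypotheses is omitted for brevity.
<pre>
# Sketch of beam search with beam size m. Log-probabilities add along a
# hypothesis, so the winner maximizes sentence-level probability over
# the kept beams rather than making word-level choices.
import heapq

def beam_search(step_scores, m=4, max_len=20, eos="EOS"):
    beams = [(0.0, [])]            # (cumulative log-prob, word prefix)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, prefix in beams:
            for word, lp in step_scores(prefix).items():
                candidates.append((logp + lp, prefix + [word]))
        # keep only the top-m partial translations at this time step
        top = heapq.nlargest(m, candidates, key=lambda c: c[0])
        finished += [c for c in top if c[1][-1] == eos]
        beams = [c for c in top if c[1][-1] != eos]
        if not beams:
            break
    # no length normalization here; real systems usually add it
    best = max(finished or beams, key=lambda c: c[0])
    return best[1]
</pre>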


=References=


1. https://github.com/tensorflow/nmt
