Distributed Representations of Words and Phrases and their Compositionality
Distributed Representations of Words and Phrases and their Compositionality is a popular paper published in 2013 by a Google team led by Tomas Mikolov. It is known for its impact on the field of Natural Language Processing, and the techniques described below are still in practice today.
Presented by
- F. Jiang
- J. Hu
- Y. Zhang
Introduction
The Skip-gram model is a method of constructing a Word2Vec encoding using a neural network. Individual words are encoded
Skip-gram Model
Hierarchical Softmax
Negative Sampling
Subsampling of Frequent Words
Empirical Results
References
[1] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean (2013). Distributed Representations of Words and Phrases and their Compositionality. arXiv:1310.4546.
[2] McCormick, C. (2017, January 11). Word2Vec Tutorial Part 2 - Negative Sampling. Retrieved from http://www.mccormickml.com