Word Embedding: Continuous Space Representation for Natural Language
Abstract:

Word embedding refers to a machine learning technique that maps each word, lying in a high-dimensional discrete space (with dimension equal to the vocabulary size), to a real-valued vector in a low-dimensional continuous space. Word embeddings provide better semantic representations of words and thus greatly benefit text processing tasks. Meanwhile, the huge amount of unlabeled text data, together with the development of advanced machine learning techniques such as deep learning, makes it possible to obtain high-quality word embeddings effectively. This paper gives the definition and practical value of word embedding and reviews some classical methods for obtaining it, including methods based on neural networks, methods based on restricted Boltzmann machines, and methods based on factorization of the context co-occurrence matrix. For each model, its mathematical definition, physical meaning, and training procedure are introduced in detail. In addition, all these methods are compared in the aforementioned three aspects.
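One of the method families the abstract names, factorization of the context co-occurrence matrix, can be illustrated with a minimal sketch: build a word-context co-occurrence matrix from a toy corpus (the corpus, window size, and embedding dimension below are hypothetical choices for illustration, not from the paper) and apply truncated SVD to obtain a dense low-dimensional vector per word.

```python
import numpy as np

# Toy corpus and vocabulary (hypothetical illustration, not from the paper).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat and a dog played".split(),
]
vocab = sorted({w for sent in corpus for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Build a symmetric word-context co-occurrence matrix with window size 2.
V = len(vocab)
cooc = np.zeros((V, V))
window = 2
for sent in corpus:
    for i in range(len(sent)):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                cooc[index[sent[i]], index[sent[j]]] += 1.0

# Factorize: truncated SVD maps each word from the V-dimensional discrete
# space to a `dim`-dimensional continuous vector.
dim = 3
U, S, _ = np.linalg.svd(cooc, full_matrices=False)
embeddings = U[:, :dim] * S[:dim]  # one row per word

print(embeddings.shape)  # → (10, 3)
```

Words sharing contexts (here `cat` and `dog`) end up with similar rows, which is the sense in which such factorizations yield semantic representations.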

Citation

Chen Enhong, Qiu Siyu, Xu Chang, Tian Fei, Liu Tieyan. Word Embedding: Continuous Space Representation for Natural Language[J]., 2014, 29(1): 19-29.

History
  • Online: March 14, 2014