Classification of Semantic Paraphasias: Optimization of a Word Embedding Model

2019
Author(s): Katy McKinney-Bock, Steven Bedrick
2018, Vol 6 (3), pp. 67-78
Author(s): Tian Nie, Yi Ding, Chen Zhao, Youchao Lin, Takehito Utsuro

This article addresses the question of how to give an overview of the knowledge associated with a given query keyword, focusing on the concerns of users who search for web pages with that keyword. The Web search information needs of a given query keyword are collected through search engine suggests. For each query keyword, the authors collect up to around 1,000 suggests, many of which are redundant, and classify the redundant search engine suggests with a topic model. One limitation of topic-model-based classification, however, is that the granularity of the topics, i.e., the clusters of search engine suggests, is too coarse. To overcome this coarse-grained classification, the article further applies a word embedding technique to the web pages used during the training of the topic model, in addition to the text of the whole Japanese version of Wikipedia. The authors then examine the word-embedding-based similarity between search engine suggests, and classify the suggests within a single topic into finer-grained subtopics based on that similarity. Evaluation results show that the proposed approach performs well on the task of subtopic classification of search engine suggests.
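The subtopic step can be sketched as grouping suggests within one topic by the cosine similarity of their embedding vectors. A minimal illustration, using toy 2-d vectors in place of embeddings trained on the crawled web pages and Japanese Wikipedia; the greedy single-seed grouping and the 0.8 threshold are illustrative assumptions, not the paper's exact procedure:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def subtopics(suggests, vectors, threshold=0.8):
    # Greedy grouping: attach each suggest to the first subtopic whose
    # seed vector is similar enough, otherwise start a new subtopic.
    groups = []  # list of (seed_vector, member_suggests)
    for s, v in zip(suggests, vectors):
        for seed, members in groups:
            if cosine(seed, v) >= threshold:
                members.append(s)
                break
        else:
            groups.append((v, [s]))
    return [members for _, members in groups]

# Toy suggests with hand-made vectors: "price" and "cost" point the
# same way, "review" is orthogonal, so two subtopics emerge.
print(subtopics(["price", "cost", "review"],
                [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]))
# → [['price', 'cost'], ['review']]
```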


Sentiment classification is one of the best-known and most popular tasks in machine learning and natural language processing: an algorithm is developed to understand the opinion expressed about an entity, much as a human reader would. This article presents such a study. Concepts from natural language processing are used for text representation, and a novel word embedding model is then proposed for effective classification of the data. TF-IDF and the common bag-of-words (BoW) representation models are considered for representing the text data, and the importance of these models is discussed in the respective sections. The proposed model is tested on the IMDB dataset, using a 50% training / 50% testing split with three random shufflings of the data for evaluation.
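The two baseline representations mentioned above can be sketched in a few lines. A minimal pure-Python illustration on toy documents (standing in for the IMDB reviews), using raw term counts for BoW and the common tf × (log(N/df) + 1) weighting for TF-IDF; real systems vary in their smoothing and normalization choices:

```python
import math
from collections import Counter

def bow(doc, vocab):
    # Bag-of-words: raw count of each vocabulary term in the document.
    counts = Counter(doc.split())
    return [counts[w] for w in vocab]

def tfidf(docs):
    # TF-IDF over a small corpus: term frequency times inverse
    # document frequency (the +1 keeps ubiquitous terms nonzero).
    tokenized = [d.split() for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    df = {w: sum(1 for toks in tokenized if w in toks) for w in vocab}
    idf = {w: math.log(n / df[w]) + 1.0 for w in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[w] / len(toks) * idf[w] for w in vocab])
    return vocab, vectors

docs = ["good movie", "bad movie", "good plot"]
vocab, vectors = tfidf(docs)
print(vocab)                 # → ['bad', 'good', 'movie', 'plot']
print(bow(docs[0], vocab))   # → [0, 1, 1, 0]
```

Note how "movie", appearing in two of the three documents, gets a lower IDF weight than the rarer "plot"; that down-weighting of common terms is the point of TF-IDF over plain BoW.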


2020
Author(s): Luiz Fernando Spillere de Souza, Alexandre Leopoldo Gonçalves

Text classification aims to extract knowledge from unstructured text. Word embedding is a representation technique that allows words with similar meanings to have similar representations, so that characteristics of a word's use and meaning are captured in the representation itself. The aim of this article is to analyze previously published work on the use of word embeddings applied to text classification and to propose a practical application that demonstrates their effectiveness. The study supports the effectiveness of word embeddings for text classification, reaching an accuracy of around 73%.
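A common way to turn word embeddings into a text classifier, in the spirit of the approach above, is to average the word vectors of a document and classify the result. A minimal sketch with toy 2-d vectors standing in for trained embeddings (e.g., word2vec or GloVe) and a nearest-centroid classifier; the article's actual embeddings and classifier are not specified here, so every name below is illustrative:

```python
import math

# Toy 2-d embeddings standing in for trained word vectors: similar
# words (hypothetically) sit close together in the space.
EMB = {"great": (0.9, 0.1), "excellent": (0.8, 0.2),
       "awful": (0.1, 0.9), "terrible": (0.2, 0.8)}

def doc_vector(text):
    # Represent a document as the average of its word embeddings.
    vecs = [EMB[w] for w in text.split() if w in EMB]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))

def train_centroids(docs, labels):
    # One centroid per class: the mean of its documents' vectors.
    by_label = {}
    for d, y in zip(docs, labels):
        by_label.setdefault(y, []).append(doc_vector(d))
    return {y: tuple(sum(c) / len(vs) for c in zip(*vs))
            for y, vs in by_label.items()}

def predict(centroids, text):
    # Assign the class whose centroid is nearest to the document vector.
    v = doc_vector(text)
    return min(centroids, key=lambda y: math.dist(v, centroids[y]))

centroids = train_centroids(["great excellent", "awful terrible"],
                            ["pos", "neg"])
print(predict(centroids, "excellent great"))  # → pos
```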

