Graph-Based Topic Extraction from Vector Embeddings of Text Documents: Application to a Corpus of News Articles

Author(s):  
M. Tarik Altuncu ◽  
Sophia N. Yaliraki ◽  
Mauricio Barahona
2019 ◽  
Vol 8 (3) ◽  
pp. 6634-6643 ◽  

Opinion mining and sentiment analysis are valuable for extracting useful subjective information from text documents. Predicting customers’ opinions on Amazon products has several benefits, such as reducing customer churn, agent monitoring, handling multiple customers, tracking overall customer satisfaction, quick escalations, and upselling opportunities. However, identifying users’ sentiments in large datasets is challenging because the data are unstructured and contain slang, misspellings, and abbreviations. To address this problem, a new system is proposed in this study. The proposed system comprises four major phases: data collection, pre-processing, keyword extraction, and classification. First, the input data were collected from the Amazon customer review dataset. Pre-processing was then carried out to enhance the quality of the collected data; this phase comprises three steps: lemmatization, review spam detection, and removal of stop-words and URLs. Next, a topic modelling approach, Latent Dirichlet Allocation (LDA), was applied together with modified Possibilistic Fuzzy C-Means (PFCM) to extract the keywords and to help identify the topics concerned. The extracted keywords were classified into three classes (positive, negative, and neutral) using a machine learning classifier, a Convolutional Neural Network (CNN). The experimental results showed that the proposed system improved sentiment analysis accuracy by up to 6-20% relative to existing systems.
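As a rough illustration of the stop-word and URL removal step mentioned in the abstract above, the following minimal Python sketch cleans a single review. The regular expression, the tiny stop-word list, and the function name `clean_review` are assumptions made for this example; the abstract does not specify how the authors implemented the step, and lemmatization and spam detection are not covered here.

```python
import re

# Assumed URL pattern: anything starting with http://, https:// or www.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")

# Tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = frozenset({"the", "a", "an", "is", "it", "and", "of", "to", "this"})


def clean_review(text):
    """Lower-case a review, drop URLs, tokenize, and remove stop-words."""
    text = URL_RE.sub(" ", text.lower())
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]


print(clean_review("This is a GREAT phone, see https://example.com/x and www.shop.com"))
```

The remaining tokens could then be passed to later stages such as keyword extraction.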


Author(s):  
Laith Mohammad Abualigah ◽  
Essam Said Hanandeh ◽  
Ahamad Tajudin Khader ◽  
Mohammed Abdallh Otair ◽  
Shishir Kumar Shandilya

Background: Given the increasing volume of text documents on Internet pages, dealing with such a tremendous amount of information becomes highly complex. Text clustering is a common optimization problem in which a large amount of text information is organized into comparable and coherent clusters. Aims: This paper presents a novel local clustering technique, namely β-hill climbing, which addresses the text document clustering problem by partitioning similar documents into the same cluster. Methods: The β parameter is the primary innovation of the β-hill climbing technique. It was introduced to strike a balance between local and global search. Local search methods such as k-medoid and k-means have been applied successfully to the text document clustering problem. Results: Experiments were conducted on eight standard benchmark text datasets with different characteristics taken from the Laboratory of Computational Intelligence (LABIC). The results showed that the proposed β-hill climbing achieved better results than the original hill climbing technique on the text clustering problem. Conclusion: Adding the β operator to hill climbing improves text clustering performance.
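To make the role of the β parameter more concrete, here is a minimal Python sketch of a β-hill climbing loop applied to clustering. It is an illustration under stated assumptions, not the authors' implementation: documents are assumed to be small numeric vectors, a solution is a list of cluster labels, the objective is the within-cluster sum of squared distances, and the names `cost` and `beta_hill_climbing` as well as the default values are hypothetical.

```python
import random


def cost(assign, points, k):
    """Sum of squared distances from each point to its cluster centroid."""
    dims = len(points[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for a, p in zip(assign, points):
        counts[a] += 1
        for d in range(dims):
            sums[a][d] += p[d]
    total = 0.0
    for a, p in zip(assign, points):
        centroid = [s / counts[a] for s in sums[a]]
        total += sum((x - c) ** 2 for x, c in zip(p, centroid))
    return total


def beta_hill_climbing(points, k, beta=0.05, iters=2000, seed=0):
    """Return (assignment, cost) after a fixed number of iterations."""
    rng = random.Random(seed)
    n = len(points)
    current = [rng.randrange(k) for _ in range(n)]
    current_cost = cost(current, points, k)
    for _ in range(iters):
        candidate = current[:]
        # Local move: reassign one randomly chosen document.
        candidate[rng.randrange(n)] = rng.randrange(k)
        # Beta operator: each label is reset at random with probability beta,
        # which adds a global-search component to the local search.
        for i in range(n):
            if rng.random() < beta:
                candidate[i] = rng.randrange(k)
        candidate_cost = cost(candidate, points, k)
        # Keep the candidate only if it is at least as good.
        if candidate_cost <= current_cost:
            current, current_cost = candidate, candidate_cost
    return current, current_cost


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
labels, final_cost = beta_hill_climbing(points, k=2)
print(labels, final_cost)
```

With `beta=0` the loop reduces to plain hill climbing with single-label moves; larger values of `beta` make each candidate more exploratory.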


2020 ◽  
Vol 87 ◽  
pp. 106002 ◽  
Author(s):  
Ammar Kamal Abasi ◽  
Ahamad Tajudin Khader ◽  
Mohammed Azmi Al-Betar ◽  
Syibrah Naim ◽  
Sharif Naser Makhadmeh ◽  
...  

2021 ◽  
pp. 1-13
Author(s):  
Qingtian Zeng ◽  
Xishi Zhao ◽  
Xiaohui Hu ◽  
Hua Duan ◽  
Zhongying Zhao ◽  
...  

Word embeddings have been successfully applied in many natural language processing tasks due to their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text documents ignore emotional information, which is a significant research problem that must be addressed. To solve this problem, we propose an emotional word embedding (EWE) model for sentiment analysis in this paper. This method first applies pre-trained word vectors to represent document features using two different linear weighting methods. Then, the resulting document vectors are input to a classification model and used to train a text sentiment classifier, which is based on a neural network. In this way, the emotional polarity of the text is propagated into the word vectors. The experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model achieves superior performance on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.
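The step of building document vectors from pre-trained word vectors by linear weighting can be sketched as below. The abstract does not say which two weighting methods the authors use, so this example only shows the general form: a normalized weighted sum, with uniform weights as the default. The function name `document_vector` and the toy two-dimensional vectors are assumptions for illustration.

```python
def document_vector(tokens, word_vectors, weights=None):
    """Combine word vectors into one document vector by a weighted average.

    tokens: list of words in the document.
    word_vectors: dict mapping word -> list of floats (pre-trained vectors).
    weights: optional dict mapping word -> non-negative weight;
             uniform weights are assumed when omitted.
    """
    known = [t for t in tokens if t in word_vectors]
    if not known:
        raise ValueError("no token has a pre-trained vector")
    if weights is None:
        weights = {t: 1.0 for t in known}
    total = sum(weights.get(t, 0.0) for t in known)
    if total == 0:
        raise ValueError("all weights are zero")
    dim = len(word_vectors[known[0]])
    vec = [0.0] * dim
    for t in known:
        w = weights.get(t, 0.0) / total
        for d in range(dim):
            vec[d] += w * word_vectors[t][d]
    return vec


toy_vectors = {"good": [1.0, 0.0], "bad": [0.0, 1.0]}
print(document_vector(["good", "bad"], toy_vectors))
print(document_vector(["good", "bad"], toy_vectors, {"good": 3.0, "bad": 1.0}))
```

In a full pipeline, vectors like these would be the inputs to the neural sentiment classifier described in the abstract.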


2021 ◽  
Vol 11 (5) ◽  
pp. 663
Author(s):  
Elena D. Bazhanova ◽  
Alexander A. Kozlov ◽  
Anastasia V. Litovchenko

Epilepsy is a chronic neurological disorder characterized by recurring spontaneous seizures. Drug resistance appears in 30% of patients and can lead to premature death, brain damage or a reduced quality of life. The purpose of the study was to analyze the mechanisms of drug resistance, especially neuroinflammation, in epileptogenesis. The biomedical literature databases Scopus, PubMed, Google Scholar and SciVerse were used. To obtain full-text documents, the electronic resources of PubMed Central and Research Gate were used. The article examines recent research on the mechanisms of drug resistance in epilepsy and discusses the hypotheses of drug resistance development (genetic, epigenetic, target hypothesis, etc.). Drug-resistant epilepsy is associated with neuroinflammatory, autoimmune and neurodegenerative processes. Neuroinflammation causes immune, pathophysiological, biochemical and psychological consequences. Focal or systemic unregulated inflammatory processes lead to the formation of aberrant neural connections and hyperexcitable neural networks. Inflammatory mediators affect the endothelium of cerebral vessels, destroy contacts between endothelial cells and induce abnormal angiogenesis (the formation of “leaky” vessels), thereby affecting the blood–brain barrier permeability. Thus, the analysis of pro-inflammatory and other components of epileptogenesis can contribute to the further development of the therapeutic treatment of drug-resistant epilepsy.


2008 ◽  
Vol 7 (2) ◽  
pp. 118-132 ◽  
Author(s):  
John Stasko ◽  
Carsten Görg ◽  
Zhicheng Liu

Investigative analysts who work with collections of text documents connect embedded threads of evidence in order to formulate hypotheses about plans and activities of potential interest. As the number of documents and the corresponding number of concepts and entities within the documents grow larger, sense-making processes become more and more difficult for the analysts. We have developed a visual analytic system called Jigsaw that represents documents and their entities visually in order to help analysts examine them more efficiently and develop theories about potential actions more quickly. Jigsaw provides multiple coordinated views of document entities with a special emphasis on visually illustrating connections between entities across the different documents.

