Word Embedding Based Document Similarity for the Inferring of Penalty

Web Information Systems and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-030-02934-0_22 ◽

2018 ◽

pp. 240-251

Author(s):

Tieke He ◽

Hao Lian ◽

Zemin Qin ◽

Zhipeng Zou ◽

Bin Luo

Keyword(s):

Word Embedding ◽

Document Similarity

Bilingual Document Similarity Calculation Based on Bilingual Word Embedding

Advances in Intelligent Systems and Computing - Recent Developments in Intelligent Computing, Communication and Devices ◽

10.1007/978-981-10-8944-2_95 ◽

2018 ◽

pp. 821-830

Author(s):

Wanjin Che ◽

Zhengtao Yu ◽

Jian Huang ◽

Shengxiang Gao

Keyword(s):

Word Embedding ◽

Document Similarity ◽

Similarity Calculation

A Document Similarity Computation Method Based on Word Embedding and Citation Analysis

Advances in Intelligent Systems and Computing - Recent Findings in Intelligent Computing Techniques ◽

10.1007/978-981-10-8633-5_17 ◽

2018 ◽

pp. 161-168

Author(s):

K. Lamiya ◽

Anuraj Mohan

Keyword(s):

Citation Analysis ◽

Word Embedding ◽

Computation Method ◽

Document Similarity ◽

Similarity Computation

Text Genre Detection Using Doc2Vec Word-embedding Language Model

Language and Information ◽

10.29403/li.23.2.2 ◽

2019 ◽

Vol 23 (2) ◽

pp. 23-43

Author(s):

Dongsung Kim

Keyword(s):

Language Model ◽

Word Embedding

A Simple Word Embedding Model for Lexical Substitution

10.3115/v1/w15-1501 ◽

2015

Author(s):

Oren Melamud ◽

Omer Levy ◽

Ido Dagan

Keyword(s):

Word Embedding ◽

Lexical Substitution

Word Embedding Based Knowledge Representation with Extracting Relationship Between Scientific Terminologies

Intelligent Automation & Soft Computing ◽

10.31209/2019.100000135 ◽

2019

Author(s):

Mucheol Kim ◽

Junho Kim ◽

Mincheol Shin

Keyword(s):

Knowledge Representation

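Pembobotan Berdasarkan Tingkat Kesamaan Semantik pada Metode Fuzzy Semi-Supervised Co-Clustering untuk Pengelompokkan Dokumen Teks [Weighting Based on the Degree of Semantic Similarity in the Fuzzy Semi-Supervised Co-Clustering Method for Text Document Clustering]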

Jurnal ULTIMATICS ◽

10.31937/ti.v6i2.333 ◽

2014 ◽

Vol 6 (2) ◽

pp. 46-51

Author(s):

Galang Amanda Dwi P. ◽

Gregorius Edward ◽

Agus Zainal Arifin

Keyword(s):

Supervised Learning ◽

Semantic Similarity ◽

The Other ◽

Classification Result ◽

Document Similarity ◽

Index Terms ◽

Membership Value ◽

Degree Of Similarity

Nowadays, a large amount of information cannot be reached by readers because text-based documents are misclassified. Misclassified data can also lead readers to the wrong information. The method proposed in this paper aims to classify documents into the correct group. Each document has a membership value in several different classes. The degree of similarity between two documents is measured by semantic similarity. In fact, no document is entirely unrelated to another, although the relationship may be close to 0. The method calculates the similarity between two documents by taking into account the similarity of their words and of those words' synonyms. After all inter-document similarity values are obtained, a matrix is created and then used as a semi-supervised factor. The output of the method is the membership value of each document in each class; the greatest membership value indicates the group to which the document is assigned. The method achieves a good classification result of 90%. Index Terms - Fuzzy co-clustering, Heuristic, Semantic Similarity, Semi-supervised learning.
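The similarity step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the `SYNONYMS` table, the 1.0/0.5/0.0 word weights, and the best-match averaging in `document_similarity` are all assumptions chosen for illustration, and the fuzzy semi-supervised co-clustering step that consumes the matrix is omitted.

```python
from itertools import combinations

# Hypothetical synonym table; a real system would likely consult a thesaurus.
SYNONYMS = {
    "car": {"automobile"},
    "automobile": {"car"},
    "quick": {"fast"},
    "fast": {"quick"},
}


def word_similarity(a, b):
    """Return 1.0 for identical words, 0.5 for listed synonyms, else 0.0.

    The weights are assumed values, not taken from the paper.
    """
    if a == b:
        return 1.0
    if b in SYNONYMS.get(a, set()):
        return 0.5
    return 0.0


def document_similarity(doc_a, doc_b):
    """Average best-match word similarity, symmetrised over both directions."""
    if not doc_a or not doc_b:
        return 0.0

    def directed(src, dst):
        # For each source word, keep its best match in the other document.
        return sum(max(word_similarity(w, v) for v in dst) for w in src) / len(src)

    return (directed(doc_a, doc_b) + directed(doc_b, doc_a)) / 2


def similarity_matrix(docs):
    """Build the symmetric inter-document similarity matrix."""
    n = len(docs)
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = document_similarity(docs[i], docs[j])
    return matrix


docs = [
    ["quick", "car"],
    ["fast", "automobile"],
    ["green", "apple"],
]
matrix = similarity_matrix(docs)
for row in matrix:
    print(row)
```

In this toy input the first two documents share no literal words but are linked through synonyms, so they receive a non-zero similarity, while the third document is unrelated to both.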

Key phrase Extraction by Improving TextRank with an Integration of Word Embedding and Syntactic Information

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200820155846 ◽

2020 ◽

Vol 13

Author(s):

Sheng Zhang ◽

Qi Luo ◽

Yukun Feng ◽

Ke Ding ◽

Daniela Gifu ◽

...

Keyword(s):

Semantic Information ◽

Performance Enhancement ◽

Word Embedding ◽

The Other ◽

Test Set ◽

Pagerank Algorithm ◽

Phrase Extraction ◽

Extraction Algorithm ◽

Syntactic Information ◽

Key Phrase Extraction

Background: TextRank, a well-known key phrase extraction algorithm, is an analogue of the PageRank algorithm and relies heavily on term-frequency statistics gathered through co-occurrence analysis. Objective: This frequency-based characteristic is a bottleneck for performance enhancement, and various improved TextRank algorithms have been proposed in recent years. Most of these improvements incorporate semantic information into the key phrase extraction algorithm. Method: Taking both syntactic and semantic information into consideration, we integrated a syntactic tree algorithm with word embedding and put forward the Word Embedding and Syntactic Information Algorithm (WESIA), which improved the accuracy of the TextRank algorithm. Results: Applied to a self-made test set and a public test set, the proposed unsupervised key phrase extraction algorithm outperformed the other algorithms to some extent.
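For orientation, the co-occurrence baseline that this abstract sets out to improve can be sketched as follows. This is a minimal plain TextRank over single words, not WESIA: it uses no word embeddings or syntactic trees, and the function names, the window size, the damping factor of 0.85, and the fixed iteration count are assumptions made for the example.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link each word to the words that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Run PageRank-style score propagation on an undirected word graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            # Each neighbour shares its score equally among its own neighbours.
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * rank
        scores = new_scores
    return scores


tokens = "graph ranking scores graph nodes graph edges".split()
graph = build_cooccurrence_graph(tokens, window=2)
scores = textrank(graph)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Because the scores depend only on how often and with what a word co-occurs, a frequent but uninformative word can rank highly, which is the frequency-based limitation the abstract refers to.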

Scalable Cross-lingual Document Similarity through Language-specific Concept Hierarchies

Proceedings of the 10th International Conference on Knowledge Capture - K-CAP '19 ◽

10.1145/3360901.3364444 ◽

2019

Author(s):

Carlos Badenes-Olmedo ◽

José Luis Redondo-García ◽

Oscar Corcho

Keyword(s):

Document Similarity ◽

Specific Concept ◽

Cross Lingual ◽

Concept Hierarchies

Assessing Suitable Word Embedding Model for Malay Language through Intrinsic Evaluation

2020 International Conference on Computational Intelligence (ICCI) ◽

10.1109/icci51257.2020.9247707 ◽

2020

Author(s):

Yeong-Tsann Phua ◽

Kwang-Hooi Yew ◽

Oi-Mean Foong ◽

Matthew Yok-Wooi Teow

A Custom Word Embedding Model for Clustering of Maintenance Records

IEEE Transactions on Industrial Informatics ◽

10.1109/tii.2021.3079521 ◽

2021 ◽

pp. 1-1

Author(s):

Abhijeet Sandeep Bhardwaj ◽

Akash Deep ◽

Dharmaraj Veeramani ◽

Shiyu Zhou
