Improving TextRank Algorithm for Automatic Keyword Extraction with Tolerance Rough Set

Author(s):  
Dong Qiu ◽  
Qin Zheng
2013 ◽  
Vol 2 (4) ◽  
pp. 33-46 ◽  
Author(s):  
P. K. Nizar Banu ◽  
H. Hannah Inbarani

As microarray databases grow in dimension and complexity, identifying the most informative genes becomes a challenging task. The difficulty is often related to the huge number of genes paired with very few samples. Research in medical data mining addresses this problem by applying techniques from data mining and machine learning to microarray datasets. In this paper, Unsupervised Tolerance Rough Set based Quick Reduct (U-TRS-QR), a diverse feature selection algorithm that extends the existing equivalent rough sets for unsupervised learning, is proposed. Genes selected by the proposed method lead to considerably improved class predictions in extensive experiments on two gene expression datasets: Brain Tumor and Colon Cancer. The results indicate consistent improvement across 12 classifiers.


2015 ◽  
Vol 67 ◽  
pp. 130-137 ◽  
Author(s):  
Cenker Sengoz ◽  
Sheela Ramanna

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Ruiteng Yan ◽  
Dong Qiu ◽  
Haihuan Jiang

Sentence similarity calculation is one of the important foundations of natural language processing. Existing sentence similarity measures are based either on shallow semantics, which capture latent semantic information inadequately, or on deep learning algorithms, which require supervision. In this paper, we improve the traditional tolerance rough set model; compared with the traditional model, the improved one has lower time complexity and is incremental. We then propose a sentence similarity computation model, built on the probabilistic tolerance rough set model, that treats the problem from the perspective of uncertainty in text data. The model can mine latent semantic information and is unsupervised. Experiments on the SICK2014 task and the STSbenchmark dataset demonstrate significant and efficient performance of our model in calculating sentence similarity.
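As a rough illustration of how a tolerance rough set can surface latent semantics between sentences, the sketch below uses the classical co-occurrence formulation of the tolerance rough set model, not the probabilistic variant proposed in the paper above. The function names, the threshold `theta`, the toy corpus, and the use of Jaccard overlap as the similarity score are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to its tolerance class.

    Assumption: two terms are tolerant of each other when they co-occur
    in at least `theta` documents; every term is tolerant of itself.
    """
    cooc = Counter()
    for doc in docs:
        for a, b in combinations(sorted(doc), 2):
            cooc[(a, b)] += 1
    classes = {term: {term} for doc in docs for term in doc}
    for (a, b), count in cooc.items():
        if count >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(sentence, classes):
    """Known terms whose tolerance class overlaps the sentence.

    Terms that never appeared in the corpus have no class and are ignored.
    """
    return {term for term, cls in classes.items() if cls & sentence}


def jaccard(a, b):
    """Jaccard overlap of two term sets (illustrative similarity score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy corpus: each document is a set of terms (hypothetical data).
corpus = [
    {"car", "engine"},
    {"car", "engine", "road"},
    {"automobile", "engine"},
    {"automobile", "engine", "road"},
]
classes = tolerance_classes(corpus, theta=2)

s1, s2 = {"car"}, {"automobile"}
u1 = upper_approximation(s1, classes)
u2 = upper_approximation(s2, classes)

raw_similarity = jaccard(s1, s2)      # no shared surface terms
enriched_similarity = jaccard(u1, u2)  # shared context term "engine"
```

In this toy corpus the two sentences share no surface terms, yet their upper approximations both contain "engine", so the enriched score is non-zero. That is the sense in which the upper approximation adds latent context without supervision.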


2017 ◽  
Vol 13 (4) ◽  
pp. 38-55 ◽  
Author(s):  
Han Ke

In this paper, we present a new extreme learning machine (ELM) network structure based on the tolerance rough set. The purpose of this paper is to realize a high-efficiency, multi-dimensional ELM network structure. Various published algorithms have been applied to breast cancer datasets, but the rough set is a fairly new intelligent technique for predicting breast cancer recurrence. We analyze the Ljubljana Breast Cancer Dataset: we first obtain lower and upper approximations and then calculate the accuracy and quality of the classification. The high values of classification quality and accuracy show that the selected attributes approximate the classification well. A rough set approach is established to solve the problem of tolerance.
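The lower and upper approximations and the accuracy and quality measures mentioned in this abstract follow the standard rough set definitions, which can be sketched as below. This is a minimal illustration, not the paper's implementation: the similarity predicate, the toy universe, and all function names are assumptions, and the predicate is assumed to be reflexive and symmetric, as a tolerance relation requires.

```python
def tolerance_class(x, universe, similar):
    """All objects of the universe that are tolerant of (similar to) x."""
    return {y for y in universe if similar(x, y)}


def approximations(target, universe, similar):
    """Return (lower, upper) approximations of `target` under `similar`.

    lower: objects whose whole tolerance class lies inside the target
    upper: objects whose tolerance class overlaps the target
    """
    lower, upper = set(), set()
    for x in universe:
        cls = tolerance_class(x, universe, similar)
        if cls <= target:
            lower.add(x)
        if cls & target:
            upper.add(x)
    return lower, upper


def accuracy(target, universe, similar):
    """Accuracy of approximation: |lower| / |upper|."""
    lower, upper = approximations(target, universe, similar)
    return len(lower) / len(upper) if upper else 1.0


def quality(partition, universe, similar):
    """Quality of classification: share of objects lying in some lower approximation."""
    covered = sum(
        len(approximations(block, universe, similar)[0]) for block in partition
    )
    return covered / len(universe)


# Hypothetical toy data: objects are integers, tolerant when they differ by at most 1.
universe = {1, 2, 3, 4, 5, 6}
similar = lambda a, b: abs(a - b) <= 1
decision_classes = [{1, 2, 3}, {4, 5, 6}]

lower, upper = approximations(decision_classes[0], universe, similar)
acc = accuracy(decision_classes[0], universe, similar)
qual = quality(decision_classes, universe, similar)
```

For the first decision class the lower approximation is {1, 2} and the upper approximation is {1, 2, 3, 4}, giving an accuracy of 0.5. Objects 3 and 4 sit in the boundary region because their tolerance classes straddle both decision classes.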


Author(s):  
Niladri Chatterjee ◽  
Aayush Singha Roy ◽  
Nidhika Yadav

The present work proposes an application of Soft Rough Set and its span for unsupervised keyword extraction. In recent times Soft Rough Sets have been applied in various domains, though none of these applications is in the area of keyword extraction. On the other hand, the concept of Rough Set based span has been developed for improved efficiency in the domain of extractive text summarization. In this work we amalgamate these two techniques into an approach called Soft Rough Set based Span (SRS) to provide an effective solution for keyword extraction from texts. The universe for the Soft Rough Set is taken to be a collection of words from the input texts. SRS provides an ideal platform for identifying the set of keywords in the input text, which cannot always be defined clearly and unambiguously. The proposed technique uses a greedy algorithm for computing spanning sets. The experimental results suggest that keyword extraction with the proposed scheme gives consistent results across different domains. It has also been found to be more efficient than several existing unsupervised techniques.


2020 ◽  
pp. 263-282
Author(s):  
Han Ke

In this paper, we present a new extreme learning machine (ELM) network structure based on the tolerance rough set. The purpose of this paper is to realize a high-efficiency, multi-dimensional ELM network structure. Various published algorithms have been applied to breast cancer datasets, but the rough set is a fairly new intelligent technique for predicting breast cancer recurrence. We analyze the Ljubljana Breast Cancer Dataset: we first obtain lower and upper approximations and then calculate the accuracy and quality of the classification. The high values of classification quality and accuracy show that the selected attributes approximate the classification well. A rough set approach is established to solve the problem of tolerance.

