An Idea of a Rough Set Theory Based Document Classification System

Author(s):  
Masaki KUREMATSU

Before filing a patent for an idea or innovation, we should check whether the claims of any existing patent documents conflict with it. This check requires considerable resources because the number of existing patent documents is large. Patent documents can now be submitted by computer, and their number is increasing quickly, so people need a system to support this task. To meet this demand, I propose in this paper a framework for a system that classifies patent documents. The system uses machine translation to deal with synonyms and Rough Set Theory to classify patent documents. First, it extracts decision rules by Rough Set Theory from labeled patent documents translated by machine translation. Then, it classifies unlabeled patent documents by estimating their labels from the weights of the matched rules. In this approach, the satisfactory index (SI), the coverage index (CI), and the Lift value are used as rule weights, and they are compared with the maximum number, the total number, and the Mahalanobis distance. I evaluated the idea by classifying Japanese patent documents with a prototype system. In the evaluation, the accuracy was about 0.40, which has not reached a practical level. Therefore, I will apply this approach to other document classification tasks and improve it based on an analysis of the results.
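The classification step described above can be illustrated with a minimal sketch. It is not the author's prototype: the abstract does not define SI, CI, or Lift, so the code assumes the definitions commonly used for rough set rules (SI as the fraction of rule-matching documents that carry the rule's label, CI as the fraction of documents with that label that the rule matches, Lift as SI divided by the label's prior). The rules are given by hand rather than extracted by Rough Set Theory, summing the weights of matched rules per label is an assumed aggregation, and the documents, terms, and labels `A` and `B` are invented for illustration.

```python
from collections import defaultdict

# Toy labeled documents: (set of terms, label). All values are hypothetical.
DOCS = [
    ({"battery", "electrode"}, "A"),
    ({"battery", "charger"}, "B"),
    ({"battery", "electrode", "lithium"}, "A"),
    ({"charger", "circuit"}, "B"),
]

# Hand-written decision rules: (condition terms, label). In the described
# system these would come from Rough Set Theory; here they are assumed.
RULES = [
    (frozenset({"battery"}), "A"),
    (frozenset({"electrode"}), "A"),
    (frozenset({"charger"}), "B"),
]


def rule_weights(condition, label, docs):
    """Return (SI, CI, Lift) for the rule 'condition -> label' (assumed definitions)."""
    matched_labels = [lab for terms, lab in docs if condition <= terms]
    n_label = sum(1 for _, lab in docs if lab == label)
    n_both = sum(1 for lab in matched_labels if lab == label)
    si = n_both / len(matched_labels) if matched_labels else 0.0
    ci = n_both / n_label if n_label else 0.0
    prior = n_label / len(docs) if docs else 0.0
    lift = si / prior if prior else 0.0
    return si, ci, lift


def classify(terms, rules, docs, weight="si"):
    """Estimate a label by summing the chosen weight over all matched rules."""
    index = {"si": 0, "ci": 1, "lift": 2}[weight]
    scores = defaultdict(float)
    for condition, label in rules:
        if condition <= terms:  # the rule fires when all its terms are present
            scores[label] += rule_weights(condition, label, docs)[index]
    # No matched rule means no estimate.
    return max(scores, key=scores.get) if scores else None


if __name__ == "__main__":
    print(rule_weights(frozenset({"battery"}), "A", DOCS))
    print(classify({"battery", "charger"}, RULES, DOCS, weight="si"))
```

Switching the `weight` argument between `"si"`, `"ci"`, and `"lift"` shows how the choice of rule weight can change the estimated label, which is the kind of comparison the abstract describes.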


Author(s):
S. Arjun Raj, M. Vigneshwaran

In this article, we use rough set theory to generate the set of decision concepts in order to solve a medical problem. Based on data officially published by the International Diabetes Federation (IDF), rough sets have been used to diagnose diabetes. The lower and upper approximations of the decision concepts and their boundary regions are formulated here.
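As a rough illustration of the lower and upper approximations and the boundary region mentioned above, the following sketch computes them for a toy decision table. The patients, attribute names, and values are invented for the example and are not IDF data; the function name `approximations` is likewise hypothetical.

```python
from collections import defaultdict

# Toy information table: object -> condition attribute values (hypothetical).
PATIENTS = {
    "p1": {"high_glucose": "yes", "overweight": "yes"},
    "p2": {"high_glucose": "yes", "overweight": "yes"},
    "p3": {"high_glucose": "yes", "overweight": "no"},
    "p4": {"high_glucose": "no", "overweight": "no"},
    "p5": {"high_glucose": "no", "overweight": "yes"},
}

# Decision concept: the set of objects labeled diabetic (hypothetical).
DIABETIC = {"p1", "p3"}


def approximations(objects, attributes, target):
    """Return (lower, upper, boundary) of `target` with respect to `attributes`."""
    # Group objects that are indiscernible on the chosen attributes.
    classes = defaultdict(set)
    for name, values in objects.items():
        classes[tuple(values[a] for a in attributes)].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the concept
            lower |= block
        if block & target:    # class overlaps the concept
            upper |= block
    return lower, upper, upper - lower


if __name__ == "__main__":
    lower, upper, boundary = approximations(
        PATIENTS, ["high_glucose", "overweight"], DIABETIC
    )
    print(sorted(lower), sorted(upper), sorted(boundary))
```

Here `p1` and `p2` share the same attribute values but differ in diagnosis, so they fall into the boundary region, while `p3` is the only object that certainly belongs to the concept.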

