Text Clustering Algorithm of Co-Occurrence Word Based on Association-Rule Mining

Analysis of text features shows that co-occurring words in a document express its topic information more strongly and more accurately. This paper therefore proposes a text clustering algorithm based on word co-occurrence and association-rule mining. The method uses association-rule mining to extract the word co-occurrences that express the topic information of a document. It then builds a document model and a similarity measure from these co-occurring words, and applies a hierarchical clustering algorithm based on word co-occurrence to cluster the texts. Experimental results show that the proposed method improves the efficiency and accuracy of text clustering compared with other algorithms.
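As a rough illustration of the pipeline this abstract describes, the sketch below mines word pairs that co-occur in a minimum number of documents, represents each document by the frequent pairs it contains, and merges documents agglomeratively. The function names, the Jaccard similarity, the single-linkage merge rule, and the thresholds are all assumptions made for illustration; the abstract does not state the paper's actual similarity measure or parameters.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(docs, min_support=2):
    """Return word pairs that co-occur in at least min_support documents."""
    counts = Counter()
    for words in docs:
        counts.update(combinations(sorted(set(words)), 2))
    return {pair for pair, n in counts.items() if n >= min_support}


def pair_features(docs, pairs):
    """Represent each document by the set of frequent pairs it contains."""
    feats = []
    for words in docs:
        present = set(combinations(sorted(set(words)), 2))
        feats.append(present & pairs)
    return feats


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(feats, threshold=0.3):
    """Single-linkage agglomerative clustering over document indices.

    Repeatedly merges the two most similar clusters and stops when no
    pair of clusters reaches the (hypothetical) similarity threshold.
    """
    clusters = [[i] for i in range(len(feats))]
    while len(clusters) > 1:
        best, best_pair = 0.0, None
        for x, y in combinations(range(len(clusters)), 2):
            sim = max(jaccard(feats[i], feats[j])
                      for i in clusters[x] for j in clusters[y])
            if sim > best:
                best, best_pair = sim, (x, y)
        if best_pair is None or best < threshold:
            break
        x, y = best_pair  # x < y, so index x stays valid after the pop
        clusters[x] += clusters.pop(y)
    return [sorted(c) for c in clusters]


# Toy corpus (made up for this example): two finance and two sports documents.
docs = [
    ["stock", "market", "price"],
    ["stock", "market", "trade"],
    ["football", "match", "goal"],
    ["football", "goal", "league"],
]
pairs = frequent_pairs(docs)
labels = cluster(pair_features(docs, pairs))
```

On this toy corpus only the pairs that appear in two documents survive the support filter, so the finance documents and the sports documents end up in separate clusters.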

Handling WSD using Hierarchical Clustering Algorithm with sentences

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset1841120 ◽

2018 ◽

pp. 83-88

Author(s):

Mohana Priya K ◽

Pooja Ragavi S ◽

Krishna Priya G

Keyword(s):

Hierarchical Clustering ◽

Similarity Measure ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Cosine Similarity Measure ◽

Hierarchical Clustering Algorithm ◽

Multiple Levels ◽

Pos Tagger ◽

Sentence Clustering ◽

The Right

Clustering is the process of grouping objects into subsets that are meaningful in the context of a particular problem. It does not rely on predefined classes, and it is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels to improve accuracy. A POS tagger is used for tagging and the Porter stemmer for stemming. The WordNet dictionary is used to determine similarity through the Jiang-Conrath and cosine similarity measures. Sentences are grouped according to the highest similarity value, subject to a mean threshold. The paper incorporates several parameters for finding the similarity between words. To disambiguate words, sense identification is performed for adjectives and the results are compared. The SemCor and machine learning datasets are employed. Compared with previous results for word sense disambiguation (WSD), this work improves substantially, reaching 91.2%.
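The grouping step mentioned in this abstract (highest similarity value, with a mean threshold) can be sketched as follows. This is a simplified reading, not the authors' implementation: it uses plain bag-of-words cosine similarity only, leaves out the WordNet and Jiang-Conrath components, and assumes that "mean threshold" means the mean of all pairwise similarities. The function names and sample sentences are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def group_sentences(sentences):
    """Attach each sentence to its most similar earlier sentence.

    A sentence joins that sentence's group only if the similarity meets
    the mean pairwise similarity (the assumed "mean threshold").
    """
    if len(sentences) < 2:
        return list(range(len(sentences))), 0.0
    vecs = [Counter(s.lower().split()) for s in sentences]
    sims = {(i, j): cosine(vecs[i], vecs[j])
            for i, j in combinations(range(len(vecs)), 2)}
    threshold = sum(sims.values()) / len(sims)
    group_of = list(range(len(vecs)))  # each sentence starts in its own group
    for j in range(1, len(vecs)):
        i = max(range(j), key=lambda k: sims[(k, j)])  # best earlier match
        if sims[(i, j)] >= threshold:
            group_of[j] = group_of[i]
    return group_of, threshold


sentences = [
    "bank loan money",
    "bank loan interest",
    "river water fish",
    "river water boat",
]
groups, threshold = group_sentences(sentences)
```

Here `groups` holds one group label per sentence; sentences that share a label were linked through a similarity at or above the mean threshold.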

Understanding Causes of Low Voltage (LV) Faults in Electricity Distribution Network Using Association Rule Mining and Text Clustering

2019 IEEE International Conference on Environment and Electrical Engineering and 2019 IEEE Industrial and Commercial Power Systems Europe (EEEIC / I&CPS Europe) ◽

10.1109/eeeic.2019.8783949 ◽

2019 ◽

Cited By ~ 2

Author(s):

Charith Silva ◽

Mohamad Saraee

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Low Voltage ◽

Distribution Network ◽

Text Clustering ◽

Electricity Distribution ◽

Rule Mining ◽

Electricity Distribution Network

Reduction of Redundant Rules in Association Rule Mining-Based Bug Assignment

International Journal of Reliability Quality and Safety Engineering ◽

10.1142/s0218539317400058 ◽

2017 ◽

Vol 24 (06) ◽

pp. 1740005 ◽

Cited By ~ 3

Author(s):

Meera Sharma ◽

Abhishek Tandon ◽

Madhu Kumari ◽

V. B. Singh

Keyword(s):

Operating System ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Clustering Algorithm ◽

Large Data ◽

Software Project ◽

Rule Mining ◽

Data Set ◽

Bug Reports

Bug triaging is the process of deciding what to do with incoming bug reports. In this paper, we mine association rules to predict the assignee of a newly reported bug using four bug attributes: severity, priority, component and operating system. To cope with large data sets, we divide the data into subsets using the k-means clustering algorithm. We use an Apriori algorithm in MATLAB to generate the association rules, and extract the rules for the top 5 assignees in each cluster. The proposed method has been empirically validated on 14,696 bug reports from the Mozilla open source software projects Seamonkey, Firefox and Bugzilla. With these attributes (severity, priority, component and operating system) as antecedents, we observe that essential rules outnumber redundant rules, whereas in [M. Sharma and V. B. Singh, Clustering-based association rule mining for bug assignee prediction, Int. J. Business Intell. Data Mining 11(2) (2017) 130–150.] essential rules are fewer than redundant rules in every cluster. The proposed method improves on existing techniques for the bug assignment problem.
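To make the rule form concrete, the sketch below computes support and confidence for rules whose antecedent is the four attributes named in the abstract and whose consequent is the assignee. It is a deliberately simplified stand-in, not the paper's MATLAB Apriori implementation: it only considers the full four-attribute antecedent, skips the clustering step, and uses made-up reports, assignee names, and thresholds.

```python
from collections import Counter

# Antecedent attributes named in the abstract.
FIELDS = ("severity", "priority", "component", "os")


def mine_rules(reports, min_support=0.2, min_confidence=0.6):
    """Return (antecedent, assignee, support, confidence) tuples.

    support    = share of all reports matching antecedent and assignee
    confidence = share of reports with that antecedent that went to
                 that assignee
    """
    n = len(reports)
    antecedent_counts = Counter()
    rule_counts = Counter()
    for r in reports:
        antecedent = tuple(r[f] for f in FIELDS)
        antecedent_counts[antecedent] += 1
        rule_counts[(antecedent, r["assignee"])] += 1
    rules = []
    for (antecedent, assignee), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, assignee, support, confidence))
    return rules


def report(severity, priority, component, os, assignee):
    """Build one hypothetical bug report record."""
    return {"severity": severity, "priority": priority,
            "component": component, "os": os, "assignee": assignee}


# Made-up sample data, not taken from the Mozilla data set.
reports = [
    report("major", "P1", "UI", "Windows", "alice"),
    report("major", "P1", "UI", "Windows", "alice"),
    report("major", "P1", "UI", "Windows", "bob"),
    report("minor", "P3", "Build", "Linux", "carol"),
    report("minor", "P3", "Build", "Linux", "carol"),
]
rules = mine_rules(reports)
```

With these sample reports, the rule pointing to "bob" is dropped because only one of the three matching reports was assigned to him, which falls below the confidence threshold.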

Generating Road Accident Prediction Set with Road Accident Data Analysis Using Enhanced Expectation-Maximization Clustering Algorithm and Improved Association Rule Mining

Journal Européen des Systèmes Automatisés ◽

10.18280/jesa.520108 ◽

2019 ◽

Vol 52 (1) ◽

pp. 57-63

Author(s):

Sakham Babu ◽

Jebamalar Tamilselvi

Keyword(s):

Data Analysis ◽

Expectation Maximization ◽

Association Rule ◽

Association Rule Mining ◽

Clustering Algorithm ◽

Road Accident ◽

Rule Mining ◽

Accident Prediction ◽

Accident Data

A Quantitative Association Rule Mining Algorithm Based on Clustering Algorithm

2006 IEEE International Conference on Systems, Man and Cybernetics ◽

10.1109/icsmc.2006.385264 ◽

2006 ◽

Cited By ~ 3

Author(s):

Toshihiko Watanabe ◽

Hirokazu Takahashi

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Clustering Algorithm ◽

Rule Mining ◽

Mining Algorithm ◽

Quantitative Association Rule

A Novel Market Basket Analysis Using Adaptive Association Rule Mining Algorithm

International Journal of Scientific Research ◽

10.15373/22778179/sep2012/9 ◽

2012 ◽

Vol 1 (4) ◽

pp. 25-28

Author(s):

M. Dhanabhakyam ◽

Dr. M. Punithavalli

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Market Basket Analysis ◽

Rule Mining ◽

Market Basket ◽

Mining Algorithm

Study of Various Parallel Implementations of Association Rule Mining Algorithm

American Journal Of Advanced Computing ◽

10.15864/ajac.v2i1.94 ◽

2015 ◽

Vol 2 (1) ◽

Author(s):

Sarbani Dasgupta

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Rule Mining ◽

Mining Algorithm ◽

Parallel Implementations

Prediksi Code Defect Perangkat Lunak Dengan Metode Association Rule Mining dan Cumulative Support Thresholds [Software Code Defect Prediction Using Association Rule Mining and Cumulative Support Thresholds]

Jurnal Buana Informatika ◽

10.24002/jbi.v6i2.408 ◽

2015 ◽

Vol 6 (2) ◽

Author(s):

Rizal Setya Perdana ◽

Umi Laili Yuhana

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Rule Mining ◽

Program Code

Software quality is a research area within software engineering that plays a substantial role in building good-quality software systems. Software defect prediction, where a defect is a deviation from the specification or anything that may cause an operational failure, has been a research topic for more than 30 years. This paper focuses on predicting defects that occur in program code (code defects). The approach predicts defects by exploiting software code patterns in the NASA data sets that are likely to cause defects. The pattern search uses Association Rule Mining with Cumulative Support Thresholds, which automatically produces the optimal support and confidence values without requiring user input. In testing, the automatic prediction of software code defects achieved an accuracy of 82.35%.