A UMDA-Based Discretization Method for Continuous Attributes

2011 ◽  
Vol 403-408 ◽  
pp. 1834-1838
Author(s):  
Jing Zhao ◽  
Chong Zhao Han ◽  
Bin Wei ◽  
De Qiang Han

Discretization of continuous attributes has played an important role in machine learning and data mining. It can not only improve the performance of the classifier but also reduce storage space. The Univariate Marginal Distribution Algorithm (UMDA) is a modified evolutionary algorithm that has some advantages over classical evolutionary algorithms, such as fast convergence and few parameters to tune. In this paper, we propose a bottom-up, global, dynamic, and supervised discretization method based on the Univariate Marginal Distribution Algorithm. The experimental results show that the proposed method can effectively improve classifier accuracy.
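As a rough illustration of the general UMDA loop mentioned in the abstract above (sample from independent per-variable probabilities, select the fittest, re-estimate the marginals), here is a minimal sketch on the toy OneMax problem. It is not the authors' discretization method: the bit-string encoding, the fitness function, the truncation selection, the probability clamping, and all parameter values are assumptions chosen for illustration.

```python
import random

def umda(fitness, n_bits, pop_size=100, n_select=50, generations=50, seed=0):
    """Generic binary UMDA sketch (illustrative, not the paper's method)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits            # one marginal probability per bit
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population from the current univariate marginals.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Truncation selection: keep the n_select fittest individuals.
        population.sort(key=fitness, reverse=True)
        top_fit = fitness(population[0])
        if top_fit > best_fit:
            best, best_fit = population[0], top_fit
        selected = population[:n_select]
        # Re-estimate each marginal as the frequency of 1s among the selected.
        probs = [sum(ind[i] for ind in selected) / n_select for i in range(n_bits)]
        # Clamp (an assumption) so that no bit becomes permanently fixed.
        probs = [min(max(p, 0.05), 0.95) for p in probs]
    return best, best_fit

# OneMax: fitness is simply the number of 1 bits.
best, best_fit = umda(sum, n_bits=20)
print(best, best_fit)
```

In a discretization setting, each bit could hypothetically mark whether a candidate cut point is kept, with classifier accuracy as the fitness, but that mapping is an assumption and is not stated in the abstract.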

2021 ◽  
Vol 8 (10) ◽  
pp. 43-50
Author(s):  
Truong et al.

Clustering is a fundamental technique in data mining and machine learning. Recently, many researchers have become interested in the problem of clustering categorical data, and several new approaches have been proposed. One of the successful and pioneering clustering algorithms is the Minimum-Minimum Roughness algorithm (MMR), a top-down hierarchical clustering algorithm that can handle the uncertainty in clustering categorical data. However, MMR tends to choose the category with less value leaf node with more objects, leading to undesirable clustering results. To overcome such shortcomings, this paper proposes an improved version of the MMR algorithm for clustering categorical data, called IMMR (Improved Minimum-Minimum Roughness). Experimental results on real data sets taken from UCI show that the IMMR algorithm outperforms MMR in clustering categorical data.


2018 ◽  
Vol 8 (12) ◽  
pp. 2417
Author(s):  
Zhenyu Guo ◽  
Yujuan Sun ◽  
Muwei Jian ◽  
Xiaofeng Zhang

A deep neural network is difficult to train because of its large number of unknown parameters. To improve trainability, we present a moderate-depth residual network for the restoration of motion-blurred and noisy images. The proposed network, called FbResNet, has only 10 layers, with sparse feedback connections added in the middle and last layers. FbResNet converges quickly and denoises effectively. In addition, it can reduce the artificial mosaic traces at the seams of patches and produce visually pleasing output from blurred or noisy images. Experimental results show the effectiveness of our designed model and method.


2019 ◽  
Vol 8 (4) ◽  
pp. 6036-6040

Data mining is a vital area of research and is applied in many different domains. It has become a highly demanding field because huge amounts of data have been collected in various applications. A database can be clustered in many ways, depending on the clustering algorithm used, the parameter settings, and other factors. Multiple clustering algorithms can be combined to obtain a final partitioning of the data that provides better clustering results. In this paper, an ensemble hybrid KMeans and DBSCAN (HDKA) algorithm is proposed to overcome the drawbacks of the DBSCAN and KMeans clustering algorithms. The proposed algorithm improves the selection of centroid points through its centroid selection strategy. For the experiments, we used two datasets, Colon and Leukemia, from the UCI Machine Learning Repository.


2018 ◽  
Vol 7 (3) ◽  
pp. 1136
Author(s):  
V Devasekhar ◽  
P Natarajan

Data mining is the extraction of important knowledge from various databases using different kinds of approaches. In multi-agent distributed mining, knowledge aggregation is one of the challenging tasks. This paper tries to optimize the aggregation problem and arrives at a solution derived from the machine learning statistical features of each agent. A novel optimization algorithm called Multi-Agent Based Data Mining Aggregation (MABDA) is used for present-day scenarios. The MABDA algorithm has agents that collect extracted knowledge and summarize the various levels of the agents’ cluster data into an aggregation with maximum accuracy. To demonstrate the effectiveness of the proposed algorithm, the experimental results are compared with existing methods.


Author(s):  
Kamel Ahsene Djaballah ◽  
Kamel Boukhalfa ◽  
Omar Boussaid ◽  
Yassine Ramdane

Social networks are used by terrorist groups and people who support them to propagate their ideas, ideologies, or doctrines and share their views on terrorism. To analyze tweets related to terrorism, several studies have been proposed in the literature. Some works rely on data mining algorithms; others use lexicon-based or machine learning sentiment analysis. Some recent works adopt other methods that combine multiple techniques. This paper proposes an improved approach for sentiment analysis of radical content related to terrorist activity on Twitter. Unlike other solutions, the proposed approach focuses on using a dictionary of weighted terms, the Word2vec method, and trigrams, with a classification based on fuzzy logic. The authors have conducted experiments with 600 manually annotated tweets and 200,000 automatically collected tweets in English and Arabic to evaluate this approach. The experimental results revealed that the new technique achieves between 75% and 78% precision for radicality detection and between 61% and 64% for detecting radicality degrees.


2019 ◽  
Vol 27 (1) ◽  
pp. 368-379
Author(s):  
Asghar Darvishy ◽  
Hamidah Ibrahim ◽  
Fatimah Sidi ◽  
Aida Mustapha

Clustering is one of the main tasks in machine learning and data mining and is used in many applications, including news recommendation systems. In this paper, we propose a new non-exclusive clustering algorithm named Ordered Clustering (OC) with the aim of increasing the accuracy of news recommendation for online users. The basis of OC is a new initialization technique that groups news items into clusters based on the highest similarities between news items, to accommodate the nature of news, in which a news item can belong to different categories. Hence, in OC, multiple memberships in clusters are allowed. An experiment is carried out using a real dataset collected from news websites. The experimental results demonstrate that OC outperforms the k-means algorithm with respect to Precision, Recall, and F1-Score.
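For readers unfamiliar with the three metrics named in the abstract above, the following sketch computes Precision, Recall, and F1-Score for a single recommendation list. The function name, the set-based formulation, and the toy item identifiers are assumptions for illustration; the abstract does not say how the authors computed or averaged these scores.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall, and F1 for one recommendation list (illustrative)."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items that are relevant
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy example: 4 recommended news items, 3 relevant ones, 2 in common.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
print(p, r, f1)
```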


Author(s):  
Shawni Dutta ◽  
Samir Kumar Bandyopadhyay

Campus placement is a measure of students’ performance in a course. This paper proposes a forecasting method to predict the likely campus placement of students at any institution. Data mining and knowledge discovery processes are applied to the academic careers of students. Classifiers based on supervised machine learning are used for this purpose. The method uses an ensemble-based voting classifier to choose the best classifier models and achieve better results than other classifiers. Experimental results indicate 86.05% accuracy for the ensemble-based approach, which is significantly better than the other classifiers.
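The general idea of a voting ensemble can be sketched as hard (majority) voting over several base classifiers. The example below is only an illustration: the three threshold rules, the feature names (`gpa`, `internships`, `test_score`), and the class labels are hypothetical and do not come from the paper, which does not describe its base models or features in the abstract.

```python
from collections import Counter

def majority_vote(classifiers, sample):
    """Return the label predicted by most classifiers (hard voting)."""
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical base classifiers: simple threshold rules on made-up features.
classifiers = [
    lambda s: "placed" if s["gpa"] >= 7.0 else "not placed",
    lambda s: "placed" if s["internships"] >= 1 else "not placed",
    lambda s: "placed" if s["test_score"] >= 60 else "not placed",
]

student = {"gpa": 8.1, "internships": 0, "test_score": 72}
print(majority_vote(classifiers, student))  # two of three rules vote "placed"
```

Using an odd number of base classifiers avoids ties in the two-class case.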


2020 ◽  
Author(s):  
Mohammed J. Zaki ◽  
Wagner Meira, Jr
