A UMDA-Based Discretization Method for Continuous Attributes

Discretization of continuous attributes has played an important role in machine learning and data mining. It can not only improve the performance of the classifier but also reduce storage space. The Univariate Marginal Distribution Algorithm (UMDA) is a modified evolutionary algorithm that has some advantages over classical evolutionary algorithms, such as fast convergence and few parameters to tune. In this paper, we propose a bottom-up, global, dynamic, and supervised discretization method based on the Univariate Marginal Distribution Algorithm. The experimental results show that the proposed method can effectively improve classifier accuracy.
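
As a rough illustration of the general UMDA loop mentioned above (not the paper's discretization method), the following minimal sketch samples bit strings from independent per-bit probabilities, keeps the best ones, and re-estimates the probabilities from them. The function name `umda`, its parameters, the clamping bounds, and the toy objective are all assumptions made for this example; in a discretization setting, each bit could hypothetically mark whether a candidate cut point is kept.

```python
import random

def umda(fitness, n_bits, pop_size=100, n_select=50, generations=30, seed=0):
    """Minimal UMDA over bit strings; returns (best individual, its fitness)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # independent marginal P(bit_i = 1)
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population from the product of the marginals.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) > best_fit:
            best, best_fit = pop[0], fitness(pop[0])
        selected = pop[:n_select]  # truncation selection
        # Re-estimate each marginal from the selected individuals,
        # clamped away from 0 and 1 so no bit becomes permanently fixed
        # (the bounds 0.05 and 0.95 are an arbitrary choice for this sketch).
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            probs[i] = min(max(freq, 0.05), 0.95)
    return best, best_fit

# Toy objective (OneMax): the fitness of a bit string is its number of ones.
best, score = umda(sum, n_bits=20)
```

The only model UMDA maintains is the vector of marginals, which is why it has few parameters to tune compared with an evolutionary algorithm that uses crossover and mutation operators.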

Improved minimum-minimum roughness algorithm for clustering categorical data

International Journal of ADVANCED AND APPLIED SCIENCES ◽

10.21833/ijaas.2021.10.006 ◽

2021 ◽

Vol 8 (10) ◽

pp. 43-50

Author(s):

Truong et al. ◽

Keyword(s):

Machine Learning ◽

Data Mining ◽

Hierarchical Clustering ◽

Categorical Data ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Experimental Results ◽

Data Sets ◽

Top Down ◽

Hierarchical Clustering Algorithm

Clustering is a fundamental technique in data mining and machine learning. Recently, many researchers have become interested in the problem of clustering categorical data, and several new approaches have been proposed. One of the successful and pioneering clustering algorithms is the Minimum-Minimum Roughness (MMR) algorithm, a top-down hierarchical clustering algorithm that can handle the uncertainty in clustering categorical data. However, MMR tends to choose the category with less value leaf node with more objects, leading to undesirable clustering results. To overcome such shortcomings, this paper proposes an improved version of the MMR algorithm for clustering categorical data, called IMMR (Improved Minimum-Minimum Roughness). Experimental results on real data sets taken from UCI show that the IMMR algorithm outperforms MMR in clustering categorical data.
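
The roughness that MMR is named after comes from rough set theory. The sketch below shows only the standard definition, roughness = 1 - |lower approximation| / |upper approximation|, for a set of objects with respect to one categorical attribute; it is not the MMR or IMMR splitting procedure. The function name `roughness` and the toy data are assumptions made for this example.

```python
from collections import defaultdict

def roughness(objects, attr, target):
    """Roughness of the set `target` (object ids) with respect to `attr`."""
    # Group objects into equivalence classes by their value on `attr`.
    classes = defaultdict(set)
    for obj_id, row in objects.items():
        classes[row[attr]].add(obj_id)
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies entirely inside the target
            lower |= members
        if members & target:    # class overlaps the target
            upper |= members
    return 1 - len(lower) / len(upper) if upper else 0.0

# Hypothetical categorical data set with four objects.
data = {
    1: {"color": "red", "size": "small"},
    2: {"color": "red", "size": "large"},
    3: {"color": "blue", "size": "large"},
    4: {"color": "blue", "size": "large"},
}
target = {1, 2, 3}
r_color = roughness(data, "color", target)  # 0.5
r_size = roughness(data, "size", target)    # 0.75
```

A roughness of 0 means the attribute describes the set exactly; values closer to 1 mean the attribute leaves more uncertainty about which objects belong to it.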

Deep Residual Network with Sparse Feedback for Image Restoration

Applied Sciences ◽

10.3390/app8122417 ◽

2018 ◽

Vol 8 (12) ◽

pp. 2417 ◽

Cited By ~ 6

Author(s):

Zhenyu Guo ◽

Yujuan Sun ◽

Muwei Jian ◽

Xiaofeng Zhang

Keyword(s):

Neural Network ◽

Image Restoration ◽

Deep Neural Network ◽

Convergence Speed ◽

Fast Convergence ◽

Experimental Results ◽

Residual Network ◽

Unknown Parameters ◽

Noisy Images ◽

Blurred Images

A deep neural network is difficult to train because of its large number of unknown parameters. To improve trainability, we present a moderate-depth residual network for restoring motion-blurred and noisy images. The proposed network, called FbResNet, has only 10 layers, with sparse feedback added in the middle and last layers. FbResNet converges quickly and denoises effectively. In addition, it can reduce artificial mosaic traces at the seams of patches and produce visually pleasing results from blurred or noisy images. Experimental results show the effectiveness of our designed model and method.

Ensemble Hybrid K-Means and DBSCAN Clustering Algorithm – HDKA for Cancer Dataset

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d8257.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 6036-6040

Keyword(s):

Machine Learning ◽

Data Mining ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Experimental Results ◽

Cancer Dataset ◽

Dbscan Clustering ◽

Selection Of

Data mining is a vital area of research and is applied in many different domains. It has become a highly demanding field because huge amounts of data have been collected in various applications. A database can be clustered in many ways, depending on the clustering algorithm used, the parameter settings, and other factors. Multiple clustering algorithms can be combined to obtain a final partitioning of the data that provides better clustering results. In this paper, an ensemble hybrid KMeans and DBSCAN (HDKA) algorithm is proposed to overcome the drawbacks of the DBSCAN and KMeans clustering algorithms. The proposed algorithm improves the selection of centroid points through a centroid selection strategy. For the experimental results, we used two datasets, Colon and Leukemia, from the UCI Machine Learning Repository.

Multi-agent based data mining aggregation approaches using machine learning techniques

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.9631 ◽

2018 ◽

Vol 7 (3) ◽

pp. 1136

Author(s):

V Devasekhar ◽

P Natarajan

Keyword(s):

Machine Learning ◽

Data Mining ◽

Optimization Algorithm ◽

Experimental Results ◽

Machine Learning Techniques ◽

Statistical Features ◽

Agent Based ◽

Cluster Data ◽

Learning Techniques ◽

Multi Agent

Data mining is the extraction of important knowledge from various databases using different kinds of approaches. In multi-agent distributed mining, knowledge aggregation is one of the challenging tasks. This paper tries to optimize the aggregation problem and reduces it to a solution derived from the machine learning statistical features of each agent. A novel optimization algorithm called Multi-Agent Based Data Mining Aggregation (MABDA) is used for present-day scenarios. The MABDA algorithm has agents that collect extracted knowledge and summarize the various levels of the agents’ cluster data into an aggregation with maximum accuracy. To prove the effectiveness of the proposed algorithm, the experimental results are compared with those of related existing methods.

An Improved Sentiment Analysis Approach to Detect Radical Content on Twitter

International Journal of Information Technology and Web Engineering ◽

10.4018/ijitwe.2021100103 ◽

2021 ◽

Vol 16 (4) ◽

pp. 52-73

Author(s):

Kamel Ahsene Djaballah ◽

Kamel Boukhalfa ◽

Omar Boussaid ◽

Yassine Ramdane

Keyword(s):

Machine Learning ◽

Data Mining ◽

Social Networks ◽

Fuzzy Logic ◽

Sentiment Analysis ◽

Experimental Results ◽

Data Mining Algorithms ◽

Terrorist Groups ◽

Radical Content ◽

Mining Algorithms

Social networks are used by terrorist groups and people who support them to propagate their ideas, ideologies, or doctrines and share their views on terrorism. Several studies in the literature have proposed ways to analyze tweets related to terrorism. Some works rely on data mining algorithms; others use lexicon-based or machine learning sentiment analysis. Some recent works adopt methods that combine multiple techniques. This paper proposes an improved approach for sentiment analysis of radical content related to terrorist activity on Twitter. Unlike other solutions, the proposed approach focuses on using a dictionary of weighted terms, the Word2vec method, and trigrams, with a classification based on fuzzy logic. The authors conducted experiments with 600 manually annotated tweets and 200,000 automatically collected tweets in English and Arabic to evaluate this approach. The experimental results revealed that the new technique provides between 75% and 78% precision for radicality detection and between 61% and 64% for detecting radicality degrees.

A Customized Non-Exclusive Clustering Algorithm for News Recommendation Systems

Journal of University of Babylon for Pure and Applied Sciences ◽

10.29196/jubpas.v27i1.2192 ◽

2019 ◽

Vol 27 (1) ◽

pp. 368-379 ◽

Cited By ~ 1

Author(s):

Asghar Darvishy ◽

Hamidah Ibrahim ◽

Fatimah Sidi ◽

Aida Mustapha

Keyword(s):

Machine Learning ◽

Data Mining ◽

Clustering Algorithm ◽

Recommendation Systems ◽

Experimental Results ◽

News Item ◽

Real Dataset ◽

News Websites ◽

News Recommendation

Clustering is one of the main tasks in machine learning and data mining and is used in many applications, including news recommendation systems. In this paper, we propose a new non-exclusive clustering algorithm named Ordered Clustering (OC) with the aim of increasing the accuracy of news recommendation for online users. The basis of OC is a new initialization technique that groups news items into clusters based on the highest similarities between news items, to accommodate the nature of news, in which a news item can belong to different categories. Hence, in OC, multiple memberships in clusters are allowed. An experiment is carried out using a real dataset collected from news websites. The experimental results demonstrate that OC outperforms the k-means algorithm with respect to Precision, Recall, and F1-Score.
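
For readers unfamiliar with the three metrics named above, here is a minimal sketch of how Precision, Recall, and F1-Score can be computed for one list of recommendations. The function name `precision_recall_f1` and the news item identifiers are hypothetical; the abstract does not say how the authors computed or averaged their scores.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall, and F1 for one set of recommended items."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items that are relevant
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical example: four recommended news items, three relevant ones.
p, r, f1 = precision_recall_f1(["n1", "n2", "n3", "n4"], ["n2", "n4", "n5"])
# Two of the four recommendations are relevant, so p = 2/4, r = 2/3, f1 = 4/7.
```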

Forecasting of Campus Placement for Students Using Ensemble Voting Classifier

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2020/v5i430138 ◽

2020 ◽

pp. 1-12

Author(s):

Shawni Dutta ◽

Samir Kumar Bandyopadhyay

Keyword(s):

Machine Learning ◽

Data Mining ◽

Knowledge Discovery ◽

Academic Career ◽

Experimental Results ◽

Supervised Machine Learning ◽

Machine Learning Technique ◽

Ensemble Approach ◽

Forecasting Method ◽

Learning Technique

Campus placement is a measure of students’ performance in a course. A forecasting method is proposed in this paper to predict possible campus placement at any institution. Data mining and knowledge discovery processes are applied to students’ academic careers. Classifiers based on supervised machine learning techniques are used for this process. The method uses an ensemble-based voting classifier to choose the best classifier models and achieve better results than other classifiers. Experimental results indicate 86.05% accuracy for the ensemble-based approach, which is significantly better than that of the other classifiers.
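
The idea of a voting ensemble can be sketched as hard (majority) voting over the predictions of several base classifiers. The example below is an illustration only: the three rule-based classifiers, the feature names `grade`, `internships`, and `backlogs`, and the thresholds are invented for this sketch and are not taken from the paper, which does not describe its base models here.

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Return the label predicted by the most base classifiers for input x."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical base classifiers, each a simple rule on one feature.
def by_grade(s):
    return "placed" if s["grade"] >= 7.0 else "not placed"

def by_internship(s):
    return "placed" if s["internships"] >= 1 else "not placed"

def by_backlogs(s):
    return "placed" if s["backlogs"] == 0 else "not placed"

classifiers = [by_grade, by_internship, by_backlogs]
student = {"grade": 7.5, "internships": 0, "backlogs": 0}
prediction = majority_vote(classifiers, student)  # two of three vote "placed"
```

Using an odd number of base classifiers on a two-class problem avoids ties, which is why three are used in this sketch.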
