A Futuristic Framework for Financial Credit Score Prediction System using PSO based Feature Selection with Random Tree Data Classification Model

Optimal Deep Learning based Data Classification Model for Type-2 Diabetes Mellitus Diagnosis and Prediction System

International Journal of Innovative Technology and Exploring Engineering - Special Issue, 2020, Vol 9 (3), pp. 1596-1604

DOI: 10.35940/ijitee.c8656.019320

Keyword(s): Deep Learning, Data Classification, Research Area, Classification Model, Prediction System, Pima Indians, Significant Research, Diabetes Mellitus Diagnosis, Simulation Results

Deep learning models have recently become a significant research area because of their applicability in diverse domains. In this paper, we employ an optimal deep neural network (DNN) based model for classifying diabetes. The DNN is employed to diagnose patients' diseases effectively. To further improve classifier efficiency, a multilayer perceptron (MLP) is used to remove the misclassified instances from the dataset, and the processed data is then provided as input to the DNN based classification model. The presented model is evaluated on the PIMA Indians Diabetes dataset, which holds the medical details of 768 patients with 8 attributes per record. The simulation results verified the superiority of the presented model over the compared methods.
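
The filtering step described above (train a model, drop the training instances it misclassifies, then pass the cleaned data to the final classifier) can be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid classifier stands in for the MLP filter, the toy data is invented, and all function names are hypothetical rather than taken from the paper.

```python
from collections import defaultdict

def centroids(rows, labels):
    """Mean feature vector per class."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(rows, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(cents, x):
    """Label of the closest centroid (squared Euclidean distance)."""
    return min(cents, key=lambda y: sum((a - b) ** 2 for a, b in zip(cents[y], x)))

def remove_misclassified(rows, labels):
    """Keep only the instances that the filter model labels correctly."""
    cents = centroids(rows, labels)  # stand-in for training the MLP filter
    kept = [(x, y) for x, y in zip(rows, labels) if predict(cents, x) == y]
    return [x for x, _ in kept], [y for _, y in kept]

# Invented toy data: two clusters plus one instance labelled against its cluster.
rows = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
        [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],
        [1.1, 1.0]]
labels = [0, 0, 0, 1, 1, 1, 1]

clean_rows, clean_labels = remove_misclassified(rows, labels)
print(len(rows), "->", len(clean_rows), "instances after filtering")
```

The cleaned `clean_rows` and `clean_labels` would then be used to train the final classifier (the DNN in the paper).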

MapReduce-based big data classification model using feature subset selection and hyperparameter tuned deep belief network

Scientific Reports, 2021, Vol 11 (1)

DOI: 10.1038/s41598-021-03019-y

Author(s): Surendran Rajendran, Osamah Ibrahim Khalaf, Youseef Alotaibi, Saleh Alghamdi

Keyword(s): Feature Selection, Big Data, Selection Process, Data Classification, Deep Belief Network, Feature Subset Selection, Classification Model, Feature Subset, Belief Network, Big Data Classification

In recent times, big data classification has become a hot research topic in various domains, such as healthcare, e-commerce, finance, etc. The inclusion of the feature selection process helps to improve the big data classification process and can be done by the use of metaheuristic optimization algorithms. This study focuses on the design of a big data classification model using chaotic pigeon inspired optimization (CPIO)-based feature selection with an optimal deep belief network (DBN) model. The proposed model is executed in the Hadoop MapReduce environment to manage big data. Initially, the CPIO algorithm is applied to select a useful subset of features. In addition, the Harris hawks optimization (HHO)-based DBN model is derived as a classifier to allocate appropriate class labels. The design of the HHO algorithm to tune the hyperparameters of the DBN model assists in boosting the classification performance. To examine the superiority of the presented technique, a series of simulations were performed, and the results were inspected under various dimensions. The resultant values highlighted the supremacy of the presented technique over the recent techniques.

BHHO-TVS: A Binary Harris Hawks Optimizer with Time-Varying Scheme for Solving Data Classification Problems

Applied Sciences, 2021, Vol 11 (14), pp. 6516

DOI: 10.3390/app11146516

Author(s): Hamouda Chantar, Thaer Thaher, Hamza Turabieh, Majdi Mafarja, Alaa Sheta

Keyword(s): Feature Selection, Search Algorithm, Gravitational Search Algorithm, Data Classification, Classification Model, Model Complexity, Binary Particle Swarm Optimization, Time Varying, Classification Problems, Whale Optimization

Data classification is a challenging problem that is very sensitive to noise and to the high dimensionality of the data. Reducing model complexity can help to improve the accuracy of the classification model. Therefore, in this research, we propose a novel feature selection technique based on Binary Harris Hawks Optimizer with Time-Varying Scheme (BHHO-TVS). The proposed BHHO-TVS adopts a time-varying transfer function that is applied to leverage the influence of the location vector to balance the exploration and exploitation power of the HHO. Eighteen well-known datasets provided by the UCI repository were utilized to show the significance of the proposed approach. The reported results show that BHHO-TVS outperforms BHHO with traditional binarization schemes as well as other binary feature selection methods such as binary gravitational search algorithm (BGSA), binary particle swarm optimization (BPSO), binary bat algorithm (BBA), binary whale optimization algorithm (BWOA), and binary salp swarm algorithm (BSSA). Compared with other similar feature selection approaches introduced in previous studies, the proposed method achieves the best accuracy rates on 67% of datasets.
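
To make the idea of a time-varying transfer function concrete, the sketch below maps a continuous position vector to a binary feature mask with an S-shaped function whose steepness changes over the iterations. The abstract does not give the exact function, so the sigmoid form, the linear schedule, and the values of `tau_max` and `tau_min` are assumptions chosen for illustration only.

```python
import math
import random

def tau(t, max_iter, tau_max=4.0, tau_min=0.01):
    """Control parameter that decreases linearly with the iteration t (assumed schedule)."""
    return tau_max - (tau_max - tau_min) * t / max_iter

def transfer(x, t, max_iter):
    """Assumed time-varying S-shaped transfer function: probability that a feature is selected."""
    z = x / tau(t, max_iter)
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))

def binarize(position, t, max_iter, rng):
    """Turn a continuous position vector into a 0/1 feature mask."""
    return [1 if rng.random() < transfer(x, t, max_iter) else 0 for x in position]

rng = random.Random(0)
position = [0.8, -1.5, 0.1, 2.3]  # hypothetical continuous position of one search agent
early_mask = binarize(position, t=0, max_iter=100, rng=rng)
late_mask = binarize(position, t=100, max_iter=100, rng=rng)
print(early_mask, late_mask)
```

With a large `tau` early on, the selection probabilities stay close to 0.5, so the masks vary widely (exploration). As `tau` shrinks, the function approaches a step at zero and the mask follows the sign of each position component (exploitation).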

An Ensemble Model of Outlier Detection with Random Tree Data Classification for Financial Credit Scoring Prediction System

International Journal of Recent Technology and Engineering - 2, 2019, Vol 8 (3), pp. 7108-7114

DOI: 10.35940/ijrte.c5850.098319

Keyword(s): Outlier Detection, Credit Scoring, Data Classification, Classification Performance, Financial Data, Random Tree, Financial Firms, Finance Industry, Validation Parameters, Financial Credit

Recently, Financial Credit Scoring (FCS) has become an essential process in the finance industry for assessing the creditworthiness of individuals or financial firms. Several artificial intelligence (AI) models have already been presented for the classification of financial data. However, credit and financial data generally contain unwanted and repetitive features, which lead to inefficient classification performance. To overcome this issue, this paper develops a new FCS prediction model that incorporates an outlier detection (OD) process (i.e. misclassified instance removal) prior to data classification. The presented FCS model involves two main phases, namely misclassified instance removal using a Naïve Bayes (NB) Tree and Random Tree (RT) based data classification. The presented NB-RT model is validated using the benchmark German Credit dataset under different validation parameters. Extensive experiments showed that the proposed NB-RT model achieved a maximum classification accuracy of 90.3%.

A Constructive Fuzzy Representation Model for Heart Data Classification

Studies in Health Technology and Informatics - Public Health and Informatics, 2021

DOI: 10.3233/shti210111

Author(s): Michael D. Vasilakakis, Dimitris K. Iakovidis, George Koulaouzidis

Keyword(s): Heart Failure, Fuzzy Logic, Feature Selection, Data Classification, Classification Model, Mortality And Morbidity, Home Telemonitoring, Proposed Model, Respective Treatment, Fuzzy Representation

The early detection of Heart Disease (HD) and the prediction of Heart Failure (HF) via telemonitoring can contribute to the reduction of patients’ mortality and morbidity as well as to the reduction of the respective treatment costs. In this study we propose a novel classification model based on fuzzy logic applied in the context of HD detection and HF prediction. The proposed model considers that data can be represented by fuzzy phrases constructed from fuzzy words, which are fuzzy sets derived from data. Advantages of this approach include the robustness of data classification, as well as an intuitive way for feature selection. The accuracy of the proposed model is investigated on real home telemonitoring data and a publicly available dataset from UCI.

Random Tree Data Stream Classifier With Sliding Window Estimator And Concept Drift

Bioscience Biotechnology Research Communications, 2019, Vol 12 (1), pp. 219-228

DOI: 10.21786/bbrc/12.1/25

Author(s): Ebtesam Almalki, Manal Abdullah

Keyword(s): Data Stream, Concept Drift, Sliding Window, Random Tree, Tree Data

Children’s Activity Classification for Domestic Risk Scenarios Using Environmental Sound and a Bayesian Network

Healthcare, 2021, Vol 9 (7), pp. 884

DOI: 10.3390/healthcare9070884

Author(s): Antonio García-Domínguez, Carlos E. Galván-Tejada, Ramón F. Brena, Antonio A. Aguileta, Jorge I. Galván-Tejada, ...

Keyword(s): Feature Selection, Naive Bayes, Naïve Bayes, Classification Model, Activity Classification, Environmental Sound, Non Invasive, Akaike Criterion, Data Source, Feature Selection Techniques

Children’s healthcare is a relevant issue, especially the prevention of domestic accidents, which has even been defined as a global health problem. Children’s activity classification generally uses sensors embedded in children’s clothing, which can produce erroneous measurements because of possible damage or mishandling. A non-invasive data source for a children’s activity classification model makes the monitoring system in which it is applied more reliable. This work proposes the use of environmental sound as a data source for generating children’s activity classification models, implementing feature selection methods and classification techniques based on Bayesian networks. The focus is on recognizing activities that could potentially trigger domestic accidents, with application in child monitoring systems. Two feature selection techniques were used: the Akaike criterion and genetic algorithms. Likewise, models were generated using three classifiers: naive Bayes, semi-naive Bayes and tree-augmented naive Bayes. Most of the generated models, which combine the feature selection methods with the classifiers, achieve an accuracy greater than 97%, from which we conclude that the proposal is effective in recognizing activities that could potentially trigger domestic accidents.
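
As a brief illustration of how the Akaike criterion can rank candidate feature subsets, the sketch below computes AIC = 2k - 2 ln(L) for two hypothetical models and keeps the one with the lower score. The log-likelihood values and subset names are invented for the example, and the abstract does not describe the authors' actual selection procedure.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "3 features": (-120.0, 3),
    "8 features": (-118.5, 8),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "-> selected:", best)
```

Here the larger model fits slightly better, but the penalty for its extra parameters outweighs the gain, so the smaller subset is preferred.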

A hybrid Feature Selection Optimization Model for High Dimension Data Classification

IEEE Access, 2021, pp. 1-1

DOI: 10.1109/access.2021.3065341

Author(s): Mohammed Qaraad, Souad Amjad, Ibrahim I.M. Manhrawy, Hanaa Fathi, Bayoumi A. Hassan, ...

Keyword(s): Feature Selection, High Dimension, Optimization Model, Data Classification, High Dimension Data

R-HEFS: Rough set based Heterogeneous Ensemble Feature Selection Method for Medical data Classification

Artificial Intelligence in Medicine, 2021, pp. 102049

DOI: 10.1016/j.artmed.2021.102049

Author(s): Rubul Kumar Bania, Anindya Halder

Keyword(s): Feature Selection, Rough Set, Feature Selection Method, Data Classification, Selection Method, Medical Data, Medical Data Classification, Heterogeneous Ensemble

Intelligent Detection of False Information in Arabic Tweets Utilizing Hybrid Harris Hawks Based Feature Selection and Machine Learning Models

Symmetry, 2021, Vol 13 (4), pp. 556

DOI: 10.3390/sym13040556

Author(s): Thaer Thaher, Mahmoud Saheb, Hamza Turabieh, Hamouda Chantar

Keyword(s): Machine Learning, Social Media, Feature Selection, Language Processing, User Profile, Vital Role, Classification Model, Fake News, False Information, Social Media Platforms

Fake or false information on social media platforms is a significant challenge: it deliberately misleads users through rumors, propaganda, or deceptive information about a person, organization, or service. Twitter is one of the most widely used social media platforms, especially in the Arab region, where the number of users is steadily increasing, accompanied by an increase in the rate of fake news. This has drawn the attention of researchers seeking to provide a safe online environment free of misleading information. This paper proposes a smart classification model for the early detection of fake news in Arabic tweets utilizing Natural Language Processing (NLP) techniques, Machine Learning (ML) models, and Harris Hawks Optimizer (HHO) as a wrapper-based feature selection approach. An Arabic Twitter corpus composed of 1862 previously annotated tweets was used to assess the efficiency of the proposed model. The Bag of Words (BoW) model is used with different term-weighting schemes for feature extraction. Eight well-known learning algorithms are investigated with varying combinations of features, including user-profile, content-based, and words-features. Reported results showed that Logistic Regression (LR) with the Term Frequency-Inverse Document Frequency (TF-IDF) model ranks best. Moreover, feature selection based on the binary HHO algorithm plays a vital role in reducing dimensionality, thereby enhancing the learning model’s performance for fake news detection. The proposed BHHO-LR model yields an improvement of 5% over previous works on the same dataset.
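
The TF-IDF term weighting mentioned above can be sketched in a few lines. This is a minimal illustration, assuming one common unsmoothed variant (relative term frequency times log(N / document frequency)) and whitespace tokenization; the short English documents are invented stand-ins for the Arabic tweets, and the paper's exact weighting scheme is not stated in the abstract.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Invented toy corpus.
docs = ["breaking news today", "fake news spreads fast", "weather today"]
w = tf_idf(docs)
print(w[0])
```

Terms that occur in fewer documents (such as "breaking") receive a higher weight than terms shared across documents (such as "news"), and a term present in every document gets a weight of zero under this variant.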
