IMPROVING CLASSIFICATION PERFORMANCE OF NEURO-FUZZY CLASSIFIER BY IMPUTING MISSING DATA

In medical data classification, improving classification performance is an important issue when the data set is small and contains multiple missing attribute values. A central objective of machine learning research is to improve the classification performance of classifiers, and this requires a sufficient number of training instances. In the proposed algorithm, we substitute each missing attribute value with the values available in that attribute's domain and thereby generate training tuples in addition to the original ones. The additional and original training samples together provide sufficient data for learning. The neuro-fuzzy classifier is trained on this data set. The classification performance of the neuro-fuzzy classifier on test data is obtained using the k-fold cross-validation method. The proposed method attains improvements of around 2.8% and 3.61% in classification accuracy for this classifier.
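
The substitution step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `MISSING` marker, the function names and the toy data are assumptions made for illustration, and the sketch expands each incomplete tuple into one tuple per combination of the values observed for its missing attributes, while complete tuples pass through unchanged.

```python
from itertools import product

MISSING = None  # assumed marker for a missing attribute value


def attribute_domains(rows):
    """Collect the observed (non-missing) values of each attribute column."""
    domains = [set() for _ in range(len(rows[0]))]
    for row in rows:
        for j, value in enumerate(row):
            if value is not MISSING:
                domains[j].add(value)
    return [sorted(d) for d in domains]


def expand_missing(rows, labels):
    """Expand every incomplete tuple into all completions drawn from the domains."""
    domains = attribute_domains(rows)
    out_rows, out_labels = [], []
    for row, label in zip(rows, labels):
        # A known attribute keeps its value; a missing one ranges over its domain.
        choices = [[v] if v is not MISSING else domains[j] for j, v in enumerate(row)]
        for combo in product(*choices):
            out_rows.append(tuple(combo))
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: two categorical attributes, two tuples with a gap.
rows = [("high", "yes"), ("low", None), (None, "no"), ("low", "no")]
labels = [1, 0, 0, 0]
new_rows, new_labels = expand_missing(rows, labels)
```

In this sketch the four toy tuples grow to six complete ones. A tuple whose missing attribute has no observed value anywhere in the data would be dropped, so a real implementation would need a fallback for that case.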

Ant Lion Optimization Based Medical Data Classification Using Modified Neuro Fuzzy Classifier

Wireless Personal Communications ◽

10.1007/s11277-020-07919-6 ◽

2021 ◽

Author(s):

Balasaheb Tarle ◽

Sudarson Jena

Keyword(s):

Data Classification ◽

Medical Data ◽

Fuzzy Classifier ◽

Ant Lion Optimization ◽

Neuro Fuzzy ◽

Medical Data Classification ◽

Ant Lion

Automated Characterization of Atheromatous Plaque in Intravascular Ultrasound Images Using Neuro Fuzzy Classifier

International Journal of Electronics and Telecommunications ◽

10.2478/v10177-012-0058-7 ◽

2012 ◽

Vol 58 (4) ◽

pp. 425-431 ◽

Cited By ~ 3

Author(s):

D. Selvathi ◽

N. Emimal ◽

Henry Selvaraj

Keyword(s):

Intravascular Ultrasound ◽

Human Error ◽

Human Life ◽

Classification Performance ◽

Atheromatous Plaque ◽

Ultrasound Images ◽

Fuzzy Classifier ◽

Neuro Fuzzy ◽

Calcified Tissues

The medical imaging field has grown significantly in recent years and demands high accuracy since it deals with human life. The idea is to reduce human error as much as possible by assisting physicians and radiologists with automatic techniques. The use of artificial intelligence techniques has shown great potential in this field. Hence, in this paper a neuro fuzzy classifier is applied to the automated characterization of atheromatous plaque, identifying the fibrotic, lipidic and calcified tissues in Intravascular Ultrasound (IVUS) images. The classifier is designed with sixteen inputs, corresponding to the sixteen pixels of the instantaneous scanning matrix, and one output that tells whether the pixel under consideration is a Fibrotic, Lipidic, Calcified or Normal pixel. The classification performance was evaluated in terms of sensitivity, specificity and accuracy, and the results confirmed that the proposed system has potential in detecting the respective plaque, with an average accuracy of 98.9%.
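
For a four-class output such as this one, sensitivity, specificity and accuracy are commonly computed per class in a one-vs-rest fashion. The following is a minimal sketch of that computation; the function name, the single-letter class codes and the toy labels are assumptions for illustration and are not taken from the paper.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Return (sensitivity, specificity, accuracy) treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical pixel labels: F = fibrotic, L = lipidic, C = calcified, N = normal.
y_true = ["F", "F", "L", "C", "N", "N", "L", "C"]
y_pred = ["F", "L", "L", "C", "N", "N", "L", "N"]
fibrotic = one_vs_rest_metrics(y_true, y_pred, "F")
lipidic = one_vs_rest_metrics(y_true, y_pred, "L")
```

Averaging the per-class accuracies is one way an overall figure could be reported, although the abstract does not state how its average was formed.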

Sparse Matrix Approach in Neural Networks for Effective Medical Data Sets Classifications

Journal of Basic and Applied Research in Biomedicine ◽

10.51152/jbarbiomed.v6i2.113 ◽

2020 ◽

Vol 6 (2) ◽

pp. 90-97

Author(s):

Sagir Masanawa ◽

Hamza Abubakar

Keyword(s):

Intelligent System ◽

Sparse Matrix ◽

Data Classification ◽

Medical Data ◽

Data Sets ◽

Matrix Approach ◽

Neural Network Learning ◽

Network Learning ◽

Hybrid Intelligent System ◽

Medical Data Classification

In this paper, a hybrid intelligent system that incorporates the sparse matrix approach into a neural network learning model is presented as a decision support tool for medical data classification. The main objective of this research is to develop an effective intelligent system that can be used by medical practitioners to accelerate diagnosis and treatment processes. The sparse matrix approach is incorporated in the neural network learning algorithm to provide scalability, reduce memory usage, improve implementation time and speed up the analysis of the medical data classification problem. The hybrid intelligent system aims to exploit the advantages of the constituent models and, at the same time, alleviate their limitations. The proposed intelligent classification system maximizes the intelligent classification of medical data and minimizes the number of trends inaccurately identified. To evaluate the effectiveness of the hybrid intelligent system, three benchmark medical data sets from the UCI Machine Learning Repository, viz., Hepatitis, SPECT Heart and Cleveland Heart, are used. Performance metrics useful in medical applications, including accuracy, sensitivity and specificity, are considered. The results were analyzed and compared with those from other methods published in the literature. The experimental outcomes demonstrate that the hybrid intelligent system was effective in undertaking medical data classification tasks.
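
The memory and speed argument for a sparse representation can be sketched as follows. The abstract does not say which sparse format is used, so the compressed sparse row (CSR) layout here is an assumption, as are the function names and the toy weight matrix; the point is only that a layer's matrix-vector product touches the non-zero entries alone.

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for j, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(j)
        indptr.append(len(data))  # end offset of this row in `data`
    return data, indices, indptr


def csr_matvec(csr, x):
    """Multiply a CSR matrix by a dense vector, visiting only stored entries."""
    data, indices, indptr = csr
    result = []
    for i in range(len(indptr) - 1):
        total = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            total += data[k] * x[indices[k]]
        result.append(total)
    return result


# Hypothetical 3x4 weight matrix with three non-zero weights out of twelve.
W = [[0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 3.0]]
x = [1.0, 2.0, 3.0, 4.0]
csr = to_csr(W)
y = csr_matvec(csr, x)
```

Here three values, three column indices and four row offsets are stored in place of twelve matrix entries, and the saving grows with the share of zero weights.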

Parametric methods for comparing the performance of two classification algorithms evaluated by k-fold cross validation on multiple data sets

Pattern Recognition ◽

10.1016/j.patcog.2016.12.018 ◽

2017 ◽

Vol 65 ◽

pp. 97-107 ◽

Cited By ~ 17

Author(s):

Tzu-Tsung Wong

Keyword(s):

Cross Validation ◽

Classification Algorithms ◽

Data Sets ◽

Parametric Methods ◽

Multiple Data ◽

Multiple Data Sets ◽

Fold Cross Validation

A Method of Classification Performance Improvement Via a Strategy of Clustering-Based Data Elimination Integrated with k-Fold Cross-Validation

Arabian Journal for Science and Engineering ◽

10.1007/s13369-020-04972-y ◽

2020 ◽

Author(s):

Onur Inan ◽

Mustafa Serter Uzer

Keyword(s):

Performance Improvement ◽

Cross Validation ◽

Classification Performance ◽

Fold Cross Validation

A new GIS-based data mining technique using an adaptive neuro-fuzzy inference system (ANFIS) and k-fold cross-validation approach for land subsidence susceptibility mapping

Natural Hazards ◽

10.1007/s11069-018-3449-y ◽

2018 ◽

Vol 94 (2) ◽

pp. 497-517 ◽

Cited By ~ 36

Author(s):

Omid Ghorbanzadeh ◽

Hashem Rostamzadeh ◽

Thomas Blaschke ◽

Khalil Gholaminia ◽

Jagannath Aryal

Keyword(s):

Data Mining ◽

Land Subsidence ◽

Fuzzy Inference System ◽

Cross Validation ◽

Fuzzy Inference ◽

Data Mining Technique ◽

Inference System ◽

Mining Technique ◽

Neuro Fuzzy ◽

Fold Cross Validation

RESAMPLING METHODS IN SOFTWARE QUALITY CLASSIFICATION

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194012400037 ◽

2012 ◽

Vol 22 (02) ◽

pp. 203-223 ◽

Cited By ~ 7

Author(s):

Wasif Afzal ◽

Richard Torkar ◽

Robert Feldt

Keyword(s):

Software Engineering ◽

Software Quality ◽

Cross Validation ◽

Predictor Variables ◽

Primary Study ◽

Data Sets ◽

Resampling Methods ◽

Quality Classification ◽

Leave One Out ◽

Fold Cross Validation

In the presence of a number of algorithms for classification and prediction in software engineering, there is a need to have a systematic way of assessing their performances. The performance assessment is typically done by some form of partitioning or resampling of the original data to alleviate biased estimation. For predictive and classification studies in software engineering, there is a lack of definitive advice on the most appropriate resampling method to use. This is seen as one of the contributing factors for not being able to draw general conclusions on which modeling technique or set of predictor variables is the most appropriate. Furthermore, the use of a variety of resampling methods makes it impossible to perform any formal meta-analysis of the primary study results. Therefore, it is desirable to examine the influence of various resampling methods and to quantify possible differences. Objective and method: This study empirically compares five common resampling methods (hold-out validation, repeated random sub-sampling, 10-fold cross-validation, leave-one-out cross-validation and non-parametric bootstrapping) using 8 publicly available data sets with genetic programming (GP) and multiple linear regression (MLR) as software quality classification approaches. Location of (PF, PD) pairs in the ROC (receiver operating characteristics) space and area under an ROC curve (AUC) are used as accuracy indicators. Results: The results show that in terms of the location of (PF, PD) pairs in the ROC space, bootstrapping results are in the preferred region for 3 of the 8 data sets for GP and for 4 of the 8 data sets for MLR. Based on the AUC measure, there are no significant differences between the different resampling methods using GP and MLR. Conclusion: There can be certain data set properties responsible for insignificant differences between the resampling methods based on AUC. These include imbalanced data sets, insignificant predictor variables and high-dimensional data sets. With the current selection of data sets and classification techniques, bootstrapping is a preferred method based on the location of (PF, PD) pair data in the ROC space. Hold-out validation is not a good choice for comparatively smaller data sets, where leave-one-out cross-validation (LOOCV) performs better. For comparatively larger data sets, 10-fold cross-validation performs better than LOOCV.
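
The resampling methods compared in this study differ mainly in how they divide instance indices into training and test sets. The sketch below shows one plausible index-level reading of each; the function names, default parameters and the use of out-of-bag instances as the bootstrap test set are assumptions, not details taken from the paper. Repeated random sub-sampling corresponds to calling the hold-out split several times with different seeds.

```python
import random


def holdout_split(n, test_fraction=0.3, seed=0):
    """Single random partition into (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def kfold_splits(n, k=10, seed=0):
    """Yield k (train, test) pairs; every index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def loocv_splits(n):
    """Leave-one-out: n splits, each testing on a single instance."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]


def bootstrap_split(n, seed=0):
    """Sample n training indices with replacement; test on the out-of-bag rest."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    chosen = set(train)
    test = [j for j in range(n) if j not in chosen]
    return train, test


n = 20
holdout_train, holdout_test = holdout_split(n)
kfold = list(kfold_splits(n, k=10))
loocv = list(loocv_splits(n))
boot_train, boot_test = bootstrap_split(n)
```

Seen this way, leave-one-out is the limiting case of k-fold cross-validation with k equal to the number of instances, which is why its cost grows with data set size.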

Adaptive neuro-fuzzy inference systems with k-fold cross-validation for energy expenditure predictions based on heart rate

Applied Ergonomics ◽

10.1016/j.apergo.2015.03.001 ◽

2015 ◽

Vol 50 ◽

pp. 68-78 ◽

Cited By ~ 9

Author(s):

Ahmet Kolus ◽

Daniel Imbeau ◽

Philippe-Antoine Dubé ◽

Denise Dubeau

Keyword(s):

Heart Rate ◽

Energy Expenditure ◽

Cross Validation ◽

Fuzzy Inference ◽

Fuzzy Inference Systems ◽

Neuro Fuzzy ◽

Inference Systems ◽

Fold Cross Validation

Impression Classification of Endek (Balinese Fabric) Image Using K-Nearest Neighbors Method

Kinetik Game Technology Information System Computer Network Computing Electronics and Control ◽

10.22219/kinetik.v3i3.611 ◽

2018 ◽

pp. 213-220 ◽

Cited By ~ 1

Author(s):

Gede Aditra Pradnyana ◽

I Komang Agus Suryantara ◽

I Gede Mahendra Darmawiguna

Keyword(s):

Cross Validation ◽

Nearest Neighbors ◽

K Nearest Neighbors ◽

K Value ◽

Training Samples ◽

And Training ◽

Validation Testing ◽

Fold Cross Validation ◽

Learning Data

An impression can be interpreted as a psychological feeling toward a product, and it plays an important role in decision making. Therefore, understanding data in the domain of impressions is very useful. This research aimed to assess the performance of the K-Nearest Neighbors method in classifying endek image impressions, using the K-Fold Cross Validation method. The images were taken from 3 locations, namely CV. Artha Dharma, Agung Bali Collection, and Pengrajin Sri Rejeki. The image impressions were obtained by consulting an endek expert, Dr. D.A Tirta Ray, M.Si. Data mining was carried out with the K-Nearest Neighbors method, a classification method that classifies new objects based on their attributes and on training samples that have been classified previously. K-Fold Cross Validation testing obtained an accuracy of 91% with K values of 3, 4, 7 and 8 in K-Nearest Neighbors.
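
The combination of K-Nearest Neighbors with K-Fold Cross Validation can be sketched in a few lines. This is an illustration under stated assumptions, not the study's pipeline: the image features are replaced by hypothetical 2-D points, Euclidean distance and majority voting are assumed, and the folds are formed by simple striding rather than shuffling.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def kfold_accuracy(X, y, k_neighbors, n_folds):
    """Fraction of instances classified correctly when each fold is held out once."""
    n = len(X)
    correct = 0
    for fold in range(n_folds):
        test_idx = set(range(fold, n, n_folds))  # every n_folds-th instance
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            if knn_predict(train_X, train_y, X[i], k_neighbors) == y[i]:
                correct += 1
    return correct / n


# Hypothetical, well-separated toy data standing in for image features.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]
accuracy = kfold_accuracy(X, y, k_neighbors=3, n_folds=4)
```

Repeating the call for several values of `k_neighbors` is how a table of accuracy against K value, like the one the abstract summarizes, could be produced.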

Web-Based Privacy-Preserving Multicenter Medical Data Analysis Tools Via Threshold Homomorphic Encryption: Design and Development Study

Journal of Medical Internet Research ◽

10.2196/22555 ◽

2020 ◽

Vol 22 (12) ◽

pp. e22555

Author(s):

Yao Lu ◽

Tianshu Zhou ◽

Yu Tian ◽

Shiqiang Zhu ◽

Jingsong Li

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Logistic Regression Model ◽

Cross Validation ◽

Homomorphic Encryption ◽

Privacy Preserving ◽

Medical Data ◽

Multiple Sources ◽

Model Training ◽

Fold Cross Validation

Background: Data sharing in multicenter medical research can improve the generalizability of research, accelerate progress, enhance collaborations among institutions, and lead to new discoveries from data pooled from multiple sources. Despite these benefits, many medical institutions are unwilling to share their data, as sharing may cause sensitive information to be leaked to researchers, other institutions, and unauthorized users. Great progress has been made in the development of secure machine learning frameworks based on homomorphic encryption in recent years; however, nearly all such frameworks use a single secret key and lack a description of how to securely evaluate the trained model, which makes them impractical for multicenter medical applications. Objective: The aim of this study is to provide a privacy-preserving machine learning protocol for multiple data providers and researchers (eg, logistic regression). This protocol allows researchers to train models and then evaluate them on medical data from multiple sources while providing privacy protection for both the sensitive data and the learned model. Methods: We adapted a novel threshold homomorphic encryption scheme to guarantee privacy requirements. We devised new relinearization key generation techniques for greater scalability and multiplicative depth and new model training strategies for simultaneously training multiple models through x-fold cross-validation. Results: Using a client-server architecture, we evaluated the performance of our protocol. The experimental results demonstrated that, with 10-fold cross-validation, our privacy-preserving logistic regression model training and evaluation over 10 attributes in a data set of 49,152 samples took approximately 7 minutes and 20 minutes, respectively. Conclusions: We present the first privacy-preserving multiparty logistic regression model training and evaluation protocol based on threshold homomorphic encryption. Our protocol is practical for real-world use and may promote multicenter medical research to some extent.
