Vocal Feature Extraction-Based Artificial Intelligent Model for Parkinson’s Disease Detection

As a neurodegenerative disorder, Parkinson’s disease (PD) affects the nerve cells of the human brain. Early detection and treatment can help to relieve the symptoms of PD. Because patients face vocal changes and impairments at the early stages of PD, recent studies have extracted features from vocal disorders as an early indicator for PD detection. In this study, two hybrid models based on a Support Vector Machine (SVM), integrated with either Principal Component Analysis (PCA) or a Sparse Autoencoder (SAE), are proposed to detect PD patients from their vocal features. The first model used PCA to extract and reduce the principal components of the vocal features based on the explained variance of each feature. The second model used, for the first time, a Deep Neural Network (DNN) in the form of an SAE, consisting of multiple hidden layers with L1 regularization, to compress the vocal features into a lower-dimensional latent space. In both models, the reduced features were fed into the SVM as inputs, which performed classification by learning hyperplanes while projecting the data into a higher dimension. Because the data are highly imbalanced, an F1-score, the Matthews Correlation Coefficient (MCC), and a Precision-Recall curve were used alongside accuracy to evaluate the proposed models. With its highest accuracy of 0.935, F1-score of 0.951, and MCC value of 0.788, the SAE-SVM model surpassed not only the PCA-SVM model and other standard models, including Multilayer Perceptron (MLP), Extreme Gradient Boosting (XGBoost), K-Nearest Neighbor (KNN), and Random Forest (RF), but also two recent studies using the same dataset. Oversampling and balancing the dataset with SMOTE boosted the performance of the models.
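
The abstract above reports F1-score and MCC next to accuracy because the data are imbalanced. The following is a minimal sketch of why that matters, using only the standard definitions of the three metrics. The confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall (0.0 if undefined)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient (0.0 if the denominator is zero)."""
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0

# Hypothetical imbalanced test set: 90 PD patients, 10 healthy controls,
# and a degenerate classifier that labels every sample as PD.
tp, tn, fp, fn = 90, 0, 10, 0
acc_all_positive = accuracy(tp, tn, fp, fn)   # looks good
f1_all_positive = f1_score(tp, fp, fn)        # also looks good
mcc_all_positive = mcc(tp, tn, fp, fn)        # exposes the problem

# A perfect classifier on the same data, for comparison.
mcc_perfect = mcc(90, 10, 0, 0)
```

In this toy case the degenerate classifier reaches an accuracy of 0.9 while its MCC is 0, which is the kind of gap that motivates reporting MCC on imbalanced data.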

Bacterial Immunogenicity Prediction by Machine Learning Methods

Vaccines ◽

10.3390/vaccines8040709 ◽

2020 ◽

Vol 8 (4) ◽

pp. 709

Author(s):

Ivan Dimitrov ◽

Nevena Zaharieva ◽

Irini Doytchinova

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Predictive Ability ◽

Initial Step ◽

Majority Voting ◽

Gradient Boosting ◽

Support Vector ◽

K Nearest Neighbor ◽

Test Set ◽

Extreme Gradient Boosting

The identification of protective immunogens is the most important initial step in the long-lasting and expensive process of vaccine design and development. Machine learning (ML) methods are very effective in data mining and in the analysis of big data such as microbial proteomes. They are able to significantly reduce the experimental work for discovering novel vaccine candidates. Here, we applied six supervised ML methods (partial least squares-based discriminant analysis, k nearest neighbor (kNN), random forest (RF), support vector machine (SVM), random subspace method (RSM), and extreme gradient boosting (xgboost)) to a set of 317 known bacterial immunogens and 317 bacterial non-immunogens and derived models for immunogenicity prediction. The models were validated by internal cross-validation in 10 groups from the training set and by an external test set. All of them showed good predictive ability, but the xgboost model displayed the most prominent ability to identify immunogens, recognizing 84% of the known immunogens in the test set. The combined RSM-kNN model was the best at recognizing non-immunogens, identifying 92% of them in the test set. The three best performing ML models (xgboost, RSM-kNN, and RF) were implemented in the new version of the server VaxiJen, and the prediction of bacterial immunogens is now based on majority voting.
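
The majority voting mentioned at the end of this abstract can be sketched as follows. This is only an illustration of the general technique, not VaxiJen's implementation; the per-model predictions are made-up values.

```python
from collections import Counter

def majority_vote(votes):
    """Return the label chosen by most models for a single sample."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical outputs of three models for three proteins
# (1 = immunogen, 0 = non-immunogen). Values are illustrative only.
xgboost_pred = [1, 0, 1]
rsm_knn_pred = [1, 0, 0]
rf_pred = [0, 0, 1]

# zip() groups the three models' votes per protein.
ensemble_pred = [
    majority_vote(votes)
    for votes in zip(xgboost_pred, rsm_knn_pred, rf_pred)
]
```

With three binary voters a tie cannot occur, which is one practical reason to combine an odd number of models.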

A Deep Learning Approach for Classification and Diagnosis of Parkinson’s Disease

10.21203/rs.3.rs-254647/v1 ◽

2021 ◽

Author(s):

Monika Jyotiyana ◽

Nishtha Kesswani ◽

Munish Kumar

Keyword(s):

Parkinson's Disease ◽

Deep Learning ◽

Nearest Neighbor ◽

Health Sector ◽

Classification Model ◽

Support Vector ◽

Learning Approaches ◽

K Nearest Neighbor ◽

Linear Discriminant

Deep learning techniques are playing an important role in the classification and prediction of diseases. Deep learning has a promising future in the health sector, especially in medical imaging. Deep learning approaches are popular because they can handle large amounts of patient data accurately and reliably in a short span of time, whereas practitioners may need considerable time to analyze data and generate reports. In this paper, we propose a Deep Neural Network-based classification model for Parkinson’s disease. The proposed method gives faster and more accurate results for the classification of Parkinson’s disease patients, with an accuracy of 94.87%. Based on the attributes in the patient dataset, the model can be used for the identification of Parkinsonism. We have also compared the results with other existing approaches such as Linear Discriminant Analysis, Support Vector Machine, K-Nearest Neighbor, Decision Tree, Classification and Regression Trees, Random Forest, Linear Regression, Logistic Regression, Multi-Layer Perceptron, and Naive Bayes.

Application of Fuzzy K-Nearest Neighbor (FKNN) to Detect the Parkinson’s Disease

InPrime: Indonesian Journal of Pure and Applied Mathematics ◽

10.15408/inprime.v1i1.12827 ◽

2019 ◽

Vol 1 (1) ◽

Author(s):

L.N. Desinaini ◽

Azizatul Mualimah ◽

Dian C. R. Novitasari ◽

Moh. Hafiyusholeh

Keyword(s):

Machine Learning ◽

Parkinson's Disease ◽

Principal Component Analysis ◽

Nearest Neighbor ◽

Principal Component ◽

Component Analysis ◽

Training Data ◽

K Nearest Neighbor ◽

Positive Data

Abstract: Parkinson’s disease is a neurological disorder in which there is a gradual loss of brain cells that make and store dopamine. Researchers estimate that four to six million people worldwide are living with Parkinson’s. The average age of patients is 60 years, but some are diagnosed at age 40 or even younger, and in the worst case patients find out too late that they have Parkinson's disease. In this paper, we present a diagnosis system based on Fuzzy K-Nearest Neighbor (FKNN) to detect Parkinson’s disease. We use the Parkinson’s disease dataset taken from the UCI Machine Learning Repository. The first step is to normalize the dataset and analyze it using Principal Component Analysis (PCA). The result shows that there are four new factors that influence Parkinson’s disease, with a total variance of 85.719%. In the classification step, we use several percentages of training data to classify (detect) Parkinson's disease, i.e. 50%, 60%, 70%, 75%, 80% and 90%. We also use k = 3, 5, 7, and 9. The classification result shows that the highest accuracy is obtained with 90% training data and k = 5, where 19 samples are correctly classified, i.e. 14 positive and 5 negative, while 1 positive sample is classified incorrectly.

Keywords: Parkinson's disease; Fuzzy K-Nearest Neighbor; Principal Component Analysis.

Abstract (translated from the Indonesian "Abstrak"): Parkinson's disease is a disorder of the nerve cells in the brain that causes a loss of dopamine in the brain. Researchers estimate that four to six million people worldwide suffer from Parkinson's. On average the disease affects patients aged 60, but some people are diagnosed at age 40 or younger, and the worst case is detecting it too late. In this article, we present a diagnosis system for Parkinson's disease using the Fuzzy K-Nearest Neighbor (FKNN) method. We use test data obtained from the UCI Machine Learning Repository, which has been widely applied to classification problems. The first step is to normalize the data and then analyze it using Principal Component Analysis. The results of the Principal Component Analysis show that there are four new factors that influence Parkinson's disease, with a total variance of 87.719%. In the classification stage, we use several percentages of training data to detect the disease, namely 50%, 60%, 70%, 75%, 80% and 90%. In addition, we use several values of k, namely 3, 5, 7, and 9. The results show that the highest classification accuracy is obtained for 90% training data with k = 5, where 19 samples are classified correctly, namely 14 positive and 5 negative, while one positive sample is not classified correctly.

Keywords (translated): Parkinson's disease; Fuzzy K-Nearest Neighbor; Principal Component Analysis.
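
The abstract does not spell out the FKNN variant used, so the following is only a sketch of one common formulation: each of the k nearest neighbors contributes to its class membership with a weight proportional to 1 / d^(2/(m-1)), where d is the Euclidean distance and m is the fuzzifier. The crisp training labels, the value m = 2, and the toy data are assumptions made for illustration.

```python
import math

def fuzzy_knn(train_x, train_y, query, k=5, m=2.0):
    """Return {label: membership} for `query`; memberships sum to 1.

    Assumes crisp training labels and inverse-distance weighting with
    fuzzifier m (a common choice, not confirmed by the abstract).
    """
    # The k nearest training points by Euclidean distance.
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    eps = 1e-12  # avoids division by zero when query equals a training point
    weights = [(1.0 / (d ** (2.0 / (m - 1.0)) + eps), y) for d, y in nearest]
    total = sum(w for w, _ in weights)
    memberships = {}
    for w, y in weights:
        memberships[y] = memberships.get(y, 0.0) + w / total
    return memberships

# Toy two-feature data (hypothetical): label 0 clusters near the origin,
# label 1 clusters near (1, 1).
train_x = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
train_y = [0, 0, 1, 1]
memberships = fuzzy_knn(train_x, train_y, query=(0.2, 0.1), k=3)
predicted = max(memberships, key=memberships.get)
```

Unlike plain k-NN, the result is a degree of membership per class, so a borderline sample shows up as memberships close to 0.5 instead of a hard label.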

Severity Level Diagnosis of Parkinson’s Disease by Ensemble K-Nearest Neighbor Under Imbalanced Data

Expert Systems with Applications ◽

10.1016/j.eswa.2021.116113 ◽

2021 ◽

pp. 116113

Author(s):

Huan Zhao ◽

Ruixue Wang ◽

Yaguo Lei ◽

Wei-Hsin Liao ◽

Hongmei Cao ◽

...

Keyword(s):

Parkinson's Disease ◽

Nearest Neighbor ◽

Imbalanced Data ◽

K Nearest Neighbor ◽

Severity Level

Application of Artificial Intelligence and Machine Learning Techniques in Classifying Extent of Dementia Across Alzheimer's Image Data

International Journal of Quantitative Structure-Property Relationships ◽

10.4018/ijqspr.2021040103 ◽

2021 ◽

Vol 6 (2) ◽

pp. 29-46

Author(s):

Robin Ghosh ◽

Anirudh Reddy Cingreddy ◽

Venkata Melapu ◽

Sravanthi Joginipelli ◽

Supratik Kar

Keyword(s):

Neural Network ◽

Machine Learning ◽

Nearest Neighbor ◽

Image Data ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

K Nearest Neighbor ◽

Mild Dementia ◽

Extreme Gradient Boosting

Alzheimer's disease (AD) is one of the most common forms of dementia and the sixth-leading cause of death in older adults. The presented study illustrates applications of deep learning (DL) and associated methods for multiclass image detection, which could have a broader impact on identifying dementia stages and may guide therapy in the future. The studied datasets contain around 6,400 magnetic resonance imaging (MRI) images, each assigned to one of four Alzheimer's severity classes: mild dementia, very mild dementia, non-dementia, and moderate dementia. These four image classes were used to classify the dementia stage of each patient with a convolutional neural network (CNN). Employing the CNN-based in silico model, the authors classified and predicted the different AD stages with an accuracy of around 97.19%. In addition, machine learning (ML) techniques such as extreme gradient boosting (XGB), support vector machine (SVM), k-nearest neighbor (KNN), and artificial neural network (ANN) offered accuracies of 96.62%, 96.56%, 94.62%, and 89.88%, respectively.

IgA Nephropathy Prediction in Children with Machine Learning Algorithms

Future Internet ◽

10.3390/fi12120230 ◽

2020 ◽

Vol 12 (12) ◽

pp. 230

Author(s):

Ping Zhang ◽

Rongqin Wang ◽

Nianfeng Shi

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Immunoglobulin A ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

K Nearest Neighbor ◽

Chi Square ◽

Extreme Gradient Boosting

Immunoglobulin A nephropathy (IgAN) is the most common primary glomerular disease worldwide and a major cause of renal failure. IgAN prediction in children with machine learning algorithms has rarely been studied. We retrospectively analyzed electronic medical records from the Nanjing Eastern War Zone Hospital and chose eXtreme Gradient Boosting (XGBoost), random forest (RF), CatBoost, support vector machine (SVM), k-nearest neighbor (KNN), and extreme learning machine (ELM) models to predict the probability that a patient would or would not reach end-stage renal disease (ESRD) within five years. The chi-square test was used to select the 16 most relevant features as model inputs, and a decision-making system (DMS) for IgAN prediction in children was designed based on XGBoost and the Django framework. The receiver operating characteristic (ROC) curve was used to evaluate the models, and XGBoost performed best. The AUC, accuracy, precision, recall, and F1-score of XGBoost were 85.11%, 78.60%, 75.96%, 76.70%, and 76.33%, respectively. The XGBoost model is useful for physicians and pediatric patients in providing predictions regarding IgAN. A DMS based on the XGBoost model can assist physicians in treating IgAN in children effectively and preventing deterioration.
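
Chi-square feature selection, as mentioned in this abstract, ranks features by how strongly they are associated with the outcome and keeps the top ones. Below is a minimal sketch for binary features using the Pearson chi-square statistic of a 2x2 table. The feature names and counts are hypothetical, and the sketch keeps the top 2 features where the study kept 16.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / den if den else 0.0

# Hypothetical contingency counts per feature:
# (present & ESRD, present & no ESRD, absent & ESRD, absent & no ESRD)
tables = {
    "feature_a": (30, 10, 10, 30),
    "feature_b": (20, 20, 20, 20),
    "feature_c": (25, 15, 15, 25),
}

scores = {name: chi_square_2x2(*counts) for name, counts in tables.items()}

# Keep the k features with the largest statistic (k = 2 here, 16 in the study).
top_features = sorted(scores, key=scores.get, reverse=True)[:2]
```

A feature whose table is perfectly balanced, like `feature_b` here, scores 0 and is dropped first because it carries no information about the outcome.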

A hybrid evolutionary learning classification for robot ground pattern recognition

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202940 ◽

2021 ◽

pp. 1-15

Author(s):

Jiankai Zuo ◽

Yaying Zhang

Keyword(s):

Nearest Neighbor ◽

Fitness Function ◽

Gradient Boosting ◽

Support Vector ◽

Evolutionary Learning ◽

Ensemble Classifiers ◽

Improved Genetic Algorithm ◽

K Nearest Neighbor ◽

Obvious Effect ◽

Extreme Gradient Boosting

In the field of intelligent robot engineering, whether for humanoid, bionic or vehicle robots, the driving forms of standing, moving and walking, and the robot's ability to discriminate the environment in which it is located, have long been a focus and a difficulty of research. To address such problems, Naive Bayes Classifier (NBC), Support Vector Machine (SVM), k-Nearest-Neighbor (KNN), Decision Tree (DT), Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) were introduced to conduct experiments. Each of the six individual classifiers performs well on a particular type of ground, but the overall performance is poor. Therefore, the paper proposes a “Novel Hybrid Evolutionary Learning” method (NHEL), which combines the individual classifiers by means of weighted voting and adopts an improved genetic algorithm (GA) to obtain the optimal weights. Based on the fitness function and the number of evolution iterations, the paper designs adaptively changing crossover and mutation rates and applies the conjugate gradient (CG) method to enhance the GA. By making full use of the global search capability of GA and the fast local search ability of CG, the convergence speed is accelerated and the search precision is improved. The experimental results show that the proposed model performs significantly better than the individual machine learning classifiers and ensemble classifiers.
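
The weighted voting described in this abstract can be sketched as below. The weights, ground-type labels, and validation samples are hypothetical, and the genetic algorithm itself is not implemented: the `fitness` function only shows the kind of objective (here, validation accuracy, an assumption) that such a search over weights could maximize.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Combine one predicted label per classifier using per-classifier weights."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

def fitness(weights, all_predictions, truth):
    """Accuracy of the weighted ensemble; a candidate objective for a GA."""
    hits = sum(
        weighted_vote(preds, weights) == label
        for preds, label in zip(all_predictions, truth)
    )
    return hits / len(truth)

# Hypothetical weights in the order NBC, SVM, KNN, DT, RF, XGBoost.
weights = [0.05, 0.30, 0.10, 0.10, 0.20, 0.25]

# One sample where an unweighted vote would tie 3-3.
preds = ["grass", "tile", "grass", "grass", "tile", "tile"]
decision = weighted_vote(preds, weights)

# Two hypothetical validation samples with their true ground types.
validation_preds = [preds, ["grass"] * 6]
validation_truth = ["tile", "grass"]
score = fitness(weights, validation_preds, validation_truth)
```

Here the classifiers split evenly, and the weights decide the outcome in favor of the classifiers that are trusted more.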

Evaluation of Three Different Machine Learning Methods for Object-Based Artificial Terrace Mapping—A Case Study of the Loess Plateau, China

Remote Sensing ◽

10.3390/rs13051021 ◽

2021 ◽

Vol 13 (5) ◽

pp. 1021

Author(s):

Hu Ding ◽

Jiaming Na ◽

Shangjing Jiang ◽

Jie Zhu ◽

Kai Liu ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Loess Plateau ◽

Water Conservation ◽

Nearest Neighbor ◽

Gradient Boosting ◽

K Nearest Neighbor ◽

The Loess Plateau ◽

Object Based ◽

Extreme Gradient Boosting

Artificial terraces are of great importance for agricultural production and soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. Because contextual information is important for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) technologies is widely used. However, the selection of an appropriate classifier is of great importance for the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three commonly used ML classifiers, namely extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. The comparison with the ground truth, as delineated by field survey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces.

Computational Intelligence-Based Model for Mortality Rate Prediction in COVID-19 Patients

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18126429 ◽

2021 ◽

Vol 18 (12) ◽

pp. 6429

Author(s):

Irfan Ullah Khan ◽

Nida Aslam ◽

Malak Aljabri ◽

Sumayh S. Aljameel ◽

Mariam Moataz Aly Kamaleldin ◽

...

Keyword(s):

Mortality Rate ◽

Computational Intelligence ◽

Nearest Neighbor ◽

Gradient Boosting ◽

K Nearest Neighbor ◽

Detection And Identification ◽

Proposed Model ◽

Extreme Gradient Boosting ◽

The World ◽

Detection And Diagnosis

The COVID-19 outbreak is currently one of the biggest challenges facing countries around the world. Millions of people have lost their lives due to COVID-19. Therefore, the accurate early detection and identification of severe COVID-19 cases can reduce the mortality rate and the likelihood of further complications. Machine Learning (ML) and Deep Learning (DL) models have been shown to be effective in the detection and diagnosis of several diseases, including COVID-19. This study used ML algorithms, namely Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbor (KNN), and a DL model (containing six layers with ReLU activation and an output layer with sigmoid activation) to predict the mortality rate in COVID-19 cases. Models were trained using data on confirmed COVID-19 patients from 146 countries. A comparative analysis of the ML and DL models was performed using a reduced feature set. The best results were achieved using the proposed DL model, with an accuracy of 0.97. Experimental results show that, with the reduced feature set, the proposed model improves on the baseline study in the literature.

mAHTPred: a sequence-based meta-predictor for improving the prediction of anti-hypertensive peptides using effective feature representation

Bioinformatics ◽

10.1093/bioinformatics/bty1047 ◽

2018 ◽

Vol 35 (16) ◽

pp. 2757-2765 ◽

Cited By ~ 63

Author(s):

Balachandran Manavalan ◽

Shaherin Basith ◽

Tae Hwan Shin ◽

Leyi Wei ◽

Gwang Lee

Keyword(s):

Nearest Neighbor ◽

Feature Representation ◽

Superior Performance ◽

Supplementary Information ◽

Gradient Boosting ◽

Support Vector ◽

Pharmaceutical Drugs ◽

K Nearest Neighbor ◽

Feature Descriptors ◽

Predicted Probability

Motivation: Cardiovascular disease is the primary cause of death globally, accounting for approximately 17.7 million deaths per year. One of the stakes linked with cardiovascular diseases and other complications is hypertension. Naturally derived bioactive peptides with antihypertensive activities serve as promising alternatives to pharmaceutical drugs. So far, there is no comprehensive analysis, assessment of diverse features and implementation of various machine-learning (ML) algorithms applied for antihypertensive peptide (AHTP) model construction.

Results: In this study, we utilized six different ML algorithms, namely, Adaboost, extremely randomized tree (ERT), gradient boosting (GB), k-nearest neighbor, random forest (RF) and support vector machine (SVM) using 51 feature descriptors derived from eight different feature encodings for the prediction of AHTPs. While ERT-based trained models performed consistently better than other algorithms regardless of various feature descriptors, we treated them as baseline predictors, whose predicted probability of AHTPs was further used as input features separately for four different ML-algorithms (ERT, GB, RF and SVM) and developed their corresponding meta-predictors using a two-step feature selection protocol. Subsequently, the integration of four meta-predictors through an ensemble learning approach improved the balanced prediction performance and model robustness on the independent dataset. Upon comparison with existing methods, mAHTPred showed superior performance with an overall improvement of approximately 6–7% in both benchmarking and independent datasets.

Availability and implementation: The user-friendly online prediction tool, mAHTPred, is freely accessible at http://thegleelab.org/mAHTPred.

Supplementary information: Supplementary data are available at Bioinformatics online.
