Multimodal Early Alzheimer’s Detection, a Genetic Algorithm Approach with Support Vector Machines

Healthcare ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. 971
Author(s):  
Ana Sánchez-Reyna ◽  
José Celaya-Padilla ◽  
Carlos Galván-Tejada ◽  
Huizilopoztli Luna-García ◽  
Hamurabi Gamboa-Rosales ◽  
...  

Alzheimer’s disease (AD) is a neurodegenerative disease that mainly affects older adults. Currently, AD is associated with certain hypometabolic biomarkers, beta-amyloid peptides, hyperphosphorylated tau protein, and changes in brain morphology. Accurate diagnosis of AD, as well as of mild cognitive impairment (MCI), the prodromal stage of AD, is essential for early care of the disease. As a result, machine learning techniques have been used in recent years for the diagnosis of AD. In this research, we propose a novel methodology to generate a multivariate model that combines different types of features for the detection of AD. In order to obtain a robust biomarker, ADNI baseline data, clinical and neuropsychological assessments (1024 features) of 106 patients were used. The data were normalized, and a genetic algorithm was implemented for the selection of the most significant features. Subsequently, a support vector machine was built as the multivariate classification model and evaluated with five-fold cross-validation, reaching an AUC of 87.63%. Lastly, an independent blind test of our final model, using 20 patients not considered during the model construction, yielded an AUC of 100%.
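The evaluation step in this abstract (five-fold cross-validation scored by AUC) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `auc` and `k_fold_indices`, the seed, and the toy labels and scores are all hypothetical, and a real study would normally use a library SVM and stratified folds.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Fraction of positive/negative pairs ranked correctly; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Toy decision scores (hypothetical): one positive is ranked below one negative.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))            # 0.75

# Fold sizes for a cohort of 106 samples, the cohort size quoted in the abstract.
folds = k_fold_indices(106, k=5)
print([len(f) for f in folds])        # [22, 21, 21, 21, 21]
```

In a cross-validated run, each fold would serve once as the held-out set, and the AUC would be computed on the held-out scores and averaged over the five folds.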

2018 ◽  
Vol 141 (4) ◽  
Author(s):  
Qihong Feng ◽  
Ronghao Cui ◽  
Sen Wang ◽  
Jin Zhang ◽  
Zhe Jiang

The diffusion coefficient of carbon dioxide (CO2), a significant parameter describing the mass transfer process, exerts a profound influence on the safety of CO2 storage in depleted reservoirs, saline aquifers, and marine ecosystems. However, experimental determination of the diffusion coefficient in a CO2-brine system is time-consuming and complex because the procedure requires sophisticated laboratory equipment and reasonable interpretation methods. To facilitate the acquisition of more accurate values, an intelligent model, termed MKSVM-GA, is developed using a hybrid technique of support vector machine (SVM), mixed kernels (MK), and genetic algorithm (GA). As confirmed by the statistical evaluation indicators, our proposed model exhibits excellent performance with high accuracy and strong robustness over a wide range of temperatures (273–473.15 K), pressures (0.1–49.3 MPa), and viscosities (0.139–1.950 mPa·s). Our results show that the proposed model is more applicable than the artificial neural network (ANN) model at this sample size, which in turn is superior to four commonly used traditional empirical correlations. The technique presented in this study can provide a fast and precise prediction of CO2 diffusivity in brine at reservoir conditions for engineering design and technical risk assessment during CO2 injection.
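The abstract does not say which kernels are mixed, so the following is only a sketch of the general idea: a convex combination of a local kernel (RBF) and a global kernel (polynomial) is assumed here because it is a common construction. The function names and the parameters `weight`, `gamma`, `degree`, and `coef0` are hypothetical; in a hybrid like the one described, a genetic algorithm would plausibly tune such parameters.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def poly_kernel(x, z, degree, coef0=1.0):
    """Polynomial kernel: (<x, z> + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree

def mixed_kernel(x, z, weight, gamma, degree):
    """Assumed mixed kernel: weight * RBF + (1 - weight) * polynomial."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * rbf_kernel(x, z, gamma) + (1.0 - weight) * poly_kernel(x, z, degree)

x = [1.0, 2.0]
print(mixed_kernel(x, x, weight=0.5, gamma=0.5, degree=2))  # 18.5
```

A convex combination of two valid (positive semidefinite) kernels is itself a valid kernel, which is why this form can be passed to an SVM solver that accepts custom kernels.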


2021 ◽  
Vol 7 (2) ◽  
pp. 203-206
Author(s):  
Herag Arabian ◽  
Verena Wagner-Hartl ◽  
Knut Moeller

Facial emotion recognition (FER) is a topic that has gained interest over the years for its role in bridging the gap between human and machine interactions. This study explores the potential of real-time FER modelling, to be integrated in a closed-loop system, to help in the treatment of children suffering from Autism Spectrum Disorder (ASD). The aim of this study is to show the differences between implementing traditional machine learning and deep learning approaches for FER modelling. Two classification approaches were taken. The first approach was based on classic machine learning techniques using Histogram of Oriented Gradients (HOG) for feature extraction, with a k-Nearest Neighbor and a Support Vector Machine model as classifiers. The second approach uses transfer learning based on the popular “AlexNet” neural network architecture. The performance of the approaches was assessed by the accuracy on randomly selected validation sets after training on random training sets of the Oulu-CASIA database. The data analyzed show that traditional machine learning methods are as effective as deep neural net models and are a good compromise between accuracy, extracted features, computational speed and costs.


Author(s):  
Alok Kumar Shukla ◽  
Pradeep Singh ◽  
Manu Vardhan

The explosion of high-dimensional datasets in scientific repositories has encouraged interdisciplinary research on data mining, pattern recognition and bioinformatics. The fundamental problem for an individual Feature Selection (FS) method is to extract informative features for a classification model and to detect malignant disease at low computational cost. In addition, existing FS approaches overlook the fact that for a given cardinality, there can be several subsets with similar information. This paper introduces a novel hybrid FS algorithm, called Filter-Wrapper Feature Selection (FWFS), for classification problems and also addresses the limitations of existing methods. In the proposed model, the front-end filter ranking method, Conditional Mutual Information Maximization (CMIM), selects the high-ranked feature subset, while the succeeding method, a Binary Genetic Algorithm (BGA), accelerates the search in identifying the significant feature subsets. One of the merits of the proposed method is that, unlike an exhaustive method, it speeds up the FS procedure without loss of classification accuracy on the reduced dataset when a learning model is applied to the selected subsets of features. The efficacy of the proposed FWFS method is examined with a Naive Bayes (NB) classifier, which works as the fitness function. The effectiveness of the selected feature subset is evaluated using numerous classifiers on five biological datasets and five UCI datasets of varied dimensionality and number of instances. The experimental results show that the proposed method significantly reduces the number of features and outperforms the existing methods. For the microarray datasets, the lowest classification accuracy is 61.24% on the SRBCT dataset and the highest accuracy is 99.32% on Diffuse large B-cell lymphoma (DLBCL). For the UCI datasets, the lowest classification accuracy is 40.04% on Lymphography using k-nearest neighbor (k-NN) and the highest classification accuracy is 99.05% on Ionosphere using support vector machine (SVM).
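The two-stage idea described in this abstract (a filter ranks features, then a binary genetic algorithm searches subsets of the survivors) can be sketched as follows. This is a toy illustration under loud assumptions: the filter scores are made-up numbers rather than CMIM values, the fitness function is a hypothetical stand-in for the Naive Bayes accuracy used in the paper, and the GA operators (tournament selection, one-point crossover, bit-flip mutation, elitism) are common defaults, not the authors' settings.

```python
import random

def top_k_by_filter(scores, k):
    """Filter stage: keep the indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def binary_ga(n_bits, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Wrapper stage: evolve 0/1 masks; returns (best_mask, best_fitness_history)."""
    rng = random.Random(seed)

    def tournament(population):
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        nxt = [best[:]]                          # elitism: carry the best mask over unchanged
        while len(nxt) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randrange(1, n_bits)       # one-point crossover
            child = p1[:cut] + p2[cut:]
            nxt.append([1 - g if rng.random() < p_mut else g for g in child])
        pop = nxt
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history

# Hypothetical filter scores for 8 features; keep the 5 best for the GA stage.
filter_scores = [0.9, 0.1, 0.8, 0.05, 0.7, 0.6, 0.2, 0.3]
kept = top_k_by_filter(filter_scores, k=5)       # [0, 2, 4, 5, 7]

informative = {0, 2, 4}                          # hypothetical ground truth for the toy fitness

def toy_fitness(mask):
    """Stand-in for classifier accuracy: reward informative features, penalize extras."""
    chosen = {kept[i] for i, bit in enumerate(mask) if bit}
    return len(chosen & informative) - 0.1 * len(chosen - informative)

best_mask, history = binary_ga(len(kept), toy_fitness)
print([kept[i] for i, bit in enumerate(best_mask) if bit])
```

Shrinking the search space with the filter first is what keeps the wrapper affordable: the GA explores masks over 5 bits here instead of 8, and in a real run each fitness call would train and score a classifier.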


Molecules ◽  
2020 ◽  
Vol 25 (6) ◽  
pp. 1442 ◽  
Author(s):  
Tao Shen ◽  
Hong Yu ◽  
Yuan-Zhong Wang

Gentiana is one of the largest genera of Gentianoideae; most of its species have potential pharmaceutical value and are used in local traditional medical treatment. Because of the phytochemical diversity and the differences in bioactive compounds among species, it is crucial to accurately identify authentic Gentiana species. In this paper, the feasibility of using infrared spectroscopy combined with chemometric analysis to identify Gentiana and its related species was studied. A total of 180 batches of raw spectral fingerprints were obtained from 18 species of Gentiana and Tripterospermum by near-infrared (NIR: 10,000–4000 cm−1) and Fourier transform mid-infrared (MIR: 4000–600 cm−1) spectroscopy. Firstly, principal component analysis (PCA) was utilized to explore the natural grouping of the 180 samples. Secondly, random forests (RF), support vector machine (SVM), and K-nearest neighbors (KNN) models were built using the full spectra (1487 NIR variables and 1214 FT-MIR variables, respectively). Based on the results for the calibration sets and prediction sets, the MIR-SVM model had a higher classification accuracy than the other models. Five feature selection strategies, VIP (variable importance in the projection), Boruta, GARF (genetic algorithm combined with random forest), GASVM (genetic algorithm combined with support vector machine), and Venn diagram calculation, were then used to reduce the number of variables for modeling. As a result, 101 NIR and 73 FT-MIR bands were selected as the feature variables, respectively. Thirdly, stacking models were built based on the optimal spectral dataset. Most of the stacking models performed better than the full spectra-based models. RF and SVM (as base learners), combined with the SVM meta-classifier, was the optimal stacked generalization strategy. For the SG-Ven-MIR-SVM model, the accuracy (ACC) of the calibration set and validation set were both 100%. Sensitivity (SE), specificity (SP), efficiency (EFF), Matthews correlation coefficient (MCC), and Cohen’s kappa coefficient (K) were all 1, which showed that the model had the optimal authenticity identification performance. Those parameters indicated that stacked generalization combined with feature selection is probably an important technique for improving the predictive accuracy of the classification model and avoiding overfitting. The study result can provide a valuable reference for the safety and effectiveness of the clinical application of medicinal Gentiana.
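The evaluation indices named in this abstract can be computed from a confusion matrix. The sketch below handles the binary (one class versus the rest) case only, which is a simplification of the 18-species problem; the function name `binary_metrics` is hypothetical, and efficiency is assumed to be the geometric mean of sensitivity and specificity, a definition common in chemometrics but not stated in the abstract.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """ACC, SE, SP, EFF, MCC and Cohen's kappa from binary confusion counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    n = tp + fp + tn + fn
    se = tp / (tp + fn)                       # sensitivity (true positive rate)
    sp = tn / (tn + fp)                       # specificity (true negative rate)
    acc = (tp + tn) / n
    eff = math.sqrt(se * sp)                  # assumed definition of efficiency
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Agreement expected by chance, from the row and column marginals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (acc - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    return {"ACC": acc, "SE": se, "SP": sp, "EFF": eff, "MCC": mcc, "K": kappa}

# A perfect one-vs-rest split of 180 samples (10 in the target class): every index equals 1.
print(binary_metrics(tp=10, fp=0, tn=170, fn=0))

# An imperfect classifier for comparison: ACC = SE = SP = 0.8, MCC = K = 0.6.
print(binary_metrics(tp=40, fp=10, tn=40, fn=10))
```

The second call shows why MCC and kappa are reported alongside accuracy: both discount the agreement that would occur by chance, so they sit below the raw accuracy whenever the classifier is not perfect.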


2020 ◽  
Vol 24 (5) ◽  
pp. 1141-1160
Author(s):  
Tomás Alegre Sepúlveda ◽  
Brian Keith Norambuena

In this paper, we apply sentiment analysis methods in the context of the first round of the 2017 Chilean elections. The purpose of this work is to estimate the voting intention associated with each candidate in order to contrast this with the results from classical methods (e.g., polls and surveys). The data are collected from Twitter, because of its high usage in Chile and in the sentiment analysis literature. We obtained tweets associated with the three main candidates: Sebastián Piñera (SP), Alejandro Guillier (AG) and Beatriz Sánchez (BS). For each candidate, we estimated the voting intention and compared it to the traditional methods. To do this, we first acquired the data and labeled the tweets as positive or negative. Afterward, we built a model using machine learning techniques. The classification model had an accuracy of 76.45% using support vector machines, which yielded the best model for our case. Finally, we use a formula to estimate the voting intention from the number of positive and negative tweets for each candidate. For the last period, we obtained a voting intention of 35.84% for SP, compared to a range of 34–44% according to traditional polls and 36% in the actual elections. For AG we obtained an estimate of 37%, compared with a range of 15.40% to 30.00% for traditional polls and 20.27% in the elections. For BS we obtained an estimate of 27.77%, compared with the range of 8.50% to 11.00% given by traditional polls and an actual result of 22.70% in the elections. These results are promising, in some cases providing an estimate closer to reality than traditional polls. Some differences can be explained due to the fact that some candidates have been omitted, even though they held a significant number of votes.


Geosciences ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 265
Author(s):  
Stefan Rauter ◽  
Franz Tschuchnigg

The classification of soils into categories with a similar range of properties is a fundamental geotechnical engineering procedure. At present, this classification is based on various types of cost- and time-intensive laboratory and/or in situ tests. These soil investigations are essential for each individual construction site and have to be performed prior to the design of a project. Since Machine Learning could play a key role in reducing the costs and time needed for a suitable site investigation program, the basic ability of Machine Learning models to classify soils from Cone Penetration Tests (CPT) is evaluated. To find an appropriate classification model, 24 different Machine Learning models, based on three different algorithms, are built and trained on a dataset consisting of 1339 CPT. The applied algorithms are a Support Vector Machine, an Artificial Neural Network and a Random Forest. As input features, different combinations of direct cone penetration test data (tip resistance qc, sleeve friction fs, friction ratio Rf, depth d), combined with “defined”, thus, not directly measured data (total vertical stresses σv, effective vertical stresses σ’v and hydrostatic pore pressure u0), are used. Standard soil classes based on grain size distributions and soil classes based on soil behavior types according to Robertson are applied as targets. The different models are compared with respect to their prediction performance and the required learning time. The best results for all targets were obtained with models using a Random Forest classifier. For the soil classes based on grain size distribution, an accuracy of about 75%, and for soil classes according to Robertson, an accuracy of about 97–99%, was reached.
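The input features listed in this abstract follow from standard soil-mechanics relations: the friction ratio is Rf = fs / qc × 100%, the hydrostatic pore pressure is the unit weight of water times the depth below the water table, and the effective vertical stress is σ'v = σv − u0. The sketch below assumes a single homogeneous layer with a uniform soil unit weight; the function name, parameter names, and default values are hypothetical and do not come from the paper.

```python
def cpt_features(depth_m, qc_mpa, fs_kpa,
                 unit_weight_kn_m3=19.0, water_table_m=0.0, gamma_w_kn_m3=9.81):
    """Derive per-reading CPT features for one depth (homogeneous-layer assumption)."""
    rf_percent = 100.0 * fs_kpa / (qc_mpa * 1000.0)       # friction ratio, fs/qc in %
    sigma_v = unit_weight_kn_m3 * depth_m                 # total vertical stress, kPa
    u0 = gamma_w_kn_m3 * max(depth_m - water_table_m, 0)  # hydrostatic pore pressure, kPa
    sigma_v_eff = sigma_v - u0                            # effective vertical stress, kPa
    return {"qc": qc_mpa, "fs": fs_kpa, "Rf": rf_percent, "d": depth_m,
            "sigma_v": sigma_v, "sigma_v_eff": sigma_v_eff, "u0": u0}

# One reading at 10 m depth with the water table at 2 m (illustrative values).
row = cpt_features(depth_m=10.0, qc_mpa=5.0, fs_kpa=50.0, water_table_m=2.0)
print(row["Rf"], row["sigma_v"])   # 1.0 190.0
```

Each dictionary corresponds to one feature vector; stacking such rows over all depth increments of a sounding would give the table on which a classifier is trained.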


2014 ◽  
Vol 989-994 ◽  
pp. 1873-1876
Author(s):  
Yu Zhen Xie ◽  
Zhao Gang Wang ◽  
Xiao Wei Dai

Using a genetic algorithm to optimize the parameters of a support vector machine model is an effective way to obtain more accurate parameters. This paper analyzes the principle of support vector machines for regression, together with methods for selecting and adjusting the kernel function, the kernel parameters, and the penalty factor. Because genetic algorithms are effective at solving optimization problems, the paper proposes a method that uses a genetic algorithm to optimize the support vector machine parameters by minimizing the cross-validation error. The simulation results demonstrate the effectiveness of this method.
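A real-coded genetic algorithm for this kind of parameter search might look like the sketch below. Everything specific here is an assumption: the search runs over log2 of the penalty factor C and log2 of an RBF kernel width gamma, the bounds are illustrative, the operators (tournament selection, blend crossover, Gaussian mutation, elitism) are generic choices, and `surrogate_cv_error` is a hypothetical smooth stand-in for the cross-validation error that a real run would obtain by training the SVM.

```python
import random

def ga_minimize(objective, bounds, pop_size=20, generations=40, sigma=0.3, seed=0):
    """Real-coded GA with elitism; returns (best_point, best_error_history)."""
    rng = random.Random(seed)

    def clip(v, lo, hi):
        return max(lo, min(hi, v))

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=objective)
    history = [objective(best)]
    for _ in range(generations):
        nxt = [best[:]]                                   # elitism
        while len(nxt) < pop_size:
            p1 = min(rng.sample(pop, 2), key=objective)   # binary tournament
            p2 = min(rng.sample(pop, 2), key=objective)
            w = rng.random()                              # blend crossover weight
            child = [w * a + (1 - w) * b + rng.gauss(0, sigma) for a, b in zip(p1, p2)]
            nxt.append([clip(v, lo, hi) for v, (lo, hi) in zip(child, bounds)])
        pop = nxt
        best = min(pop, key=objective)
        history.append(objective(best))
    return best, history

def surrogate_cv_error(params):
    """Hypothetical stand-in for cross-validation error, minimal at C=2**3, gamma=2**-5."""
    log2_c, log2_gamma = params
    return (log2_c - 3.0) ** 2 + (log2_gamma + 5.0) ** 2

bounds = [(-5.0, 15.0), (-15.0, 3.0)]                     # illustrative log2 ranges
best, history = ga_minimize(surrogate_cv_error, bounds)
c, gamma = 2.0 ** best[0], 2.0 ** best[1]
print(c, gamma, history[-1])
```

Searching in log2 space is a design choice that lets one mutation step size cover parameters spanning several orders of magnitude. Replacing `surrogate_cv_error` with a function that trains the SVM and returns its k-fold error would turn the sketch into the kind of procedure the abstract describes.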


2020 ◽  
Vol 13 (1-2) ◽  
pp. 43-52
Author(s):  
Boudewijn van Leeuwen ◽  
Zalán Tobak ◽  
Ferenc Kovács

Classification of multispectral optical satellite data using machine learning techniques to derive land use/land cover thematic data is important for many applications. Comparing the latest algorithms, our research aims to determine the best option to classify land use/land cover, with special focus on temporarily inundated land in a flat area in the south of Hungary. These inundations disrupt agricultural practices and can cause large financial losses. Sentinel 2 data with a high temporal and medium spatial resolution are classified using open source implementations of a random forest, a support vector machine and an artificial neural network. Each classification model is applied to the same data set and the results are compared qualitatively and quantitatively. The accuracy of the results is high for all methods and does not show large overall differences. A quantitative spatial comparison demonstrates that the neural network gives the best results, but that all models are strongly influenced by atmospheric disturbances in the image.

