Improvement of Machine Learning Models’ Performances based on Ensemble Learning for the detection of Alzheimer Disease

X.509 certificates play an important role in encrypting data transmitted between the two parties of an HTTPS connection. With the popularization of X.509 certificates, more and more criminals use certificates to keep their communications from being exposed by tools that analyze malicious traffic; phishing sites and malware are good examples. X.509 certificates found in phishing sites or malware are called malicious X.509 certificates. This paper applies different machine learning models, including classical machine learning models, ensemble learning models, and deep learning models, to distinguish between malicious and benign certificates with Verification for Extraction (VFE). VFE is a system we designed and implemented to obtain a rich set of certificate characteristics. The results show that ensemble learning models are the most stable and efficient, with an average accuracy of 95.9%, which outperforms many previous works. In addition, we obtain an SVM-based detection model with the highest accuracy, 98.2%. The outcome indicates that VFE is capable of capturing the essential characteristics of malicious X.509 certificates.

A Survey of Different Machine Learning Models for Alzheimer Disease Prediction

International Journal of Emerging Trends in Engineering Research ◽

10.30534/ijeter/2020/73872020 ◽

2020 ◽

Vol 8 (7) ◽

pp. 3328-3337

Author(s):

Ragavamsi Davuluri

Keyword(s):

Machine Learning ◽

Alzheimer Disease ◽

Disease Prediction ◽

Learning Models ◽

Machine Learning Models

A Comparative Analysis of Novel Deep Learning and Ensemble Learning Models to Predict the Allergenicity of Food Proteins

Foods ◽

10.3390/foods10040809 ◽

2021 ◽

Vol 10 (4) ◽

pp. 809

Author(s):

Liyang Wang ◽

Dantong Niu ◽

Xinjie Zhao ◽

Xiaoya Wang ◽

Mengzhen Hao ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Ensemble Learning ◽

Food Allergen ◽

Gradient Boosting ◽

Learning Models ◽

Food Proteins ◽

Deep Model ◽

Allergen Identification ◽

Machine Learning Models

Traditional food allergen identification relies mainly on in vivo and in vitro experiments, which are often time-consuming and costly. Artificial intelligence (AI)-driven rapid food allergen identification has addressed some of these drawbacks and is becoming an efficient auxiliary tool. To overcome the lower accuracy of traditional machine learning models in predicting the allergenicity of food proteins, this work introduces a deep learning model (a transformer with a self-attention mechanism) and ensemble learning models (represented by Light Gradient Boosting Machine (LightGBM) and eXtreme Gradient Boosting (XGBoost)). To highlight the advantages of the proposed method, the study also selected various commonly used machine learning models as baseline classifiers. The results of 5-fold cross-validation showed that the area under the receiver operating characteristic curve (AUC) of the deep model was the highest (0.9578), better than the ensemble learning and baseline algorithms. However, the deep model needs to be pre-trained, and its training time is the longest. Comparing the characteristics of the transformer model and the boosting models shows that each has its own advantages, which provides novel clues and inspiration for the rapid prediction of food allergens in the future.
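
The evaluation protocol named in this abstract (AUC averaged over 5-fold cross-validation) can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `auc_score`, `k_fold_indices`, and `cross_validated_auc`, the contiguous fold layout, and the `train_fn` callback are hypothetical and are not taken from the paper, which does not describe its implementation here.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_auc(train_fn, X, y, k=5):
    """Train on k-1 folds, score the held-out fold, return the mean AUC.

    train_fn(X_train, y_train) is assumed to return a callable that maps
    one input to a score; it stands in for any real model.
    """
    aucs = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        fold_scores = [model(X[i]) for i in test_idx]
        aucs.append(auc_score([y[i] for i in test_idx], fold_scores))
    return sum(aucs) / len(aucs)
```

Each held-out fold must contain both classes for the AUC to be defined, so real data would normally be shuffled or stratified before splitting.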

An Adaptive Deep Ensemble Learning Method for Dynamic Evolving Diagnostic Task Scenarios

Diagnostics ◽

10.3390/diagnostics11122288 ◽

2021 ◽

Vol 11 (12) ◽

pp. 2288

Author(s):

Kaixiang Su ◽

Jiao Wu ◽

Dongxiao Gu ◽

Shanlin Yang ◽

Shuyuan Deng ◽

...

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Model Performance ◽

Optimal Number ◽

Training Data ◽

Learning Method ◽

Learning Models ◽

Proposed Model ◽

Public Datasets ◽

Machine Learning Models

Increasingly, machine learning methods have been applied to aid in diagnosis with good results. However, some complex models can confuse physicians because they are difficult to understand, while data differences across diagnostic tasks and institutions can cause model performance to fluctuate. To address these challenges, we combined the Deep Ensemble Model (DEM) and the tree-structured Parzen Estimator (TPE) and proposed an adaptive deep ensemble learning method (TPE-DEM) for dynamic evolving diagnostic task scenarios. Unlike previous research that focuses on achieving better performance with a fixed-structure model, our proposed model uses TPE to efficiently aggregate simple models that are more easily understood by physicians and require less training data. In addition, our proposed model can choose the optimal number of layers and the type and number of base learners to achieve the best performance in different diagnostic task scenarios, based on the data distribution and characteristics of the current diagnostic task. We tested our model on one dataset constructed with a partner hospital and five UCI public datasets with different characteristics and volumes, covering various diagnostic tasks. Our performance evaluation results show that the proposed model outperforms the baseline models on the different datasets. Our study provides a novel approach for simple and understandable machine learning models in tasks with variable datasets and feature sets, and the findings have important implications for the application of machine learning models in computer-aided diagnosis.

Combining Machine Learning Models Using combo Library

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i09.7111 ◽

2020 ◽

Vol 34 (09) ◽

pp. 13648-13649

Author(s):

Yue Zhao ◽

Xuejian Wang ◽

Cheng Cheng ◽

Xueying Ding

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Ensemble Learning ◽

Academic Research ◽

Learning Models ◽

Model Combination ◽

Code Coverage ◽

Industry Applications ◽

Python Package ◽

Machine Learning Models

Model combination, often regarded as a key sub-field of ensemble learning, has been widely used in both academic research and industry applications. To facilitate this process, we propose and implement an easy-to-use Python toolkit, combo, to aggregate models and scores under various scenarios, including classification, clustering, and anomaly detection. In a nutshell, combo provides a unified and consistent way to combine both raw and pretrained models from popular machine learning libraries, e.g., scikit-learn, XGBoost, and LightGBM. With accessibility and robustness in mind, combo is designed with detailed documentation, interactive examples, continuous integration, code coverage, and maintainability checks; it can be installed easily through the Python Package Index (PyPI) or from https://github.com/yzhao062/combo.
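
To make the idea of score-level model combination concrete, here is a minimal plain-Python sketch of two common rules, averaging and maximization. It is not combo's API: the function name `combine_scores` and its `method` argument are hypothetical, chosen only for illustration.

```python
def combine_scores(score_lists, method="average"):
    """Combine per-model scores sample by sample.

    score_lists: one list of scores per model, all of equal length.
    method: "average" (mean across models) or "maximum" (largest score).
    """
    if not score_lists:
        raise ValueError("need scores from at least one model")
    n_samples = len(score_lists[0])
    if any(len(scores) != n_samples for scores in score_lists):
        raise ValueError("all models must score the same samples")
    per_sample = list(zip(*score_lists))  # regroup: one tuple per sample
    if method == "average":
        return [sum(vals) / len(vals) for vals in per_sample]
    if method == "maximum":
        return [max(vals) for vals in per_sample]
    raise ValueError(f"unknown method: {method}")


# Two hypothetical models scoring the same two samples.
model_a = [0.25, 0.75]
model_b = [0.5, 0.25]
print(combine_scores([model_a, model_b], "average"))  # [0.375, 0.5]
print(combine_scores([model_a, model_b], "maximum"))  # [0.5, 0.75]
```

Averaging tends to smooth out individual model errors, while maximization favors the most confident model for each sample; which rule suits a task depends on how the base models' scores are calibrated.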

A Novel Ensemble Model on Defects Identification in Aero-Engine Blade

Processes ◽

10.3390/pr9060992 ◽

2021 ◽

Vol 9 (6) ◽

pp. 992

Author(s):

Yingkui Jiao ◽

Zhiwei Li ◽

Junchao Zhu ◽

Bin Xue ◽

Baofeng Zhang

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Learning Models ◽

Echo Signal ◽

Defect Identification ◽

Learning Classifiers ◽

Ultrasonic Echo ◽

Aero Engine ◽

Engine Blade ◽

Machine Learning Models

Machine learning-based defect identification has emerged as a promising way to improve the accuracy of defect identification in aero-engine blades. This approach uses machine learning classifiers to classify the types of defects, and these classifiers are trained on features extracted from ultrasonic echo signals. However, current studies show that the number of features needed to identify defects, such as statistical values, exceeds what a single ultrasonic echo signal can offer. This necessitates multiple acquisitions of the echo signal and increases manual effort. Moreover, the features obtained from feature selection are sensitive to the characteristics of the classifier, which further increases the uncertainty of the classification result. This paper proposes an ensemble learning technique that is based on only a few features obtained from a single echo signal and still achieves an accuracy of defect identification as high as that of traditional machine learning, eliminating the need for multiple acquisitions of the echo signal. To this end, we apply two well-known ensemble learning classifiers and compare them with three widely used machine learning models on defect identification of blades. The results show that the proposed ensemble learning models outperform the machine learning-based models when given an equal number of features. In addition, the two-feature-based ensemble learning model reaches an accuracy close to that of machine learning models based on multiple statistical features, where the features are obtained from multiple collections of the signal.

Supervised ensemble learning methods towards automatically filtering Urdu fake news within social media

PeerJ Computer Science ◽

10.7717/peerj-cs.425 ◽

2021 ◽

Vol 7 ◽

pp. e425

Author(s):

Muhammad Pervez Akhter ◽

Jiangbin Zheng ◽

Farkhanda Afzal ◽

Hui Lin ◽

Saleem Riaz ◽

...

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

English Language ◽

Performance Metrics ◽

Area Under The Curve ◽

Absolute Error ◽

Fake News ◽

Learning Models ◽

Learning Methods ◽

Machine Learning Models

The popularity of the internet, smartphones, and social networks has contributed to the proliferation of misleading information such as fake news and fake reviews on news blogs, online newspapers, and e-commerce applications. Fake news has a worldwide impact and the potential to change political scenarios, deceive people in order to increase product sales, defame politicians or celebrities, and misguide visitors into avoiding a place or country. Therefore, it is vital to find automatic methods to detect fake news online. Several past studies focused on the English language, but resource-poor languages have been largely ignored because of the scarcity of labeled corpora. In this study, we investigate this issue in the Urdu language. Our contribution is threefold. First, we design an annotated corpus of Urdu news articles for the fake news detection task. Second, we explore three individual machine learning models to detect fake news. Third, we use five ensemble learning methods to combine the base predictors' predictions and improve the fake news detection system's overall performance. Our experimental results on two Urdu news corpora show the superiority of ensemble models over individual machine learning models. Three performance metrics (balanced accuracy, area under the curve, and mean absolute error) show that the Ensemble Selection and Vote models outperform the other machine learning and ensemble learning models.
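
Two of the metrics named in this abstract, balanced accuracy and mean absolute error, are simple enough to write out directly. The sketch below is a plain-Python illustration on made-up labels (1 = fake, 0 = real); the function names and the toy data are assumptions, not the authors' evaluation code.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally
    regardless of how many samples it has."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels for eight articles.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
print(balanced_accuracy(y_true, y_pred))    # 0.625
print(mean_absolute_error(y_true, y_pred))  # 0.375
```

With hard 0/1 labels the mean absolute error equals the misclassification rate; it becomes more informative when the predictions are probabilities.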

A Comparative Analysis of Novel Deep Learning Models and Ensemble Learning Models to Predict the Allergenicity of Food Allergens

10.1101/2021.03.10.434710 ◽

2021 ◽

Author(s):

Liyang Wang ◽

Dantong Niu ◽

Xinjie Zhao ◽

Xiaoya Wang ◽

Mengzhen Hao ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Ensemble Learning ◽

Food Allergen ◽

Learning Model ◽

Gradient Boosting ◽

Food Allergens ◽

Learning Models ◽

Allergen Identification ◽

Machine Learning Models

Traditional food allergen identification relies mainly on in vivo and in vitro experiments, which are often time-consuming and costly. Artificial intelligence (AI)-driven rapid food allergen identification has addressed these two drawbacks and is becoming an efficient auxiliary tool. To overcome the lower accuracy of traditional machine learning models in predicting the allergenicity of food allergens, this work introduces a transformer deep learning model with a self-attention mechanism and ensemble learning models (represented by Light Gradient Boosting Machine (LightGBM) and eXtreme Gradient Boosting (XGBoost)). To highlight the advantages of the proposed method, the study also selected various commonly used machine learning models as baseline classifiers. The results of 5-fold cross-validation showed that the AUC of the deep model was the highest (0.9400), better than the ensemble learning and baseline algorithms. However, it needed to be pre-trained, and its training cost was the highest. Comparing the characteristics of the transformer model and the boosting models shows that the two types of models each have their own advantages, which provides novel clues and inspiration for the rapid prediction of food allergens in the future.

The mathematics of erythema: Development of machine learning models for artificial intelligence assisted measurement and severity scoring of radiation induced dermatitis

10.1101/2021.09.24.21264011 ◽

2021 ◽

Author(s):

Rahul Ranjan ◽

Richard Partl ◽

Ricarda Erhart ◽

Nithin Kurup ◽

Harald Schnidar

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Ensemble Learning ◽

Sensitivity And Specificity ◽

Test Accuracy ◽

Learning Models ◽

Severity Grade ◽

Radiation Induced ◽

Machine Learning Models

Although significant advancements in computer-aided diagnostics using artificial intelligence (AI) have been made, to date no viable method for radiation-induced skin reaction (RISR) analysis and classification is available. The objective of this single-center study was to develop machine learning and deep learning approaches using deep convolutional neural networks (CNNs) for automatic classification of RISRs according to the Common Terminology Criteria for Adverse Events (CTCAE) grading system. Scarletred® Vision, a novel and state-of-the-art digital skin imaging method capable of remote monitoring and objective assessment of acute RISRs, was used to convert 2D digital skin images using the CIELAB color space and conduct SEV* measurements. A set of different machine learning and deep convolutional neural network-based algorithms was explored for the automatic classification of RISRs. A total of 2263 distinct images from 209 patients were analyzed for training and testing the machine learning and CNN algorithms. For a 2-class problem of healthy skin (grade 0) versus erythema (grade ≥ 1), all machine learning models produced an accuracy above 70%, and the sensitivity and specificity of erythema recognition were 67-72% and 72-83%, respectively. The CNN produced a test accuracy of 74%, sensitivity of 66%, and specificity of 83% for predicting healthy and erythema cases. For the severity grade prediction of a 3-class problem (grade 0 versus 1 versus 2), the test accuracy was 60-67%, and the sensitivity and specificity were 56-82%, 35-59%, and 65-72%, respectively. For estimating the severity grade of each class, the CNN obtained an accuracy of 73%, 66%, and 82%, respectively. Ensemble learning combines several individual predictions to obtain better generalization performance. Furthermore, we exploited ensemble learning by deploying a CNN model as a meta-learner. The ensemble CNN based on bagging and majority voting shows an accuracy, sensitivity, and specificity of 87%, 90%, and 82%, respectively, for a 2-class problem. For a 3-class problem, the ensemble CNN shows an overall accuracy of 66%, while for each grade (0, 1, and 2) the accuracies were 0.76, 0.69, and 0.87, the sensitivities were 0.70, 0.57, and 0.71, and the specificities were 0.78, 0.75, and 0.95, respectively. This study is the first to focus on erythema in radiation dermatitis and produces benchmark results using machine learning models. The outcome of this study validates that the proposed system can act as a pre-screening and decision support tool for oncologists or patients to provide fast, reliable, and efficient assessment of erythema grading.
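
The combination of bagging and majority voting mentioned in this abstract can be sketched generically. The code below is a plain-Python illustration, not the authors' pipeline: it uses arbitrary callables in place of CNNs, and the names `bootstrap_sample`, `majority_vote`, `bagged_predict`, `sensitivity_specificity`, and the `train_fn` callback are all hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (the resampling step of bagging)."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions):
    """Return the most common label per sample across models.

    predictions: one list of labels per model, all of equal length.
    An odd number of binary classifiers avoids ties.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def bagged_predict(train_fn, data, inputs, n_models=5, seed=0):
    """Train n_models on bootstrap samples and majority-vote their labels.

    train_fn(sample) is assumed to return a callable mapping an input
    to a label; it stands in for training one base model.
    """
    rng = random.Random(seed)
    models = [train_fn(bootstrap_sample(data, rng)) for _ in range(n_models)]
    return majority_vote([[model(x) for x in inputs] for model in models])


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)
```

Because each base model sees a different bootstrap sample, their errors are partly decorrelated, which is what lets the vote generalize better than a single model.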

Comparing machine learning and ensemble learning in the field of football

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v9i5.pp4321-4325 ◽

2019 ◽

Vol 9 (5) ◽

pp. 4321

Author(s):

Shuaib Khan ◽

Kirubanand V. B

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Learning Model ◽

Support Vector ◽

Learning Models ◽

Machine Learning Model ◽

Vector Machines ◽

The Right ◽

Enormous Number ◽

Machine Learning Models

Football has been one of the most popular and loved sports since its birth on November 6th, 1869. The main reason is that it is highly unpredictable in nature. Predicting the results of football matches seems like the perfect problem for machine learning models, but there are various caveats, such as picking the right features from an enormous number of available features. Many models have been applied to various football-related datasets. This paper compares Support Vector Machines, a machine learning model, with XGBoost, an ensemble learning model, and examines how ensemble learning can greatly improve the accuracy of the predictions.
