Domain adaptive boosting method and its applications

Abstract: In this work, a new approach to weather model output postprocessing is presented. The adaptive boosting algorithm is used to train a set of simple base classifiers with historical data from weather model output, surface synoptic observation (SYNOP) messages, and lightning data. The resulting overall method can then be used to classify weather model output to identify potential thunderstorms. The method generates a certainty measure between −1 and 1, describing how likely a thunderstorm is to occur. Using a threshold, the measure can be converted to a binary decision. When compared to a linear discriminant and a method currently employed in an expert system from the German Weather Service, boosting achieves the best validation scores. A substantial improvement of the probability of detection of up to 72% and a decrease of the false alarm rate down to 34% can be achieved for the identification of thunderstorms in model analysis. Independent of the verification results, the method has several useful properties: good cross-validation results, short learning time (≤10 min sequential run time for the experiments on a standard PC), comprehensible inner values of the underlying statistical analysis, and the simplicity of adding predictors to a running system. The paper concludes with a set of possible other applications and extensions to the presented example of thunderstorm detection.
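
To make the boosting scheme in this abstract concrete, here is a minimal sketch of discrete AdaBoost with one-feature threshold classifiers (decision stumps) in plain Python. It is not the paper's implementation: the two predictors, the toy data, and all function names are hypothetical. Dividing the weighted vote by the sum of the classifier weights is one assumed way to obtain a certainty measure between −1 and 1; a threshold then turns it into a binary decision.

```python
import math

def stump_predict(row, feature, thr, polarity):
    """Base classifier: vote +1/-1 depending on one feature and one threshold."""
    return polarity if row[feature] >= thr else -polarity

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = None
    for feature in range(len(X[0])):
        for thr in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, feature, thr, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, thr, polarity)
    return best

def adaboost_fit(X, y, rounds=10):
    """Return a list of (alpha, feature, threshold, polarity) tuples."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    model = []
    for _ in range(rounds):
        err, feature, thr, polarity = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feature, thr, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, thr, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def certainty(model, row):
    """Weighted vote scaled by the sum of alphas, so the result lies in [-1, 1]."""
    total = sum(alpha for alpha, *_ in model)
    vote = sum(alpha * stump_predict(row, f, t, p) for alpha, f, t, p in model)
    return vote / total

def classify(model, row, threshold=0.0):
    """Binary decision: True means 'thunderstorm' in this toy setting."""
    return certainty(model, row) > threshold

# Hypothetical toy data: two scaled predictors per case, label +1 = thunderstorm.
X = [[0.1, 0.2], [0.2, 0.9], [0.3, 0.4], [0.7, 0.3], [0.8, 0.8], [0.9, 0.5]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost_fit(X, y, rounds=5)
decisions = [classify(model, row) for row in X]
```

Raising `threshold` above 0 makes the decision more conservative (fewer positive calls), which is the usual lever for trading detections against false alarms.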

IDENTIFICATION OF AREAS OF CORONAVIRUS COVID-19 INCIDENCE SPREADING BASED ON CLUSTER ANALYSIS METHOD

Innovative technologies and scientific solutions for industries, 2021, pp. 5-13. DOI: 10.30837/itssi.2021.15.005

Author(s): Kseniia Bazilevych, Ievgen Meniailov, Dmytro Chumachenko

Keyword(s): Neural Network, Neural Networks, Cluster Analysis, Data Analysis, Gradient Descent, Descent Method, Gradient Descent Method, Adaptive Boosting, Software Product, Boosting Method

Subject: the use of neural network methods to scientifically substantiate anti-epidemic measures, so that effective management decisions can reduce disease incidence.

Purpose: to apply neural-network-based cluster analysis to the problem of identifying areas of incidence.

Tasks: to analyze data analysis methods for solving the clustering problem; to develop a neural network method for clustering the territory of Ukraine according to the nature of the COVID-19 epidemic process; and, on the basis of the developed method, to implement a data analysis software product that identifies areas of disease incidence, using COVID-19 as the example.

Methods: models and methods of data analysis; models and methods of systems theory (based on the information approach); machine learning methods, in particular the Adaptive Boosting method (based on the gradient descent method); and methods for training neural networks.

Results: the study used data from the Center for Public Health of the Ministry of Health of Ukraine, broken down by region of Ukraine, on COVID-19 incidence, the number of persons examined in laboratories, the number of laboratory tests performed by PCR and ELISA methods, and the number of laboratory tests for IgA, IgM, and IgG. The model used data from March 2020 to December 2020 and did not take into account data from the temporarily occupied territories of Ukraine. For the cluster analysis, a neural network was built with 60 input neurons, 100 hidden neurons with a Fermi (logistic sigmoid) activation function, and 4 output neurons. The model was implemented in Python.

Conclusions: methods for constructing neural networks were analyzed, as were methods for training them, including the use of the gradient descent method for the Adaptive Boosting method. All theoretical material described in the work was used to implement a software product for processing COVID-19 test data for Ukraine. The regions of Ukraine were divided into zones of COVID-19 infection, and a map of this division was presented.
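
The network shape given in the results (60 inputs, 100 hidden neurons with a Fermi activation, 4 outputs) can be sketched as a forward pass in plain Python. This is only an illustration under stated assumptions: the weights are random and untrained, the input vector is made up, and the softmax on the output layer is an assumption, since the abstract does not say how the four outputs are turned into a cluster assignment.

```python
import math
import random

def fermi(x):
    """Fermi (logistic sigmoid) function: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, weights, biases):
    """Affine map: one weighted sum plus bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(z):
    """Assumed output normalisation; subtracts the max for numerical stability."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(x, hidden, output):
    """60 inputs -> 100 Fermi hidden units -> 4 output scores."""
    h = [fermi(v) for v in dense(x, *hidden)]
    return softmax(dense(h, *output))

rng = random.Random(0)
hidden = init_layer(60, 100, rng)
output = init_layer(100, 4, rng)

# Hypothetical, already scaled feature vector for one region.
x = [rng.random() for _ in range(60)]
scores = forward(x, hidden, output)
cluster = scores.index(max(scores))   # index of the strongest output neuron
```

With trained weights, `cluster` would be the zone assigned to the region; here it only demonstrates the data flow through the three layers.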

Breast Cancer Biomarker Prediction Model Based on Principal Component Extraction and Deep Convolutional Network Integration Learning

E3S Web of Conferences, 2020, Vol 185, pp. 04028. DOI: 10.1051/e3sconf/202018504028

Author(s): Kun Ruan, Yuhao Peng, Yuhan Kang, Shun Zhao, Tanke Wang, ...

Keyword(s): Breast Cancer, Cancer Patients, Expression Profiles, Principal Component, Copy Number Variations, Sequencing Data, Convolutional Network, Adaptive Boosting, Boosting Method, Component Extraction

Effective extraction of characteristic information from the sequencing data of cancer patients is an essential task in cancer research. Several prognostic classification models for breast cancer sequencing data have been established to assist patients in their treatment. However, these models still suffer from problems such as poor robustness and low precision. Building on convolutional network models from deep learning, we construct a new classifier, PCA-1D LeNet-Ada (PLA), that combines principal component extraction, a LeNet convolutional network, and the Adaptive Boosting method. PLA predicts three biomarkers for breast cancer patients based on their somatic cell copy number variations and gene expression profiles.

Classification of Skin Sensitizers on the Basis of Their Effective Concentration 3 Values by Using Adaptive Boosting Method

International Journal of Digital Content Technology and its Applications, 2010, Vol 4 (2), pp. 109-121. DOI: 10.4156/jdcta.vol4.issue2.13

Author(s): Zhengjun Cheng, Yuntao Zhang, Changhong Zhou, Wenjun Zhang, Shibo Gao

Keyword(s): Effective Concentration, Adaptive Boosting, Boosting Method

SCLAP: An Adaptive Boosting Method for Predicting Subchloroplast Localization of Plant Proteins

OMICS: A Journal of Integrative Biology, 2013, Vol 17 (2), pp. 106-115. DOI: 10.1089/omi.2012.0070. Cited By ~ 15

Author(s): Vijayakumar Saravanan, P.T.V. Lakshmi

Keyword(s): Plant Proteins, Adaptive Boosting, Boosting Method

GRADIENT BOOSTING METHOD APPLICATION TO SUPPORT PROCESS DECISIONS IN THE ELECTRON-BEAM WELDING PROCESS

Siberian Journal of Science and Technology, 2020, Vol 21 (2), pp. 206-214. DOI: 10.31772/2587-6066-2020-21-2-206-214

Author(s): V. S. Tynchenko, I. A. Golovenok, V. E. Petrenko, A. V. Milov, ...

Keyword(s): Electron Beam, Electron Beam Welding, Welding Process, Gradient Boosting, Boosting Method

Plant Recognition Using Morphological Feature Extraction and Transfer Learning over SVM and AdaBoost

Symmetry, 2021, Vol 13 (2), pp. 356. DOI: 10.3390/sym13020356

Author(s): Shubham Mahajan, Akshay Raina, Xiao-Zhi Gao, Amit Kant Pandit

Keyword(s): Feature Extraction, Transfer Learning, Plant Species, Species Recognition, Morphological Features, Support Vector, Additional Advantage, Adaptive Boosting, Vast Number, Axis Length

Plant species recognition from visual data has always been a challenging task for Artificial Intelligence (AI) researchers because of complications such as the enormous amount of data to be processed, a consequence of the vast number of floral species. Many parts of a plant can serve as feature sources for an AI-based model, but features from leaves are considered more significant for the task than those from other parts such as flowers and stems, primarily because leaves are easily accessible. With this in mind, we propose a plant species recognition model based on morphological features extracted from leaf images, using the support vector machine (SVM) with the adaptive boosting technique. The proposed framework comprises pre-processing, feature extraction, and classification into one of the species. Morphological features such as centroid, major axis length, minor axis length, solidity, perimeter, and orientation are extracted from digital images of various categories of leaves. In addition, transfer learning, as suggested by some previous studies, is used in the feature extraction process. Classifiers such as kNN, decision trees, and the multilayer perceptron (with and without AdaBoost) are applied to the open-source FLAVIA dataset to verify the robustness of our approach in comparison with other classifier frameworks. Our study also demonstrates the additional advantage of 10-fold cross-validation over other dataset partitioning strategies, achieving a precision rate of 95.85%.
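
The 10-fold cross-validation mentioned in this abstract partitions the samples into ten folds and uses each fold once as the test set while training on the other nine. The sketch below shows only that partitioning step in plain Python; the function name, the sample count of 50, and the fixed shuffle seed are illustrative assumptions and say nothing about how the authors split FLAVIA.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]     # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# Hypothetical dataset of 50 leaf images, identified by index only.
splits = list(k_fold_indices(50, k=10))
```

Each classifier would be fitted on `train` and scored on `test` for every pair, and the ten scores averaged to give the reported figure.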

A Machine Learning Method for Predicting Vegetation Indices in China

Remote Sensing, 2021, Vol 13 (6), pp. 1147. DOI: 10.3390/rs13061147

Author(s): Xiangqian Li, Wenping Yuan, Wenjie Dong

Keyword(s): Machine Learning, Growing Season, Crop Growth, Spatiotemporal Distribution, Coefficient Of Determination, Gradient Boosting, Severe Drought, Vegetation Growth, Extreme Gradient Boosting, Boosting Method

To forecast the terrestrial carbon cycle and monitor food security, vegetation growth must be accurately predicted; however, current process-based ecosystem and crop-growth models are limited in their effectiveness. This study developed a machine learning model using the extreme gradient boosting method to predict vegetation growth throughout the growing season in China from 2001 to 2018. The model used satellite-derived vegetation data for the first month of each growing season, CO₂ concentration, and several meteorological factors as data sources for the explanatory variables. Results showed that the model could reproduce the spatiotemporal distribution of vegetation growth as represented by the satellite-derived normalized difference vegetation index (NDVI). The predictive error for the growing season NDVI was less than 5% for more than 98% of vegetated areas in China; the model represented seasonal variations in NDVI well. The coefficient of determination (R²) between the monthly observed and predicted NDVI was 0.83, and more than 69% of vegetated areas had an R² > 0.8. The effectiveness of the model was examined for a severe drought year (2009), and results showed that the model could reproduce the spatiotemporal distribution of NDVI even under extreme conditions. This model provides an alternative method for predicting vegetation growth and has great potential for monitoring vegetation dynamics and crop growth.
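
For readers unfamiliar with the two quantities this abstract reports, the sketch below computes NDVI from near-infrared and red reflectance and a coefficient of determination between observed and predicted values. The NDVI formula is the standard one; the R² shown is the common 1 − SS_res/SS_tot definition, which is an assumption, since the abstract does not say which definition the authors used. All sample values are made up.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical monthly NDVI values for one pixel.
observed = [0.30, 0.45, 0.60, 0.70, 0.55]
predicted = [0.32, 0.43, 0.62, 0.66, 0.57]
r2 = r_squared(observed, predicted)
```

An R² near 1 means the predictions explain almost all of the variance in the observed series; a perfect prediction gives exactly 1.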

Machine Learning Approach for Predicting Lane-Change Maneuvers using the SHRP2 Naturalistic Driving Study Data

Transportation Research Record: Journal of the Transportation Research Board, 2021, pp. 036119812110035. DOI: 10.1177/03611981211003581

Author(s): Anik Das, Mohamed M. Ahmed

Keyword(s): Machine Learning, Prediction Accuracy, Machine Learning Algorithms, Support Vector, Lane Change, Adaptive Boosting, Extreme Gradient Boosting, Naturalistic Driving Study, Naturalistic Driving, Change Prediction

Accurate real-time lane-change prediction is essential for safely operating Autonomous Vehicles (AVs) on roadways, especially in the early stage of AV deployment, when AVs will interact with human-driven vehicles. This study proposed reliable lane-change prediction models considering features from vehicle kinematics, machine vision, driver, and roadway geometric characteristics using the trajectory-level SHRP2 Naturalistic Driving Study and Roadway Information Database. Several machine learning algorithms were trained, validated, tested, and comparatively analyzed on six different sets of features, including Classification And Regression Trees (CART), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), K Nearest Neighbor (KNN), and Naïve Bayes (NB). In each feature set, relevant features were selected through a wrapper-based algorithm named Boruta. The results showed that, considering all features, the XGBoost model outperformed all other models, with the highest overall prediction accuracy (97%) and F1-score (95.5%). However, the highest overall prediction accuracy of 97.3% and F1-score of 95.9% were observed in the XGBoost model based on vehicle kinematics features. Moreover, XGBoost was the only model that achieved a reliable and balanced prediction performance across all six feature sets. Furthermore, a simplified XGBoost model was developed for each feature set with practical implementation in mind. The proposed prediction model could help in trajectory planning for AVs and could be used to develop more reliable advanced driver assistance systems (ADAS) in a cooperative connected and automated vehicle environment.
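
The two headline metrics in this abstract, overall accuracy and F1-score, can be computed from predicted and true labels as sketched below. This is a binary simplification (1 = lane change, 0 = no lane change) chosen for illustration; the abstract does not state how many classes the study used, and the labels here are invented.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels for eight driving segments.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Accuracy counts all correct predictions, while F1 balances precision and recall on the positive class, which is why the two figures can differ when lane changes are much rarer than non-events.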
