boosting method
Recently Published Documents


TOTAL DOCUMENTS

163
(FIVE YEARS 87)

H-INDEX

13
(FIVE YEARS 4)

2022 ◽  
Vol 10 (4) ◽  
pp. 617-623
Author(s):  
Silvia Elsa Suryana ◽  
Budi Warsito ◽  
Suparti Suparti

Telemarketing is a form of marketing conducted by telephone. Banks can use telemarketing to offer products such as term deposits. One of the most important strategies for successful telemarketing is selecting potential customers so that campaigns are effective. Machine learning can be used to predict the success of telemarketing. Gradient boosting is a machine learning method built on decision trees. It combines many classification trees, each of which improves on the previous trees. The optimal classification result depends on optimal hyperparameters. Hyperopt is a Python library that tunes hyperparameters effectively because it uses Bayesian optimization. Hyperopt uses prior distributions over the hyperparameters to search for optimal values. The data in this study include 20 independent variables and a binary dependent variable with the classes ‘yes’ and ‘no’. The study showed that gradient boosting reached a classification accuracy of up to 90.39%, a precision of 94.91%, and an AUC of 0.939. These values indicate that the gradient boosting method can predict both classes, ‘yes’ and ‘no’, relatively accurately.
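The idea of trees that successively improve on earlier trees can be illustrated with a minimal sketch. It assumes one-split regression stumps as the trees, log-loss as the objective, and a tiny made-up dataset; none of these details come from the study, and the Hyperopt tuning step is not shown. All function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Least-squares regression stump: (feature, threshold, left_value, right_value)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds lie midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]

def stump_predict(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv

def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the log-loss gradient (y - p) and adds it to the score."""
    pos = sum(y) / len(y)
    f0 = math.log(pos / (1.0 - pos))  # initial score: log-odds of the positive class
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return f0, stumps, learning_rate

def predict_proba(model, row):
    f0, stumps, lr = model
    return sigmoid(f0 + lr * sum(stump_predict(s, row) for s in stumps))

# Made-up toy data: two features, binary target (1 = 'yes', 0 = 'no').
X = [[1, 5], [2, 4], [3, 7], [4, 1], [6, 6], [7, 3], [8, 8], [9, 2]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_gradient_boosting(X, y)
predictions = [1 if predict_proba(model, row) > 0.5 else 0 for row in X]
```

The values n_rounds and learning_rate are examples of the hyperparameters a tuning library would search over.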


2021 ◽  
Vol 11 (2) ◽  
pp. 16-24
Author(s):  
Furkan Kayım ◽  
Atınç Yılmaz

In ancient times, trade was carried out by barter. With the use of money and similar means, the concept of financial instruments emerged. Financial instruments are tools and documents used in the economy. They include foreign exchange rates, securities, cryptocurrencies, indices and funds. Many methods are used to forecast financial instruments. These methods include technical analysis, fundamental analysis, forecasts based on variables and formulas, time-series algorithms and artificial intelligence algorithms. This study examines the importance of artificial intelligence algorithms in financial instrument forecasting. Since financial instruments are used as a means of investment and trade by all sections of society, namely individuals, families, institutions, and states, knowing their future behavior is highly important. Financial instrument forecasting can bring benefits such as increased income and welfare, more economical adjustment of maturities, creation of large finances, minimization of risks, wider distribution of ownership, and more balanced income distribution. In this study, financial instruments are forecast by applying Long Short Term Memory (LSTM), Recurrent Neural Network (RNN), Convolutional Neural Network (CNN) and Autoregressive Integrated Moving Average (ARIMA) algorithms and the Ensemble Classification Boosting Method. One forecast is carried out by creating a network comprising the LSTM and RNN algorithms, with an LSTM layer and an RNN output layer. With the ensemble classification boosting method, a new method was applied that gives a more successful result than the forecasts of the other algorithms. At the conclusion of the study, the forecast results of the alternative algorithms were compared with each other and the algorithm that gave the most successful forecast was suggested.
The success rate of the forecasts was increased by comparing results across different time intervals and training data sets. Furthermore, the new method developed using the ensemble classification boosting method yielded a more successful result than the best individual algorithm.


Author(s):  
Guangyuan Zhao ◽  
Yi Jiang ◽  
Shuo Li ◽  
Susan Tighe

Pavement friction has been identified as crucial in traffic safety. Since the Highway Safety Manual prediction algorithm is often based on crash frequency, the crash severity distribution might be assumed unchanged before and after the countermeasure. However, pavement surface treatments can improve the friction to different levels, so crash severity outcomes may vary greatly. To explore the implicit effects of pavement friction on vehicle crash severity, this paper first validates the performance of the extreme gradient boosting (XGBoost) model and then employs Shapley additive explanations interaction values to interpret individual features and the nonlinear interactions among predictors. Under various scenarios, the XGBoost output probabilities are converted into dynamic crash severity distributions. Results also indicate that friction becomes more significant when the friction number is less than 38, and immediate corrective actions are needed when the friction number is below 20.


Mathematics ◽  
2021 ◽  
Vol 9 (24) ◽  
pp. 3172
Author(s):  
Zeeshan Hameed ◽  
Waheed Ur Rehman ◽  
Wakeel Khan ◽  
Nasim Ullah ◽  
Fahad R. Albogamy

Parkinson’s disease (PD) is a progressive and long-term neurodegenerative disorder of the central nervous system. Studies have shown that 90% of PD subjects have voice impairments, which are among the vital characteristics of PD patients and have been widely used for diagnostic purposes. However, the curse of dimensionality, high aliasing, redundancy, and small sample size in PD speech data make classifying PD subjects very challenging. Feature reduction can efficiently solve these issues. However, existing feature reduction algorithms ignore high aliasing, noise, and the stability of algorithms, and thus fail to give substantial classification accuracy. To mitigate these problems, this study proposes a weighted hybrid feature reduction technique embedded with ensemble learning, which comprises (1) a hybrid feature reduction technique that increases inter-class variance, reduces intra-class variance, preserves the neighborhood structure of the data, and removes correlated features that cause high aliasing and noise in classification; (2) a weighted-boosting method to train the model precisely; and (3) a bagging strategy that enhances the stability of the algorithm. The experiments were performed on three different datasets, including two widely used datasets and a dataset provided by Southwest Hospital (Army Military Medical University), Chongqing, China. The experimental results indicated that, compared with existing feature reduction methods, the proposed algorithm always shows the highest accuracy, precision, recall, and G-mean for PD speech data. Moreover, the proposed algorithm not only shows excellent classification performance but also handles imbalanced data precisely and achieved the highest AUC in most cases. In addition, compared with state-of-the-art algorithms, the proposed method shows an improvement of up to 4.53%.
In the future, this algorithm can be used for early and differential diagnoses, which are rated as challenging tasks.


BMJ Open ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. e051925
Author(s):  
Clifford Silver Tarimo ◽  
Soumitra S Bhuyan ◽  
Quanman Li ◽  
Michael Johnson J Mahande ◽  
Jian Wu ◽  
...  

Objectives: We aimed at identifying the important variables for labour induction intervention and assessing the predictive performance of machine learning algorithms. Setting: We analysed the birth registry data from a referral hospital in northern Tanzania. Since July 2000, every birth at this facility has been recorded in a specific database. Participants: 21 578 deliveries between 2000 and 2015 were included. Deliveries that lacked information regarding the labour induction status were excluded. Primary outcome: Deliveries involving labour induction intervention. Results: Parity, maternal age, body mass index, gestational age and birth weight were all found to be important predictors of labour induction. Boosting method demonstrated the best discriminative performance (area under curve, AUC=0.75: 95% CI (0.73 to 0.76)) while logistic regression presented the least (AUC=0.71: 95% CI (0.70 to 0.73)). Random forest and boosting algorithms showed the highest net-benefits as per the decision curve analysis. Conclusion: All of the machine learning algorithms performed well in predicting the likelihood of labour induction intervention. Further optimisation of these classifiers through hyperparameter tuning may result in an improved performance. Extensive research into the performance of other classifier algorithms is warranted.


2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Xichao Dai ◽  
Yumei Ding

To improve the accuracy of the evaluation results of multiperception intelligent wearable devices, mathematical statistical features based on speech, behavior, environment, and physical signs are proposed. First, the PCA feature compression algorithm was used to reduce the dimension of these features, and the differences among training samples were compared and analyzed. Then, three weak classifiers were designed using the logistic regression algorithm. Finally, a strong classifier with higher prediction accuracy was designed according to the boosting decision fusion method and the idea of ensemble learning. The results showed that the accuracy of the logistic regression model trained with the voice PCA feature data was 0.964, but the recall rate and crossover results were significantly lower, at 0.844 and 0.846, respectively. The accuracy and recall of the decision fusion model based on the boosting method and ensemble learning are 0.969, and the prediction accuracy of K-fold cross-validation is as high as 0.956; the superposition fusion of the three weak classifiers achieves a better classification effect.
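How several weak classifiers can be fused into a stronger one may be sketched with an AdaBoost-style weighted vote. The abstract does not state the exact fusion rule, so the AdaBoost weighting here is an assumption. The three weak classifiers are stood in for by hard-coded, hypothetical label vectors rather than trained logistic regression models.

```python
import math

def adaboost_fuse(weak_predictions, y):
    """Return one vote weight (alpha) per weak classifier, AdaBoost style.

    weak_predictions: list of label lists in {-1, +1}, one list per classifier.
    y: true labels in {-1, +1}.
    """
    n = len(y)
    w = [1.0 / n] * n  # start with uniform sample weights
    alphas = []
    for preds in weak_predictions:
        # Weighted error of this classifier under the current sample weights.
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        alphas.append(alpha)
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * p * t) for wi, p, t in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return alphas

def fused_predict(alphas, votes):
    """Weighted vote of the weak classifiers for a single sample."""
    score = sum(a * v for a, v in zip(alphas, votes))
    return 1 if score >= 0 else -1

# Hypothetical outputs of three weak classifiers on six samples.
y = [1, 1, 1, -1, -1, -1]
h1 = [-1, 1, 1, -1, -1, -1]  # wrong on sample 0
h2 = [1, 1, 1, 1, -1, -1]    # wrong on sample 3
h3 = [1, 1, 1, -1, -1, 1]    # wrong on sample 5
alphas = adaboost_fuse([h1, h2, h3], y)
fused = [fused_predict(alphas, votes) for votes in zip(h1, h2, h3)]
```

In this toy case each weak classifier makes one mistake, but no two make the same mistake, so the weighted vote recovers every label.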


2021 ◽  
Vol 11 (22) ◽  
pp. 10562
Author(s):  
Raymond Ghandour ◽  
Albert Jose Potams ◽  
Ilyes Boulkaibet ◽  
Bilel Neji ◽  
Zaher Al Barakeh

Distraction while driving occurs when a driver is engaged in non-driving activities. These activities reduce the driver’s attention and focus on the road, thereby increasing the risk of accidents. As a consequence, the number of accidents increases and infrastructure is damaged. Cars are now equipped with different safety features intended to ensure driver awareness and attention at all times. The first step for such systems is to determine whether the driver is distracted or not. Different methods have been proposed to detect such distractions, but they lack efficiency when tested in real-life situations. In this paper, four machine learning classification methods are implemented and compared to identify drivers’ behavior and distraction situations based on real data corresponding to different behaviors such as aggressive, drowsy and normal. The data were randomized for a better application of the methods. We demonstrate that the gradient boosting method outperforms the other classifiers tested.


2021 ◽  
Vol 2072 (1) ◽  
pp. 012005
Author(s):  
M Sumanto ◽  
M A Martoprawiro ◽  
A L Ivansyah

Machine learning is an artificial intelligence approach in which a system learns automatically from experience without being explicitly programmed. The learning process starts from observing the data and then looking for patterns in the data. The main purpose of this process is to make computers learn automatically. In this study, we use machine learning to predict molecular atomization energy. Of the various machine learning methods, we use two, namely Neural Network and Extreme Gradient Boosting. Both methods have several parameters that must be adjusted so that the predicted atomization energy of the molecule has the lowest possible error. We try to find the right parameter values for both methods. For the neural network method, it is quite difficult to find the right parameter values because training the model takes a long time before it is known whether the model is good or bad, while for the Extreme Gradient Boosting method the training time is shorter, so it is quite easy to find the right parameter values for the model. This study also looked at the effects of modifying the dataset: transforming the output by normalization and standardization, removing molecules containing Br atoms, and changing an entry in the Coulomb matrix to 0 if the distance between the atoms in the molecule exceeds 2 angstroms.
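The Coulomb matrix modification can be sketched as follows. The sketch assumes the commonly used Coulomb matrix definition (diagonal 0.5·Z^2.4, off-diagonal Z_i·Z_j / distance). The study's exact conventions, including its units, are not given in the abstract. The function name, the cutoff parameter and the three-atom geometry are hypothetical, and distances are kept in angstroms purely for simplicity.

```python
import math

def coulomb_matrix(charges, coords, cutoff=None):
    """Coulomb matrix with an optional distance cutoff.

    charges: nuclear charges Z_i.
    coords: 3D positions, one tuple per atom.
    cutoff: if given, off-diagonal entries are set to 0 whenever the
            interatomic distance exceeds this value.
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                d = math.dist(coords[i], coords[j])
                if cutoff is None or d <= cutoff:
                    matrix[i][j] = charges[i] * charges[j] / d
                # otherwise the entry stays 0.0
    return matrix

# Made-up linear arrangement, for illustration only.
charges = [6, 1, 1]
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
full = coulomb_matrix(charges, coords)
truncated = coulomb_matrix(charges, coords, cutoff=2.0)
```

Here atoms 0 and 2 are 3 units apart, so their entry is zeroed in the truncated matrix, while the pair at exactly 2 units is kept because the distance does not exceed the cutoff.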


2021 ◽  
Author(s):  
Mu Yue

In high-dimensional data, penalized regression is often used for variable selection and parameter estimation. However, these methods typically require time-consuming cross-validation to select tuning parameters and retain more false positives under high dimensionality. This chapter discusses sparse boosting based machine learning methods for the following high-dimensional problems. First, a sparse boosting method to select important biomarkers is studied for right-censored survival data with high-dimensional biomarkers. Then, a two-step sparse boosting method to carry out variable selection and model-based prediction is studied for high-dimensional longitudinal observations measured repeatedly over time. Finally, a multi-step sparse boosting method to identify patient subgroups that exhibit different treatment effects is studied for high-dimensional dense longitudinal observations. This chapter intends to solve the problem of how to improve the accuracy and calculation speed of variable selection and parameter estimation in high-dimensional data. It aims to expand the application scope of sparse boosting and develop new methods of high-dimensional survival analysis, longitudinal data analysis, and subgroup analysis, which have great application prospects.
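How boosting can perform variable selection may be illustrated with plain componentwise L2 boosting for linear regression. This is a simpler relative of sparse boosting, not the chapter's method: sparse boosting adds a complexity penalty to the selection criterion, and the chapter targets survival and longitudinal data. The function name and the toy data are hypothetical.

```python
def componentwise_l2_boost(X, y, n_steps=20, nu=0.5):
    """Componentwise L2 boosting: each step updates only the single
    predictor that most reduces the residual sum of squares."""
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    residual = [yi - intercept for yi in y]
    coef = [0.0] * p
    cols = [[row[j] for row in X] for j in range(p)]
    for _ in range(n_steps):
        best_j, best_b, best_gain = 0, 0.0, -1.0
        for j, col in enumerate(cols):
            xx = sum(v * v for v in col)
            if xx == 0.0:
                continue  # skip constant-zero columns
            xr = sum(v * r for v, r in zip(col, residual))
            gain = xr * xr / xx  # RSS reduction from fitting column j alone
            if gain > best_gain:
                best_j, best_b, best_gain = j, xr / xx, gain
        # Shrunken update (step size nu) of the selected coefficient only.
        coef[best_j] += nu * best_b
        residual = [r - nu * best_b * v for r, v in zip(residual, cols[best_j])]
    return intercept, coef

# Made-up data: y depends on columns 0 and 1 only; column 2 is irrelevant.
X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
y = [3.5, -2.5, 2.5, -3.5]  # equals 3 * x0 + 0.5 * x1
intercept, coef = componentwise_l2_boost(X, y)
```

Predictors that are never selected keep a coefficient of exactly zero, which is how the procedure doubles as variable selection; stopping early limits how many predictors enter the model.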


2021 ◽  
Vol 12 (7) ◽  
pp. 358-372
Author(s):  
E. V. Orlova ◽  

The article considers the problem of reducing banks' credit risks associated with the insolvency of individual borrowers, using financial and socio-economic factors together with additional data on the borrowers' digital footprint. A critical analysis of existing approaches, methods and models in this area has been carried out, and a number of significant shortcomings that limit their application have been identified. There is no comprehensive approach to assessing a borrower's creditworthiness based on information that includes data from social networks and search engines. A new methodological approach is proposed for assessing the borrower's risk profile, based on the phased processing of quantitative and qualitative data and on modeling with methods of statistical analysis and machine learning. Machine learning methods are used to solve clustering and classification problems. They make it possible to determine the data structure automatically and make decisions through flexible and local training on the data. Hierarchical clustering and the k-means method are used to identify similar social, anthropometric and financial indicators, as well as indicators characterizing the digital footprint of borrowers, and to determine the borrowers' risk profile by group. The resulting homogeneous groups of borrowers, each with a unique risk profile, are then used for detailed data analysis in the predictive classification model. The classification model is based on the stochastic gradient boosting method and predicts the risk profile of a potential borrower. The suggested approach to assessing individuals' creditworthiness will reduce the bank's credit risks and increase its stability and profitability. The implementation results are of practical importance.
A comparative analysis of the effectiveness of the existing and the proposed methodologies for assessing credit risk showed that the new methodology provides predictive analytics of heterogeneous information about a potential borrower and that the accuracy of the analytics is higher. The proposed techniques form the core of a decision support system for justifying individuals' credit conditions and minimizing the aggregate credit risks.

