An Improved Hybrid Model base on SVM and Random Forest for the Prediction of Corporate Taxation

Author(s):  
Hui Xu ◽  
Mingyang Ma
2020 ◽  
Vol 12 (6) ◽  
pp. 2218 ◽  
Author(s):  
Binh Thai Pham ◽  
Chongchong Qi ◽  
Lanh Si Ho ◽  
Trung Nguyen-Thoi ◽  
Nadhir Al-Ansari ◽  
...  

Determining the shear strength of soil is essential in civil engineering for foundation design, earth and rock fill dam design, highway and airfield design, the stability of slopes and cuts, and the design of coastal structures. In this study, a novel hybrid soft computing model (RF-PSO) combining random forest (RF) and particle swarm optimization (PSO) was developed and used to estimate the undrained shear strength of soil from the clay content (%), moisture content (%), specific gravity (%), void ratio (%), liquid limit (%), and plastic limit (%). Experimental results for 127 soil samples from the Hai Phong-Thai Binh national highway project in Vietnam were used to generate datasets for training and validating the models. The Pearson correlation coefficient (R) was used to evaluate the proposed model and compare it with a single RF model. The results show that the proposed hybrid model (RF-PSO) achieved high accuracy (R = 0.89) in predicting the shear strength of soil. Validation also indicated that the RF-PSO model (R = 0.89, root mean square error (RMSE) = 0.453) is superior to the single RF model without optimization (R = 0.87, RMSE = 0.48). Thus, the proposed hybrid model (RF-PSO) can be used for accurate estimation of shear strength to support the design of civil engineering structures.
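The abstract does not say which part of the random forest PSO optimizes. The sketch below assumes the common setup in which PSO searches over RF hyperparameters to minimize a validation error. The function `validation_error`, the two hyperparameters, their bounds, and the PSO coefficients are all hypothetical; in a real pipeline the objective would train an RF and return its validation RMSE.

```python
import random

def validation_error(n_trees, max_depth):
    # Hypothetical stand-in for "train an RF with these settings and
    # return its validation RMSE"; a smooth bowl with its minimum at (200, 8).
    return ((n_trees - 200.0) / 100.0) ** 2 + ((max_depth - 8.0) / 4.0) ** 2

def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(*position) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(*p) for p in pos]      # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Move and clip to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(*pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val

# Search n_trees in [10, 500] and max_depth in [1, 20] (assumed ranges).
best_params, best_error = pso(validation_error, bounds=[(10.0, 500.0), (1.0, 20.0)])
```

Integer hyperparameters such as tree count would be rounded before training; the continuous search here only illustrates the particle update rule.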


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7796
Author(s):  
Tao Hu ◽  
Yuman Sun ◽  
Weiwei Jia ◽  
Dandan Li ◽  
Maosheng Zou ◽  
...  

We performed a comparative analysis of the prediction accuracy of machine learning methods and ordinary Kriging (OK) hybrid methods for forest volume models based on multi-source remote sensing data combined with ground survey data. Taking Larix olgensis, Pinus koraiensis, and Pinus sylvestris plantations in Mengjiagang forest farms as the research object, and using the Chinese Academy of Forestry LiDAR, charge-coupled device, and hyperspectral (CAF-LiTCHy) integrated system, we extracted the visible vegetation index, texture features, terrain factors, and point cloud feature variables. Random forest (RF), support vector regression (SVR), and an artificial neural network (ANN) were used to estimate forest volume. At small spatial scales, the estimate of sample plot volume is influenced by the surrounding environment as well as by neighboring observations. OK interpolation was therefore applied to the residuals of these three machine learning models to construct new hybrid forest volume estimation models called random forest Kriging (RFK), support vector regression Kriging (SVRK), and artificial neural network Kriging (ANNK). The six estimation models of forest volume were tested using the leave-one-out (Loo) cross-validation method. All six models performed well, with leave-one-out R² (R²Loo) values above 0.6, and the hybrid models all improved on their base models to different extents. Among the six models, the RFK hybrid model had the best prediction effect, with an R²Loo reaching 0.915. Therefore, the machine learning method based on multi-source remote sensing factors is useful for forest volume estimation; in particular, the hybrid model constructed by combining machine learning and the OK method greatly improved the accuracy of forest volume estimation. It thus provides a fast and effective method for the remote sensing inversion estimation of forest volume and facilitates the management of forest resources.
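The hybrid structure described above (a machine learning prediction plus a spatially interpolated residual) can be sketched as follows. To keep the example short and dependency-free, inverse distance weighting is swapped in for ordinary Kriging; it is not the interpolator used in the paper. The plot coordinates, volumes, and function names are hypothetical.

```python
import math

def idw_residual(point, coords, residuals, power=2.0):
    """Interpolate a residual at `point` by inverse distance weighting.

    Stand-in for ordinary Kriging: both return the known residual
    exactly at a sampled location, but the weights are computed differently.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for coord, residual in zip(coords, residuals):
        distance = math.dist(point, coord)
        if distance == 0.0:
            return residual  # exactly on a sample plot
        weight = 1.0 / distance ** power
        weighted_sum += weight * residual
        weight_total += weight
    return weighted_sum / weight_total

def hybrid_predict(point, ml_prediction, coords, residuals):
    """Hybrid estimate = ML prediction + interpolated residual."""
    return ml_prediction + idw_residual(point, coords, residuals)

# Hypothetical sample plots: location, observed volume, and ML-predicted volume.
plot_coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
observed_volume = [120.0, 150.0, 100.0]
ml_volume = [110.0, 155.0, 104.0]
residuals = [obs - pred for obs, pred in zip(observed_volume, ml_volume)]

# Correct an ML prediction of 130.0 at an unsampled location.
corrected = hybrid_predict((5.0, 0.0), 130.0, plot_coords, residuals)
```

For leave-one-out testing, each plot would be removed in turn from `plot_coords` and `residuals` before predicting at its location.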


2021 ◽  
Vol 6 (1) ◽  
pp. 295-309
Author(s):  
Daniel Vassallo ◽  
Raghavendra Krishnamurthy ◽  
Harindra J. S. Fernando

Abstract. Machine learning is quickly becoming a commonly used technique for wind speed and power forecasting. Many machine learning methods utilize exogenous variables as input features, but there remains the question of which atmospheric variables are most beneficial for forecasting, especially in handling non-linearities that lead to forecasting error. This question is addressed via creation of a hybrid model that utilizes an autoregressive integrated moving-average (ARIMA) model to make an initial wind speed forecast followed by a random forest model that attempts to predict the ARIMA forecasting error using knowledge of exogenous atmospheric variables. Variables conveying information about atmospheric stability and turbulence as well as inertial forcing are found to be useful in dealing with non-linear error prediction. Streamwise wind speed, time of day, turbulence intensity, turbulent heat flux, vertical velocity, and wind direction are found to be particularly useful when used in unison for hourly and 3 h timescales. The prediction accuracy of the developed ARIMA–random forest hybrid model is compared to that of the persistence and bias-corrected ARIMA models. The ARIMA–random forest model is shown to improve upon the latter commonly employed modeling methods, reducing hourly forecasting error by up to 5 % below that of the bias-corrected ARIMA model and achieving an R² value of 0.84 with true wind speed.
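The two-stage idea (a first-stage forecast, then a second model that predicts the first stage's error from exogenous variables) can be illustrated with a deliberately simplified sketch. Here the first-stage forecasts are given as plain numbers rather than produced by ARIMA, and a per-hour mean error table stands in for the random forest, using time of day as the only exogenous variable. All data values and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fit_error_model(hours, first_stage_forecasts, observed):
    """Learn the mean first-stage error for each hour of day.

    Stand-in for the random forest error model: it maps one exogenous
    variable (hour) to an expected forecasting error.
    """
    errors_by_hour = defaultdict(list)
    for hour, forecast, actual in zip(hours, first_stage_forecasts, observed):
        errors_by_hour[hour].append(actual - forecast)
    return {hour: mean(errors) for hour, errors in errors_by_hour.items()}

def hybrid_forecast(hour, first_stage_forecast, error_model):
    """Final forecast = first-stage forecast + predicted error."""
    # Hours never seen in training get no correction.
    return first_stage_forecast + error_model.get(hour, 0.0)

# Hypothetical training history (wind speed in m/s).
train_hours = [0, 12, 0, 12]
train_forecasts = [5.0, 7.0, 6.0, 8.0]
train_observed = [5.5, 6.0, 6.5, 7.0]

error_model = fit_error_model(train_hours, train_forecasts, train_observed)
midnight = hybrid_forecast(0, 6.0, error_model)
noon = hybrid_forecast(12, 8.0, error_model)
```

Replacing the lookup table with a regressor trained on several atmospheric variables gives the structure the abstract describes, where the second stage can capture non-linear error patterns.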


Author(s):  
Zannatul Ferdoush ◽  
Booshra Nazifa Mahmud ◽  
Amitabha Chakrabarty ◽  
Jia Uddin

In the deregulated electric industry, load forecasting is in greater demand than ever to support applications such as energy generation, pricing decisions, resource procurement, and infrastructure development. This paper presents a hybrid machine learning model for short-term load forecasting (STLF) that applies random forest and bidirectional long short-term memory to gain the benefits of both methods. In the experimental evaluation, we used a Bangladeshi electricity consumption dataset covering 36 months. The paper provides a comparative study between the proposed hybrid model and state-of-the-art models using performance metrics, loss analysis, and prediction plots. Empirical results demonstrate that the hybrid model outperforms the standard long short-term memory and bidirectional long short-term memory models, producing more accurate forecasts.


2019 ◽  
Vol 8 (4) ◽  
pp. 5054-5058

Malicious threats are best known for the damage they cause. This damage is not limited to the system; it can also lead to significant information loss, and threats are responsible for financial loss as well. As technology advances, the types of threats and attacks also increase. Although the research community has investigated a number of cyber attack prevention models, detecting threats and protecting data from them remains challenging for industry. Detecting attacks with an intrusion detection system (IDS) is common and popular in organizations. Nowadays, data mining and hybrid approaches combined with IDS are gaining priority in anomaly and attack detection. In this paper, we focus on designing a tool based on the signature approach and the random forest algorithm for intrusion detection that offers data security and protection. Each algorithm works individually in an IDS, but the signature-based algorithm is limited by its need for a database of known signatures. We propose a hybrid intrusion detection model that filters intrusions twice by implementing combined signature-based and behavior-based algorithms in one system. This paper addresses the various kinds of features and behaviors of threats and their different functioning. The hybrid intrusion detection model extends the simple individual models, which work on either behavior or signature alone.
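The double filtration described above might be organized as a signature check followed by a behavior check. The sketch below is only an illustration of that control flow: the signature strings, event fields, threshold, and the toy behavior model are all hypothetical, and the toy model stands in for a trained random forest that returns an attack probability.

```python
# Hypothetical database of known attack signatures.
KNOWN_SIGNATURES = ("' OR 1=1", "../../etc/passwd")

def matches_signature(payload, signatures=KNOWN_SIGNATURES):
    """First filter: flag payloads containing a known signature."""
    return any(signature in payload for signature in signatures)

def classify(event, behavior_model, threshold=0.5):
    """Run the signature filter, then the behavior filter."""
    if matches_signature(event["payload"]):
        return "blocked: signature"
    # Second filter: any callable mapping features to an attack probability,
    # for example a wrapper around a trained random forest.
    if behavior_model(event["features"]) >= threshold:
        return "blocked: behavior"
    return "allowed"

def toy_behavior_model(features):
    # Stand-in for a trained random forest's predicted attack probability.
    return 0.9 if features["failed_logins"] > 5 else 0.1

events = [
    {"payload": "user=admin' OR 1=1", "features": {"failed_logins": 0}},
    {"payload": "user=alice", "features": {"failed_logins": 12}},
    {"payload": "user=bob", "features": {"failed_logins": 1}},
]
verdicts = [classify(event, toy_behavior_model) for event in events]
```

Passing the behavior model as a callable keeps the signature stage independent of how the second stage is trained, so traffic that matches no known signature can still be caught by its behavior.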


Author(s):  
Tamilarasi Suresh ◽  
Tsehay Admassu Assegie ◽  
Subhashni Rajkumar ◽  
Napa Komal Kumar

Heart disease is one of the most widespread and deadliest diseases in the world. In this study, we propose a hybrid model for heart disease prediction that employs random forest and support vector machine. With random forest, iterative feature elimination is carried out to select heart disease features that improve the predictive outcome of the support vector machine. An experiment conducted on a test set indicates that the proposed hybrid model performs better than an individual random forest or support vector machine. Overall, we have developed a more accurate and computationally efficient model for heart disease prediction, with an accuracy of 98.3%. Moreover, an experiment is conducted to analyze the effect of the regularization parameter (C) and gamma on the performance of the support vector machine. The experimental results reveal that the support vector machine is very sensitive to C and gamma.
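The iterative feature elimination step can be sketched as a loop that repeatedly drops the least important remaining feature. The abstract gives no details of the procedure, so this is an assumed form: `importance_fn` is a hypothetical hook that, in practice, would refit a random forest on the remaining features and return its importances. The feature names and scores below are invented for illustration.

```python
def iterative_feature_elimination(features, importance_fn, n_keep):
    """Drop the weakest feature one at a time until n_keep remain."""
    remaining = list(features)
    while len(remaining) > n_keep:
        # Re-score on every pass, since importances shift as features are removed.
        scores = importance_fn(remaining)
        weakest = min(remaining, key=lambda name: scores[name])
        remaining.remove(weakest)
    return remaining

# Hypothetical fixed scores standing in for random forest importances.
FIXED_SCORES = {"age": 0.30, "chol": 0.05, "thalach": 0.25, "fbs": 0.02, "cp": 0.38}

def toy_importance(remaining):
    return {name: FIXED_SCORES[name] for name in remaining}

selected = iterative_feature_elimination(
    ["age", "chol", "thalach", "fbs", "cp"], toy_importance, n_keep=3
)
```

The selected features would then be the inputs used to train the support vector machine, whose C and gamma are tuned separately.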

