ensemble model
Recently Published Documents


TOTAL DOCUMENTS: 1009 (FIVE YEARS 626)

H-INDEX: 45 (FIVE YEARS 12)

2022 ◽  
Vol 204 ◽  
pp. 111960
Author(s):  
Zhihao Jin ◽  
Yiqun Ma ◽  
Lingzhi Chu ◽  
Yang Liu ◽  
Robert Dubrow ◽  
...  

2022 ◽  
Vol 40 (3) ◽  
pp. 1-28
Author(s):  
Yadong Zhu ◽  
Xiliang Wang ◽  
Qing Li ◽  
Tianjun Yao ◽  
Shangsong Liang

Mobile advertising has become one of the fastest-growing industries in the world. The influx of capital attracts a growing number of fraudsters who try to defraud advertisers. Fraudsters can leverage many techniques, of which bots install fraud is the most difficult to detect: it emulates normal users through sophisticated behavioral patterns and so evades detection rules defined by human experts. We previously proposed BotSpot for bots install fraud detection. However, BotSpot has some drawbacks, such as the sparsity of the devices' neighbors, weak interactive information of leaf nodes, and noisy labels. In this work, we propose BotSpot++ to address these drawbacks: (1) for the sparsity of the devices' neighbors, we construct a super device node that enriches the graph structure and information flow, using domain knowledge and a clustering algorithm; (2) for the weak interactive information, we incorporate a self-attention mechanism to enhance the interaction of the various leaf nodes; and (3) for the noisy labels, we apply a label smoothing mechanism to alleviate their effect. Comprehensive experimental results show that BotSpot++ yields the best performance compared with six state-of-the-art baselines. Furthermore, we deploy our model to the advertising platform of Mobvista, a leading global mobile advertising company. The online experiments also demonstrate the effectiveness of the proposed method.
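The label smoothing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which variant BotSpot++ uses, so the code below assumes the common uniform form, in which a fraction `epsilon` of the probability mass is spread evenly over all classes. The function names and the value `epsilon=0.1` are illustrative assumptions, not taken from the paper.

```python
import math


def smooth_labels(one_hot, epsilon=0.1):
    """Uniform label smoothing (assumed variant): keep 1 - epsilon of the
    mass on the given label and spread epsilon evenly over all classes."""
    n_classes = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / n_classes for y in one_hot]


def cross_entropy(target, probs):
    """Cross-entropy between a (possibly smoothed) target distribution
    and predicted class probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# Hypothetical binary example: class 1 stands for "fraud".
hard_target = [0.0, 1.0]
soft_target = smooth_labels(hard_target, epsilon=0.1)

# An overconfident prediction is penalised more under the smoothed target,
# which is why smoothing dampens the effect of mislabeled examples.
overconfident = [0.001, 0.999]
loss_hard = cross_entropy(hard_target, overconfident)
loss_soft = cross_entropy(soft_target, overconfident)
```

With a smoothed target, the loss no longer rewards pushing the predicted probability all the way to 1, so a wrongly labeled install pulls the model less strongly toward the wrong class.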


2022 ◽  
Vol 72 ◽  
pp. 103279
Author(s):  
S. Nanglia ◽  
Muneer Ahmad ◽  
Fawad Ali Khan ◽  
N.Z. Jhanjhi

2022 ◽  
Author(s):  
Carmelo Bonannella ◽  
Tomislav Hengl ◽  
Johannes Heisig ◽  
Leandro Parente ◽  
Marvin N Wright ◽  
...  

This paper describes a data-driven framework based on spatio-temporal ensemble machine learning to produce distribution maps for 16 forest tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra J. F. Arnold, Pinus pinea L., Pinus sylvestris L., Prunus avium L., Quercus cerris L., Quercus ilex L., Quercus robur L., Quercus suber L. and Salix caprea L.) at high spatial resolution (30 m). Tree occurrence data for a total of three million points were used to train different machine learning (ML) algorithms: random forest, gradient-boosted trees, generalized linear models, k-nearest neighbors, CART and an artificial neural network. A stack of 585 coarse and high resolution covariates representing spectral reflectance (Landsat bands, spectral indices; time series of seasonal composites), different biophysical conditions (i.e. temperature, precipitation, elevation, lithology) and biotic competition (other species distribution maps) was used as predictors for realized distributions, while potential distribution was modelled with environmental predictors only. Logloss and computing time were used to select the three best algorithms to train an ensemble model based on stacking with a logistic regressor as a meta-learner for each species. High resolution (30 m) probability and model uncertainty maps of realized distribution were produced for each species using a time window of 4 years, for a total of 6 distribution maps per species for the studied period, while for potential distributions only one map per species was produced. Results of spatial cross-validation show that Olea europaea and Quercus suber achieved the best performances in both potential and realized distribution, while Pinus sylvestris and Salix caprea achieved the worst.
Further analysis shows that fine-resolution models consistently outperformed coarse-resolution models (250 m) for realized distribution (average decrease in logloss: +53%). Realized distribution models achieved higher predictive performance than potential distribution models. The importance of predictor variables differed across species and models. For realized distribution, the most important and frequent predictors were the green band for summer and the NDWI and NDVI for fall; for potential distribution, they were the diffuse irradiation and the precipitation of the driest quarter. The ensemble model outperformed or performed as well as the best individual model in all potential species distributions, while for ten species it performed worse than the best individual model in modeling realized distributions. The framework shows how combining continuous and consistent EO time series data with state-of-the-art ML can be used to derive dynamic distribution maps. The produced time-series occurrence predictions can be used to quantify temporal trends and detect potential forest degradation.
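As a rough illustration of stacking with a logistic regressor as meta-learner, the sketch below fits a logistic regression on the predicted probabilities of three base learners and scores the result with logloss. It is not the authors' pipeline: the probabilities and labels are made-up toy values, the base learners are assumed to have already produced out-of-fold predictions, and the meta-learner is trained with plain gradient descent to keep the example dependency-free.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy (logloss); lower is better."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(base_probs, y_true, lr=0.5, epochs=2000):
    """Logistic-regression meta-learner fitted by batch gradient descent.
    Each row of base_probs holds one sample's base-learner probabilities."""
    n_models = len(base_probs[0])
    n = len(y_true)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_probs, y_true):
            z = bias + sum(w * x for w, x in zip(weights, row))
            err = sigmoid(z) - y
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_meta(base_probs, weights, bias):
    return [sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            for row in base_probs]


# Toy out-of-fold probabilities from three hypothetical base learners
# (columns) for six occurrence points (rows); 1 = species present.
base_probs = [
    [0.9, 0.8, 0.7],
    [0.2, 0.3, 0.4],
    [0.7, 0.6, 0.9],
    [0.1, 0.4, 0.3],
    [0.8, 0.7, 0.6],
    [0.3, 0.2, 0.5],
]
y = [1, 0, 1, 0, 1, 0]

weights, bias = fit_meta_learner(base_probs, y)
stacked = predict_meta(base_probs, weights, bias)
stacked_logloss = log_loss(y, stacked)
```

In practice the meta-learner should be fitted on out-of-fold predictions and evaluated on held-out data (the abstract reports spatial cross-validation); scoring it on its own training rows, as this toy example does, overstates performance.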


2022 ◽  
Author(s):  
Michael R Stukel ◽  
Moira Decima ◽  
Michael R Landry

The ability to constrain the mechanisms that transport organic carbon into the deep ocean is complicated by the multiple physical, chemical, and ecological processes that intersect to create, transform, and transport particles in the ocean. In this manuscript we develop and parameterize a data-assimilative model of the multiple pathways of the biological carbon pump (NEMUROBCP). The mechanistic model is designed to represent sinking particle flux, active transport by vertically migrating zooplankton, and passive transport by subduction and vertical mixing, while also explicitly representing multiple biological and chemical properties measured directly in the field (including nutrients, phytoplankton and zooplankton taxa, carbon dioxide and oxygen, nitrogen isotopes, and 234Thorium). Using 30 different data types (including standing stock and rate measurements related to nutrients, phytoplankton, zooplankton, and non-living organic matter) from Lagrangian experiments conducted on 11 cruises from four ocean regions, we conduct an objective statistical parameterization of the model and generate one million different potential parameter sets that are used for ensemble model simulations. The model simulates in situ parameters that were assimilated (net primary production and gravitational particle flux) and parameters that were withheld (234Thorium and nitrogen isotopes) with reasonable accuracy. Model results show that gravitational flux of sinking particles and vertical mixing of organic matter from the surface ocean are more important biological pump pathways than active transport by vertically-migrating zooplankton. 
However, these processes are regionally variable, with sinking particles most important in oligotrophic areas of the Gulf of Mexico and California, sinking particles and vertical mixing roughly equivalent in productive regions of the CCE and the subtropical front in the Southern Ocean, and active transport an important contributor in the Eastern Tropical Pacific. We further find that mortality at depth is an important component of active transport when mesozooplankton biomass is high, but that it is negligible in regions with low mesozooplankton biomass. Our results also highlight the high degree of uncertainty, particularly amongst mesozooplankton functional groups, that is derived from uncertainty in model parameters, with important implications for results that rely on non-ensemble model outputs. We also discuss the implications of our results for other data assimilation approaches.


2022 ◽  
Author(s):  
Selcuk Cankurt ◽  
Abdulhamit Subasi

Over the last decades, several soft computing techniques have been applied to tourism demand forecasting. Among these techniques, the neuro-fuzzy model ANFIS (adaptive neuro-fuzzy inference system) has started to emerge. A conventional ANFIS model cannot deal with high-dimensional datasets, and therefore cannot work with our dataset, which is composed of 62 time series. This study develops an ensemble model that combines neural networks with ANFIS to handle a large number of input variables for multivariate forecasting. The proposed approach is a stacking ensemble with two base learners, both neural network models, and ANFIS as the meta-learner. The results show that the stacking ensemble of ANFIS (meta-learner) and ANN models (base learners) outperforms the stand-alone base learners. Numerically, the proposed ensemble model achieved a MAPE of 7.26%, compared with MAPEs of 8.50% and 9.18% for the two single ANN models. This study, a novel application of ensemble systems to tourism demand forecasting, thus shows better results than single expert systems based on artificial neural networks.
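For readers unfamiliar with the metric, MAPE (mean absolute percentage error) is conventionally defined as the mean of |actual - forecast| / |actual|, expressed as a percentage. The sketch below computes it and then expresses the reported MAPEs as relative error reductions. The function names are illustrative; only the three MAPE values come from the abstract.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.
    Assumes no actual value is zero."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy check of the metric on made-up numbers.
toy_mape = mape([100.0, 200.0], [110.0, 180.0])

# MAPE values reported in the abstract (percent).
ensemble_mape = 7.26
ann_mapes = [8.50, 9.18]
reductions = [relative_reduction(m, ensemble_mape) for m in ann_mapes]
```

On these figures the ensemble's MAPE is roughly 15% and 21% lower, in relative terms, than the two single ANN models.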


2022 ◽  
Author(s):  
Christopher Graney-Ward ◽  
Biju Issac ◽  
Lida Ketsbaia ◽  
Seibu Mary Jacob

Due to the recent popularity and growth of social media platforms such as Facebook and Twitter, cyberbullying is becoming more and more prevalent. Current research on cyberbullying, and the NLP techniques used to classify this kind of online behaviour, were studied first. This paper then discusses experiments on combined Twitter datasets from Maryland and Cornell universities using different classification approaches: classical machine learning, RNN, CNN, and pretrained transformer-based classifiers. A state-of-the-art (SOTA) solution was achieved by optimising BERTweet on a one-cycle policy with a decoupled weight decay optimiser (AdamW), improving the previous F1-score by up to 8.4% and resulting in 64.8% macro F1. Particle Swarm Optimisation was later used to optimise the ensemble model. The ensemble, developed from the optimised BERTweet model and a collection of models with varying data representations, outperformed the standalone BERTweet model by 0.53%, resulting in 65.33% macro F1 for the TweetEval dataset, and by 0.55% for the combined datasets, resulting in 68.1% macro F1.
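Two ideas in this abstract are easy to show in code: the macro F1 metric and a weighted combination of model outputs. The sketch below assumes a soft-voting ensemble in which each model's class probabilities are multiplied by a weight and summed; the abstract does not state how the ensemble combines its members, and the weights here are hand-picked toy values rather than the output of Particle Swarm Optimisation.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def weighted_vote(prob_sets, weights):
    """Soft voting: prob_sets[m][i][c] is model m's probability that
    sample i belongs to class c. Returns the predicted class per sample."""
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    predictions = []
    for i in range(n_samples):
        combined = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets))
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=combined.__getitem__))
    return predictions


# Two hypothetical models, two samples, two classes.
model_a = [[0.6, 0.4], [0.3, 0.7]]
model_b = [[0.2, 0.8], [0.4, 0.6]]

preds_balanced = weighted_vote([model_a, model_b], [0.7, 0.3])
preds_skewed = weighted_vote([model_a, model_b], [0.9, 0.1])

toy_score = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
```

An optimiser such as PSO would search over the weight vector to maximise macro F1 on a validation set; the two weight settings above show that the choice of weights can change the predicted class.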

