Proactive Data Allocation in Distributed Datasets based on an Ensemble Model

Author(s):  
Theocharis Koukaras ◽  
Kostas Kolomvatsos
2005 ◽  
Vol 4 (2) ◽  
pp. 393-400
Author(s):  
Pallavali Radha ◽  
G. Sireesha

A data distributor gives sensitive data to a set of presumably trusted third-party agents. Because of data leakage, data sent to these third parties may later turn up in unauthorized places, such as the web or someone else's system. The distributor must determine whether the data was leaked by one or more agents, as opposed to having been independently gathered by other means. Our proposed data allocation strategies improve the probability of identifying leakages. Security attacks typically result from unintended behaviors or invalid inputs. Because real-world programs admit so many invalid inputs, security testing is labor intensive, so it is desirable to automate, or partially automate, the security-testing process. In this paper we present a Predicate/Transition net approach to the automated generation of security tests from formal threat models, and we use allocation strategies to detect guilty agents without modifying the original data. A guilty agent is one who leaks the distributed data. To detect guilty agents more effectively, the data is distributed intelligently to agents according to whether they make a sample data request or an explicit data request. The fake-object algorithms improve the distributor's chance of detecting guilty agents.


2021 ◽  
Vol 40 (5) ◽  
pp. 9471-9484
Author(s):  
Yilun Jin ◽  
Yanan Liu ◽  
Wenyu Zhang ◽  
Shuai Zhang ◽  
Yu Lou

With the advancement of machine learning, credit scoring can be performed better. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in the predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.
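
As a rough illustration of cluster-based undersampling in general, the sketch below clusters the majority class into as many groups as there are minority samples and keeps one representative per cluster. It is not the authors' method: the abstract does not describe their procedure, and the function names (`dist2`, `kmeans_undersample`), the choice of the point nearest each centroid as representative, and the toy data are all assumptions made for this example.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_undersample(majority, k, iters=20, seed=0):
    """Hypothetical helper: shrink `majority` to at most k points.

    Runs a plain k-means (Lloyd's algorithm) on the majority class and
    keeps, for every non-empty cluster, the member closest to its centroid.
    Requires k <= len(majority).
    """
    rng = random.Random(seed)
    # Start from k distinct majority points as initial centroids.
    centroids = rng.sample(majority, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in majority:
            nearest = min(range(k), key=lambda j: dist2(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Keep one representative per non-empty cluster.
    kept = []
    for j, members in enumerate(clusters):
        if members:
            kept.append(min(members, key=lambda p: dist2(p, centroids[j])))
    return kept


# Toy data (assumed): 30 majority points, 3 minority points.
majority = [(float(i), float(i % 5)) for i in range(30)]
minority = [(100.0, 100.0), (101.0, 99.0), (99.0, 101.0)]
balanced = kmeans_undersample(majority, len(minority)) + minority
```

Keeping real points rather than the centroids themselves means the reduced set contains only observed samples; using the centroids directly is an equally common variant.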


Electronics ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 861
Author(s):  
Kyeung Ho Kang ◽  
Mingu Kang ◽  
Siho Shin ◽  
Jaehyo Jung ◽  
Meina Li

Chronic diseases, such as coronary artery disease and diabetes, are caused by inadequate physical activity and are the leading cause of increasing mortality and morbidity rates. Direct calorimetry by calorie production and indirect calorimetry by energy expenditure (EE) have been regarded as the best methods for estimating physical activity and EE. However, these methods are inconvenient, owing to the use of an oxygen respiration measurement mask. In this study, we propose a model that estimates physical activity EE using an ensemble model that combines artificial neural networks and genetic algorithms, using data acquired from patch-type sensors. The proposed ensemble model achieved an accuracy of more than 92% (Root Mean Squared Error (RMSE) = 0.1893, R² = 0.91, Mean Squared Error (MSE) = 0.014213, Mean Absolute Error (MAE) = 0.14020) by testing various structures through repeated experiments.

