Trust model simulation of cross-border e-commerce based on machine learning and Bayesian network

Author(s):  
Fenghua Zhang ◽  
Yang Yang
2019 ◽  
Vol 11 (22) ◽  
pp. 6416 ◽  
Author(s):  
Ouyang ◽  
Wang ◽  
Zhu

Coordinating ecosystem service supply and demand equilibrium and using machine learning to dynamically construct an ecological security pattern (ESP) can improve understanding of how urban development affects ecological processes, providing a theoretical reference for coupling economic growth with environmental protection. Here, the ESP of the Changsha–Zhuzhou–Xiangtan urban agglomeration was constructed using a Bayesian network model to dynamically identify ecological sources. Ecological corridors and ecological strategy points were identified with the minimum cumulative resistance model and circuit theory. The ESP was constructed by combining seven ecological sources, “two horizontal and three vertical” ecological corridors, and 37 ecological strategy points. Our results revealed spatial decoupling between the supply and demand of ecosystem services (ES), as well as degradation in areas with high ES demand. The ecological sources and corridors of the urban agglomeration were mainly situated in forestlands and water areas. The terrestrial ecological corridor was distributed along the outer periphery of the urban agglomeration, while the aquatic ecological corridor ran from north to south throughout the entire region. The ecological strategy points were concentrated mainly along the boundaries of the built-up area and at the intersections between construction land and ecological land. Finally, the ecological sources were located primarily within existing ecological protection zones, which supports the usefulness of machine learning in predicting ecological sources and may provide new insights into developing urban ESPs.
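The minimum cumulative resistance (MCR) step accumulates movement cost over a resistance surface outward from an ecological source. A minimal sketch of that accumulation, assuming a toy hand-made resistance grid (the grid values, function name, and 4-neighbour connectivity are illustrative assumptions, not the study's actual data or parameters):

```python
import heapq

def min_cumulative_resistance(grid, source):
    """Dijkstra over a resistance grid: the cost of entering a cell is its
    resistance value; returns accumulated cost from the source cell."""
    rows, cols = len(grid), len(grid[0])
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if d > dist.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + grid[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(pq, (nd, (nr, nc)))
    return dist

# Toy 3x3 resistance surface: low values = ecological land, high = built-up.
grid = [[1, 9, 1],
        [1, 9, 1],
        [1, 1, 1]]
cost = min_cumulative_resistance(grid, (0, 0))
```

Cells behind the high-resistance column are reached by detouring around it, which is exactly the corridor-routing behaviour the MCR model exploits.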


10.2196/18910 ◽  
2020 ◽  
Vol 8 (7) ◽  
pp. e18910
Author(s):  
Debbie Rankin ◽  
Michaela Black ◽  
Raymond Bond ◽  
Jonathan Wallace ◽  
Maurice Mulvenna ◽  
...  

Background: The exploitation of synthetic data in health care is at an early stage. Synthetic data could unlock the potential within health care datasets that are too sensitive for release. Several synthetic data generators have been developed to date; however, studies evaluating their efficacy and generalizability are scarce.

Objective: This work sets out to understand the difference in performance between supervised machine learning models trained on synthetic data and those trained on real data.

Methods: A total of 19 open health datasets were selected for the experimental work. Synthetic data were generated using three synthetic data generators that apply classification and regression tree, parametric, and Bayesian network approaches. Real and synthetic data were used (separately) to train five supervised machine learning models: stochastic gradient descent, decision tree, k-nearest neighbors, random forest, and support vector machine. Models were tested only on real data to determine whether a model developed by training on synthetic data can be used to accurately classify new, real examples. The impact of statistical disclosure control on model performance was also assessed.

Results: A total of 92% of models trained on synthetic data have lower accuracy than those trained on real data. Tree-based models trained on synthetic data have deviations in accuracy from models trained on real data of 0.177 (18%) to 0.193 (19%), while other models have lower deviations of 0.058 (6%) to 0.072 (7%). The winning classifier when trained and tested on real data matches the winner among models trained on synthetic data and tested on real data in 26% (5/19) of cases for classification and regression tree and parametric synthetic data, and in 21% (4/19) of cases for Bayesian network-generated synthetic data. Tree-based models perform best with real data and are the winning classifier in 95% (18/19) of cases; this is not the case for models trained on synthetic data. When tree-based models are excluded, the winning classifier for real and synthetic data is matched in 74% (14/19), 53% (10/19), and 68% (13/19) of cases for classification and regression tree, parametric, and Bayesian network synthetic data, respectively. Statistical disclosure control methods did not have a notable impact on data utility.

Conclusions: The results of this study are promising, with small decreases in accuracy observed in models trained on synthetic data compared with models trained on real data, where both are tested on real data. Such deviations are expected and manageable. Tree-based classifiers have some sensitivity to synthetic data, and the underlying cause requires further investigation. This study highlights the potential of synthetic data and the need for further evaluation of their robustness. Synthetic data must preserve both individual privacy and data utility in order to instill confidence in health care departments when using such data to inform policy decision-making.
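The train-on-synthetic/test-on-real protocol can be sketched as follows. This is a minimal illustration of the parametric approach only (a per-class Gaussian fitted to the real training data), using a public scikit-learn dataset as a stand-in for the study's health datasets; the generator, dataset, and model choices here are illustrative assumptions, not the study's actual pipeline:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# Crude parametric generator: fit a per-class multivariate Gaussian to the
# real training data and sample synthetic records from it.
Xs, ys = [], []
for cls in np.unique(y_tr):
    Xc = X_tr[y_tr == cls]
    mean = Xc.mean(axis=0)
    cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # ridge for stability
    Xs.append(rng.multivariate_normal(mean, cov, size=len(Xc)))
    ys.append(np.full(len(Xc), cls))
X_syn, y_syn = np.vstack(Xs), np.concatenate(ys)

# Train one model on real data and one on synthetic data; test both on
# real, held-out data, as in the study's evaluation protocol.
acc_real = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)
acc_syn = RandomForestClassifier(random_state=0).fit(X_syn, y_syn).score(X_te, y_te)
```

Comparing `acc_real` and `acc_syn` gives the accuracy deviation the abstract reports per model and generator.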


2019 ◽  
Vol 19 (1) ◽  
pp. 17
Author(s):  
Sukmawati Anggraeni Putri

Software defect prediction has become an important part of software quality testing. This study serves as an alternative for software practitioners when prioritizing which software modules to test, thereby reducing the cost and time of software quality testing. As an experimental testbed, researchers in software defect prediction have from the outset used the public NASA MDP datasets. However, these datasets have two shortcomings: attribute noise and class imbalance. The attribute noise problem can be addressed with feature selection algorithms such as Chi-Square and Information Gain, while the class imbalance problem can be handled with sampling techniques such as RUS (Random Undersampling) and SMOTE (Synthetic Minority Over-sampling Technique). This study therefore integrates sampling techniques (RUS and SMOTE) with an attribute selection algorithm (Information Gain) applied to a Bayesian Network machine learning model. According to Lessmann, the Bayesian Network is a statistical classifier with good classification performance. Experiments on four NASA MDP datasets show that the SMOTE + IG model can improve the accuracy of the Bayesian Network classifier to an average of 0.912 across the four NASA MDP datasets used.
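The SMOTE + Information Gain pipeline can be sketched as below. A hand-rolled SMOTE keeps the example self-contained, mutual information stands in for information gain, and scikit-learn's GaussianNB is used as a simple stand-in for the Bayesian Network classifier; the synthetic dataset and all parameter choices are illustrative assumptions, not the paper's setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(42)

def smote(X, y, minority_class, k=5):
    """Minimal SMOTE: interpolate between each minority sample and one of
    its k nearest minority neighbours until the classes are balanced."""
    X_min = X[y == minority_class]
    n_new = (y != minority_class).sum() - len(X_min)
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X_min).kneighbors(X_min)
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = idx[i][rng.integers(1, k + 1)]    # skip self at position 0
        lam = rng.random()
        new.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return (np.vstack([X, new]),
            np.concatenate([y, np.full(n_new, minority_class)]))

# Imbalanced toy stand-in for a NASA MDP defect dataset (class 1 = defective).
X, y = make_classification(n_samples=600, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

X_bal, y_bal = smote(X_tr, y_tr, minority_class=1)
# Mutual information as a proxy for information-gain attribute selection.
selector = SelectKBest(mutual_info_classif, k=10).fit(X_bal, y_bal)
clf = GaussianNB().fit(selector.transform(X_bal), y_bal)
acc = clf.score(selector.transform(X_te), y_te)
```

The order matters: oversampling first, then selecting attributes on the balanced set, mirrors the integration the study describes.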


Author(s):  
Daisuke Kitakoshi ◽  
Hiroyuki Shioya ◽  
Masahito Kurihara

Reinforcement learning (RL) is a kind of machine learning that aims to optimize an agent's policy by adapting the agent to its environment according to rewards. In this paper, we propose a method for improving policies by using the stochastic knowledge that reinforcement learning agents obtain. We use a Bayesian network (BN), which is a stochastic model, as the agent's knowledge. Its structure is determined by the minimum description length criterion, using series of the agent's inputs, outputs, and rewards as sample data. The BN constructed in our study represents stochastic dependencies between the agent's input-output pairs and its rewards. In the proposed method, policies are improved by supervised learning using the structure of the BN (i.e., the stochastic knowledge). This improvement mechanism allows RL agents to acquire more effective policies. We carry out simulations on the pursuit problem to show the effectiveness of the proposed method.
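The idea of extracting stochastic knowledge from an agent's experience can be sketched in a much-simplified form: instead of learning a Bayesian network by MDL, the toy below tabulates the empirical reward probability P(reward | state, action) alongside ordinary Q-learning on a 1-D chase task (a stand-in for the pursuit problem; every name and parameter here is an illustrative assumption, not the paper's method):

```python
import random
from collections import defaultdict

random.seed(0)

# Toy 1-D chase: the agent moves left/right on a line until it reaches the
# target cell. The "stochastic knowledge" is reduced to an empirical table
# of reward probabilities per (state, action), gathered while learning.
N = 5                       # positions 0..4, target at position 4
ACTIONS = (-1, 1)
alpha, gamma, eps = 0.5, 0.9, 0.5

q = defaultdict(float)
counts = defaultdict(lambda: [0, 0])    # (state, action) -> [rewarded, visits]

for _ in range(1000):
    s = random.randrange(N - 1)         # random non-target start
    for _ in range(20):
        if random.random() < eps:       # epsilon-greedy exploration
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: q[(s, x)])
        s2 = min(max(s + a, 0), N - 1)
        r = 1.0 if s2 == N - 1 else 0.0
        # Standard Q-learning update.
        q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in ACTIONS)
                              - q[(s, a)])
        counts[(s, a)][0] += int(r > 0)
        counts[(s, a)][1] += 1
        s = s2
        if r > 0:
            break

# Stochastic knowledge extracted from experience, plus the learned policy.
p_reward = {k: c[0] / c[1] for k, c in counts.items()}
greedy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N - 1)}
```

In the paper's method the BN plays the role of `p_reward`, capturing richer dependencies between inputs, outputs, and rewards, and the policy is then improved by supervised learning on that structure rather than by inspecting Q-values directly.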

