Application of Bayesian Networks in Modelling Indonesia's Green Economy Conditions in the Pandemic Era Based on Big Data

2021 ◽  
Vol 2021 (1) ◽  
pp. 1054-1064
Author(s):  
Salwa Rizqina Putri ◽  
Thosan Girisona Suganda ◽  
Setia Pramana

To support Indonesia's green economic growth, further analysis is needed of economic activity during the pandemic and its relationship to environmental conditions. This study applies a Bayesian Network approach to model Indonesia's green economy conditions during the pandemic based on variables believed to be influential, such as economic activity, air quality, population mobility, and positive COVID-19 cases, all obtained from big data sources. A Bayesian Network constructed manually with the Maximum Spanning Tree algorithm was selected as the best model, with an average 5-fold cross-validation accuracy of 0.83 in predicting four PDRB (gross regional domestic product) classes. The selected model shows that Indonesia's economic conditions in the pandemic era are directly influenced by night-time light intensity (NTL), which reflects economic activity, by air quality (AQI), and by positive COVID-19 cases. Parameter learning analysis shows that the economic growth of Indonesia's provinces still tends not to go hand in hand with maintaining air quality, so efforts to achieve a green economy still need to be strengthened.
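The tree structure described above can be approximated by taking a maximum spanning tree over pairwise mutual information between the discretised variables (the Chow-Liu idea). The sketch below is illustrative only, not the authors' code: the column names, the input file province_indicators.csv, and the use of networkx and scikit-learn are assumptions.

```python
# Minimal sketch (not the study's code): build a tree-structured Bayesian
# network skeleton as a maximum spanning tree over pairwise mutual information.
# Column names and the input file are illustrative placeholders.
from itertools import combinations

import networkx as nx
import pandas as pd
from sklearn.metrics import mutual_info_score


def mst_skeleton(df: pd.DataFrame) -> nx.Graph:
    """Undirected tree skeleton for discretised variables."""
    g = nx.Graph()
    g.add_nodes_from(df.columns)
    for a, b in combinations(df.columns, 2):
        # Edge weight = empirical mutual information between the two columns.
        g.add_edge(a, b, weight=mutual_info_score(df[a], df[b]))
    return nx.maximum_spanning_tree(g, weight="weight")


# Placeholder provincial dataset with discretised indicators.
df = pd.read_csv("province_indicators.csv")[
    ["pdrb_class", "ntl", "aqi", "mobility", "covid_cases"]
]
print(sorted(mst_skeleton(df).edges()))
```

In a full workflow the skeleton would then be oriented and its conditional probability tables estimated (parameter learning), with 5-fold cross-validation used to score how well the network predicts the PDRB class, as the abstract reports.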

BioScience ◽  
2018 ◽  
Vol 68 (9) ◽  
pp. 653-669 ◽  
Author(s):  
Debra P C Peters ◽  
N Dylan Burruss ◽  
Luis L Rodriguez ◽  
D Scott McVey ◽  
Emile H Elias ◽  
...  

2015 ◽  
Vol 30 (6) ◽  
pp. 1041-1071 ◽  
Author(s):  
Bi Yu Chen ◽  
Hui Yuan ◽  
Qingquan Li ◽  
Shih-Lung Shaw ◽  
William H.K. Lam ◽  
...  

BMJ Open ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. e050146
Author(s):  
Jenna M Reps ◽  
Patrick Ryan ◽  
P R Rijnbeek

Objective: The internal validation of prediction models aims to quantify the generalisability of a model. We aim to determine the impact, if any, that the choice of development and internal validation design has on the internal performance bias and model generalisability in big data (n~500 000).

Design: Retrospective cohort.

Setting: Primary and secondary care; three US claims databases.

Participants: 1 200 769 patients pharmaceutically treated for their first occurrence of depression.

Methods: We investigated the impact of the development/validation design across 21 real-world prediction questions. Model discrimination and calibration were assessed. We trained LASSO logistic regression models using US claims data and internally validated the models using eight different designs: 'no test/validation set', 'test/validation set' and cross-validation with 3-fold, 5-fold or 10-fold with and without a test set. We then externally validated each model in two new US claims databases. We estimated the internal validation bias per design by empirically comparing the differences between the estimated internal performance and external performance.

Results: The differences between the models' internal estimated performances and external performances were largest for the 'no test/validation set' design. This indicates even with large data the 'no test/validation set' design causes models to overfit. The seven alternative designs included some validation process to select the hyperparameters and a fair testing process to estimate internal performance. These designs had similar internal performance estimates and performed similarly when externally validated in the two external databases.

Conclusions: Even with big data, it is important to use some validation process to select the optimal hyperparameters and fairly assess internal validation using a test set or cross-validation.
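As a rough illustration of the designs being compared (not the study's actual pipeline), the sketch below contrasts a 'no test/validation set' design with a '5-fold cross-validation plus held-out test set' design for a LASSO logistic regression in scikit-learn; the simulated data, penalty grid, and split sizes are placeholders.

```python
# Hedged sketch: apparent vs held-out performance for a LASSO logistic model.
import numpy as np
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 50))                          # placeholder covariates
y = (X[:, 0] + rng.normal(size=5000) > 0).astype(int)    # placeholder outcome

# Design A ("no test/validation set"): fit and evaluate on the same data,
# which yields the apparent, optimistically biased internal performance.
model_a = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(X, y)
auc_apparent = roc_auc_score(y, model_a.predict_proba(X)[:, 1])

# Design B (5-fold CV + test set): select the penalty strength by
# cross-validation on the training split, then estimate performance on the
# untouched test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model_b = LogisticRegressionCV(penalty="l1", solver="liblinear", Cs=10, cv=5)
model_b.fit(X_tr, y_tr)
auc_internal = roc_auc_score(y_te, model_b.predict_proba(X_te)[:, 1])

print(f"apparent AUC (no test/validation set): {auc_apparent:.3f}")
print(f"held-out AUC (5-fold CV + test set):   {auc_internal:.3f}")
```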


2021 ◽  
Vol 2083 (4) ◽  
pp. 042001
Author(s):  
Nan Zhang ◽  
Wenqiang Zhang ◽  
Yingnan Shang

Abstract The emergence of big data provides a new method for constructing knowledge links in a knowledge map, realising an objective and practically meaningful knowledge network that is easier for machines to understand. The article combines the four linked-data principles for publishing content objects and their semantic characteristics, and uses the RDF data model to convert unstructured data from the Internet, together with structured data that follows different standards, into structured data in a unified standard so that it can be associated. The resulting system forms a large knowledge map that is semantic, intelligent, and dynamic.
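A minimal sketch of the conversion step, assuming the rdflib library and a made-up example namespace (the URIs and predicate names are illustrative, not a standard vocabulary):

```python
# Convert one structured record into RDF triples and serialise it as Turtle,
# so it can be linked and queried alongside other RDF sources.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/kg/")          # illustrative namespace

g = Graph()
g.bind("ex", EX)

article = EX["article/123"]
g.add((article, RDF.type, EX.Article))
g.add((article, RDFS.label, Literal("Knowledge links from big data")))
g.add((article, EX.hasTopic, EX["topic/linked-data"]))
g.add((article, EX.publishedIn, Literal(2021)))

print(g.serialize(format="turtle"))
```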


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Yao Huimin

With the development of cloud computing and distributed cluster technology, the concept of big data has been expanded and extended in terms of capacity and value, and machine learning technology has also received unprecedented attention in recent years. Traditional machine learning algorithms cannot be parallelized effectively, so a parallelized support vector machine based on the Spark big data platform is proposed. Firstly, the big data platform is designed with the Lambda architecture, which is divided into three layers: Batch Layer, Serving Layer, and Speed Layer. Secondly, in order to improve the training efficiency of support vector machines on large-scale data, the merging of two support vector machines takes into account the "special points" other than support vectors, that is, the points where the nonsupport vectors of one subset violate the training results of the other subset, and a cross-validation merging algorithm is proposed. Then, a parallelized support vector machine based on cross-validation is proposed, and the parallelization process of the support vector machine is realized on the Spark platform. Finally, experiments on different datasets verify the effectiveness and stability of the proposed method. Experimental results show that the proposed parallelized support vector machine has outstanding performance in speed-up ratio, training time, and prediction accuracy.
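For orientation only, the sketch below trains a data-parallel linear SVM on Spark with cross-validated regularisation using the built-in pyspark.ml LinearSVC. It illustrates SVM training on the Spark platform but is not the paper's cross-validation merging algorithm; the input path training.parquet and the parameter grid are placeholders.

```python
# Hedged sketch: Spark MLlib linear SVM with 5-fold cross-validated tuning.
from pyspark.ml.classification import LinearSVC
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parallel-svm-sketch").getOrCreate()

# Expects a DataFrame with a vector "features" column and a binary "label"
# column; the parquet path is a placeholder.
data = spark.read.parquet("training.parquet")
train, test = data.randomSplit([0.8, 0.2], seed=42)

svm = LinearSVC(maxIter=50)
grid = ParamGridBuilder().addGrid(svm.regParam, [0.01, 0.1, 1.0]).build()
cv = CrossValidator(
    estimator=svm,
    estimatorParamMaps=grid,
    evaluator=BinaryClassificationEvaluator(),
    numFolds=5,
)

model = cv.fit(train)      # each fold's training is distributed across the cluster
auc = BinaryClassificationEvaluator().evaluate(model.transform(test))
print(f"held-out AUC: {auc:.3f}")
```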


2020 ◽  
Vol 1 ◽  
pp. 1-23
Author(s):  
Majid Hojati ◽  
Colin Robertson

Abstract. With new forms of digital spatial data driving new applications for monitoring and understanding environmental change, there are growing demands on traditional GIS tools for spatial data storage, management and processing. Discrete Global Grid Systems (DGGS) are methods to tessellate the globe into multiresolution grids; they represent a global spatial fabric capable of storing heterogeneous spatial data and offer improved performance in data access, retrieval, and analysis. While DGGS-based GIS may hold potential for next-generation big-data GIS platforms, few studies have tried to implement them as a framework for operational spatial analysis. Cellular Automata (CA) is a classic dynamic modelling framework which has been used with the traditional raster data model for various environmental modelling tasks such as wildfire spread and urban expansion. The main objectives of this paper are to (i) investigate the possibility of using DGGS for running dynamic spatial analysis, (ii) evaluate CA as a generic data model for modelling dynamic phenomena within a DGGS data model and (iii) evaluate an in-database approach for CA modelling. To do so, a case study into wildfire spread modelling is developed. Results demonstrate that using a DGGS data model not only provides the ability to integrate different data sources, but also provides a framework to do spatial analysis without using geometry-based analysis. This results in a simplified architecture and a common spatial fabric to support the development of a wide array of spatial algorithms. While considerable work remains to be done, CA modelling within a DGGS-based GIS is a robust and flexible modelling framework for big-data GIS analysis in an environmental monitoring context.
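A toy sketch of one CA fire-spread step written against an explicit neighbour table, the way a DGGS or in-database implementation would store cell adjacency instead of relying on raster row/column offsets; the cell IDs, adjacency, and spread probability below are all illustrative.

```python
# Toy cellular-automata fire-spread step over a neighbour lookup table.
import random

# State codes: 0 = unburnt, 1 = burning, 2 = burnt out.
state = {"c1": 1, "c2": 0, "c3": 0, "c4": 0}
neighbours = {                  # adjacency as a DGGS cell index would supply it
    "c1": ["c2", "c3"],
    "c2": ["c1", "c3", "c4"],
    "c3": ["c1", "c2", "c4"],
    "c4": ["c2", "c3"],
}
P_SPREAD = 0.6                  # placeholder per-neighbour ignition probability


def step(state: dict) -> dict:
    """One transition: burning cells may ignite unburnt neighbours, then burn out."""
    new = dict(state)
    for cell, s in state.items():
        if s == 1:
            for n in neighbours[cell]:
                if state[n] == 0 and random.random() < P_SPREAD:
                    new[n] = 1
            new[cell] = 2
    return new


random.seed(1)
for t in range(3):
    state = step(state)
    print(t, state)
```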

