Earthquake Time Prediction using CatBoost and SVR

Earthquakes around the world have been a major cause of destruction, loss of life, and property damage. This work aims to detect earthquakes at an early stage using AI, which would make the task of the public and of rescue teams easier. The data consist of seismic acoustic signals and the corresponding time of failure. The model is trained using CatBoost and Support Vector Machines to predict the time at which an earthquake may occur. The CatBoost regression algorithm gives a Mean Absolute Error of about 1.860. The Cross Validation (CV) score for the Support Vector Machine (SVM) approach is -2.1651. The dataset's metrics do not depend on any external parameter, so the variation in accuracy is limited and high accuracy is achieved.

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Yao Huimin

With the development of cloud computing and distributed cluster technology, the concept of big data has been expanded and extended in terms of capacity and value, and machine learning technology has also received unprecedented attention in recent years. Traditional machine learning algorithms cannot solve the problem of effective parallelization, so a parallelization support vector machine based on Spark big data platform is proposed. Firstly, the big data platform is designed with Lambda architecture, which is divided into three layers: Batch Layer, Serving Layer, and Speed Layer. Secondly, in order to improve the training efficiency of support vector machines on large-scale data, when merging two support vector machines, the “special points” other than support vectors are considered, that is, the points where the nonsupport vectors in one subset violate the training results of the other subset, and a cross-validation merging algorithm is proposed. Then, a parallelized support vector machine based on cross-validation is proposed, and the parallelization process of the support vector machine is realized on the Spark platform. Finally, experiments on different datasets verify the effectiveness and stability of the proposed method. Experimental results show that the proposed parallelized support vector machine has outstanding performance in speed-up ratio, training time, and prediction accuracy.


Author(s):  
Jasleen Kaur ◽  
Khushdeep Dharni

Uniqueness in economies and stock markets has given rise to an interesting domain of exploring data mining techniques across global indices. Previously, very few studies have attempted to compare the performance of data mining techniques in diverse markets. The current study adds to the understanding of the variations in performance of data mining techniques across the global stock indices. We compared the performance of Neural Networks and Support Vector Machines using the accuracy measures Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) across seven major stock markets. For prediction purposes, technical analysis has been employed on selected indicators based on daily values of indices spanning a period of 12 years. We created 196 data sets spanning different time periods for model building, such as 1 year, 2 years, 3 years, 4 years, 6 years and 12 years, for the seven selected stock indices. Based on prediction models built using Neural Networks and Support Vector Machines, the findings of the study indicate there is a significant difference, both for MAE and RMSE, across the selected global indices. Also, the Mean Absolute Error and Root Mean Square Error of models built using NN were greater than those of models built using SVM.
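The two error measures named in this abstract can be illustrated with a minimal sketch. The function names and the sample values below are assumptions made for illustration; they are not taken from the study.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference; penalizes large errors more."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical index values and forecasts (illustrative only, not study data).
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 103.0, 105.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
print(f"MAE={mae:.4f} RMSE={rmse:.4f}")
```

Because squaring weights large deviations more heavily, RMSE is never smaller than MAE on the same data, which is one reason studies often report both.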


2018 ◽  
Vol 14 (2) ◽  
pp. 225
Author(s):  
Indriyanti Indriyanti ◽  
Agus Subekti

Rising building energy consumption has prompted researchers to build prediction models using machine learning methods, but it is still not known which model is the most accurate. Predictive models for commercial building energy consumption are important for energy conservation. With the right model, we can design buildings that use energy more efficiently. In this paper, we propose a predictive model based on machine learning methods to obtain the best model for predicting total energy consumption. The algorithms used are SMOreg and LibSVM from the Support Vector Machine class, and the models are evaluated by their Mean Absolute Error and Root Mean Square Error. Using a publicly available dataset, we develop models based on support vector machines for regression. Testing of the two algorithms shows that SMOreg is more accurate, with an MAE and RMSE of 4.70 and 10.15, whereas the LibSVM model has an MAE and RMSE of 9.37 and 14.45. We propose a method based on the SMOreg algorithm because of its better performance.


Algorithms ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 201
Author(s):  
Charlyn Nayve Villavicencio ◽  
Julio Jerison Escudero Macrohon ◽  
Xavier Alphonse Inbaraj ◽  
Jyh-Horng Jeng ◽  
Jer-Guang Hsieh

Early diagnosis is crucial to prevent the development of a disease that may cause danger to human lives. COVID-19, which is a contagious disease that has mutated into several variants, has become a global pandemic that demands to be diagnosed as soon as possible. With the use of technology, available information concerning COVID-19 increases each day, and extracting useful information from massive data can be done through data mining. In this study, authors utilized several supervised machine learning algorithms in building a model to analyze and predict the presence of COVID-19 using the COVID-19 Symptoms and Presence dataset from Kaggle. J48 Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors and Naïve Bayes algorithms were applied through WEKA machine learning software. Each model’s performance was evaluated using 10-fold cross validation and compared according to major accuracy measures, correctly or incorrectly classified instances, kappa, mean absolute error, and time taken to build the model. The results show that Support Vector Machine using Pearson VII universal kernel outweighs other algorithms by attaining 98.81% accuracy and a mean absolute error of 0.012.
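The 10-fold cross validation mentioned in this abstract can be sketched as follows. This is a plain, unstratified split written for illustration; the function name, the seed, and the sample count are assumptions, and a tool's own implementation may differ (for example by stratifying folds by class).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Hypothetical dataset of 25 instances split into 10 folds.
folds = list(k_fold_indices(n_samples=25, k=10))
for fold_number, (train, test) in enumerate(folds, start=1):
    print(fold_number, len(train), len(test))
```

Each instance is used for testing exactly once and for training in the other nine folds, so the reported measure is an average over ten held-out evaluations.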


Liver disease is a worldwide health problem associated with various complications and high mortality. It is critically important that the disease be detected early so that many lives can be saved. The stage of liver disease is an important consideration for targeted treatment. Predicting the disease in its early stages is a very difficult task for medical researchers because the symptoms are subtle; often they become apparent only when it is too late. To overcome this problem, we turn to liver disease prediction. Liver disease can be detected with numerous classification techniques, and these have been classified by their use of various features and classifier combinations. In this study, we applied five classifiers, namely Naïve Bayes, logistic regression, support vector machines, Random Forest, and K-Nearest Neighbour, to the analysis of liver disease. Classification performance is assessed with five overall performance metrics: precision, kappa, mean absolute error (MAE), root mean square error (RMSE), and F-measure. The objective of this research is to predict liver disease with different machine learning algorithms and select the most efficient one.


2020 ◽  
Vol 5 (3) ◽  
pp. 43-53
Author(s):  
Nor Hayati Binti Shafii ◽  
Rohana Alias ◽  
Nur Fithrinnissaa Zamani ◽  
Nur Fatihah Fauzi

Air pollution is a closely monitored problem in areas with high population density, such as big cities. Many regions in Malaysia are facing extreme air quality issues. This situation is caused by several factors, such as human behavior, environmental awareness and technological development. Assessing the air pollution index (API) accurately is very important for controlling its impact on environmental and human health. The work presented here aims to assess the air pollution index of PM2.5 using a Support Vector Machine (SVM) and to compare the accuracy of four different types of SVM kernel function. The data used are provided by the Department of Environment (DOE) and were recorded at two Continuous Air Quality Monitoring Stations (CAQM) located at Tanah Merah and Kota Bharu. The results are analyzed using mean absolute error (MAE) and root mean squared error (RMSE). It is found that the proposed model using the Radial Basis Function (RBF) kernel, with its cost and gamma parameters both equal to 100, can effectively and accurately forecast the air pollution index, with an MAE and RMSE of 0.03868583 and 0.06251793 respectively for the API in Kota Bharu, and 0.03857308 (MAE) and 0.05895648 (RMSE) for the API in Tanah Merah.
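To make the role of gamma concrete, here is a minimal sketch of the RBF kernel, assuming the common parameterization K(x, y) = exp(-gamma * ||x - y||^2). The abstract does not state which software or parameterization was used, and the feature vectors below are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel, assuming the form exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical scaled feature vectors (illustrative only, not study data).
same = rbf_kernel([0.2, 0.5], [0.2, 0.5], gamma=100)
near = rbf_kernel([0.2, 0.5], [0.3, 0.5], gamma=100)
far = rbf_kernel([0.2, 0.5], [0.7, 0.5], gamma=100)
print(same, near, far)
```

A large gamma makes similarity fall off quickly with distance, so only nearby training points influence a prediction. The cost parameter, by contrast, sets how strongly training errors are penalized relative to model smoothness.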


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 3954
Author(s):  
Na Zhao ◽  
Zemeng Fan ◽  
Miaomiao Zhao

Dissolved oxygen (DO) is a direct indicator of water pollution and an important water quality parameter that affects aquatic life. Based on the fundamental theorem of surfaces in differential geometry, the present study proposes a new modeling approach to estimate DO concentrations with high accuracy by assessing the spatial correlation and heterogeneity of DO with respect to explanatory variables. Specifically, a regularization penalty term is integrated into the high-accuracy surface modeling (HASM) method by applying geographically weighted regression (GWR) with some covariates. A modified version of HASM, namely HASM_MOD, is illustrated through a case study of Poyang Lake, China, by comparing the results of HASM, a support vector machine (SVM), and cokriging. The results indicate that HASM_MOD yields the best performance, with a mean absolute error (MAE) that is 38%, 45%, and 42% lower than those of HASM, the SVM, and cokriging, respectively, by using the cross-validation method. The introduction of a regularization penalty term by using GWR with respect to covariates can effectively improve the quality of the DO estimates. The results also suggest that HASM_MOD is able to effectively estimate nonlinear and nonstationary time series and outperforms three other methods using cross-validation, with a root-mean-square error (RMSE) of 0.20 mg/L and R2 of 0.93 for the two study sites (Sanshan and Outlet_A stations). The proposed method, HASM_MOD, provides a new way to estimate the DO concentration with high accuracy.


2012 ◽  
Vol 212-213 ◽  
pp. 436-440
Author(s):  
Yan Zhao ◽  
Zeng Chuan Dong ◽  
Qing Hang Li

This article briefly introduces Least Squares Support Vector Machine (LS-SVM) regression analysis and prediction methods. A radial basis kernel function was chosen, and a streamflow forecast model was constructed using a Matlab toolbox. The model was then validated with a case study, which showed that the prediction model achieves high accuracy and fast convergence. According to the analysis of the results, LS-SVM is feasible for streamflow forecasting.


2018 ◽  
Vol 14 (2) ◽  
pp. 175
Author(s):  
Elly Indrayuni

Films are a subject of interest to a large number of people across social network communities, whose opinions or sentiments differ significantly. Sentiment analysis, or opinion mining, is one solution to the problem of automatically grouping opinions or reviews into positive or negative. The techniques used in this study are Naive Bayes and Support Vector Machines (SVM). Naive Bayes has the advantages of being simple, fast, and highly accurate, while SVM can identify a separating hyperplane that maximizes the margin between two different classes. The sentiment classification results in this study consist of two class labels, positive and negative. The resulting accuracy serves as the benchmark for finding the best model for sentiment classification. Evaluation is performed using 10-fold cross validation. Accuracy is measured with a confusion matrix and ROC curve. The results show an accuracy of 84.50% for the Naive Bayes algorithm, while the accuracy of the Support Vector Machine (SVM) algorithm is higher than that of Naive Bayes, at 90.00%.
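As a small illustration of how accuracy is read from a confusion matrix for a two-class sentiment task, consider the sketch below. The function names and the labels are hypothetical and are not data from the study.

```python
def confusion_matrix(actual, predicted, positive="positive"):
    """Count true/false positives and negatives for a two-class problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Share of all reviews whose predicted label matches the actual label."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


# Hypothetical review labels (illustrative only).
actual = ["positive", "positive", "negative", "negative", "positive"]
predicted = ["positive", "negative", "negative", "negative", "positive"]

cm = confusion_matrix(actual, predicted)
acc = accuracy(cm)
print(cm, acc)
```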


2011 ◽  
Vol 131 (8) ◽  
pp. 1495-1501
Author(s):  
Dongshik Kang ◽  
Masaki Higa ◽  
Hayao Miyagi ◽  
Ikugo Mitsui ◽  
Masanobu Fujita ◽  
...  
