Watershed Prioritization and Decision Making Based On Weighted Sum Analysis, Feature Ranking and Machine Learning Techniques.

Abstract: Prediction and validation of compound factors for watershed prioritization is an essential application of Machine Learning (ML) techniques in water resources engineering. This paper proposes a method that derives 14 morphometric and 3 topo-hydrological parameters using Remote Sensing (RS) and a Geographical Information System (GIS), and then predicts and validates the compound factor using ML techniques. Compound factor (CF) values are calculated using Weighted Sum Analysis (WSA), ReliefF, and correlation coefficient techniques. Ten-fold cross-validation is applied to two machine learning models, Multi-Layer Perceptron (MLP) and Support Vector Machine (SVM). The prediction accuracy of the models is further improved by feature ranking. The accuracy of the ML models is evaluated with three measures: Mean Absolute Error (MAE), Correlation Coefficient (CC), and Root Mean Square Error (RMSE). With the ranked features and ten-fold cross-validation, prediction results were found to be better. The methodology will be useful for accurately predicting CF values and for reducing uncertainty in watershed prioritization for soil and water conservation.
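The three evaluation measures named in the abstract can be sketched in a few lines of Python. This is only an illustration: the compound factor values below are made up, and the correlation coefficient is assumed here to be Pearson's r, which the abstract does not state.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average magnitude of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_square_error(actual, predicted):
    """Square root of the mean squared prediction error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values (assumed variant)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Hypothetical compound factor values, for illustration only.
actual_cf = [1.0, 2.0, 3.0, 4.0]
predicted_cf = [1.5, 2.0, 2.5, 4.0]

mae = mean_absolute_error(actual_cf, predicted_cf)
rmse = root_mean_square_error(actual_cf, predicted_cf)
cc = correlation_coefficient(actual_cf, predicted_cf)
```

Lower MAE and RMSE and a CC closer to 1 indicate better agreement between predicted and reference CF values.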

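Analisis Sentimen Pada Maskapai Penerbangan di Platform Twitter Menggunakan Algoritma Support Vector Machine (SVM) [Sentiment Analysis of Airlines on the Twitter Platform Using the Support Vector Machine (SVM) Algorithm]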

Teknika ◽

10.34148/teknika.v10i1.311 ◽

2021 ◽

Vol 10 (1) ◽

pp. 18-26

Author(s):

Hendry Cipta Husada ◽

Adi Suryaputra Paramita

Keyword(s):

Machine Learning ◽

Social Media ◽

Support Vector Machine ◽

Cross Validation ◽

Support Vector ◽

Learning Approach ◽

Social Media Platform ◽

Machine Learning Approach ◽

Media Platform ◽

Fold Cross Validation

Current technological developments have made it easy for many people to obtain and share information on various social media platforms. Twitter is one of the media frequently used to express opinions as a reaction to something. Airline companies can use opinions posted on Twitter as a key parameter for gauging public satisfaction and as material for evaluating the company. This calls for a method that can automatically classify opinions into positive, negative, or neutral categories through sentiment analysis. The sentiment analysis process consists of data preprocessing, term weighting using the TF-IDF method, application of the algorithm, and discussion of the classification results. Opinions are classified with a machine learning approach using the multi-class Support Vector Machine (SVM) algorithm. The data used in this study are English-language opinions from Twitter users about airlines. Based on the tests performed, the best classification results were obtained using an SVM with an RBF kernel at parameter values C (complexity) = 10 and γ (gamma) = 1, with an accuracy of 84.37%, and 80.41% when using 10-fold cross validation.
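The TF-IDF term weighting step mentioned above can be illustrated with a minimal sketch. It assumes one common variant (term frequency normalized by document length, and idf = log(N / df)); the paper may use a different one, and the example tweets are invented.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / df).
    Other TF-IDF variants exist; this one is chosen for simplicity.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical tweets, already lowercased and tokenized.
tweets = [
    "flight delayed again".split(),
    "great flight crew".split(),
    "delayed luggage again".split(),
]
weights = tf_idf(tweets)
```

Terms that occur in few tweets, such as "great", receive a higher weight than terms shared by many tweets, such as "flight". The resulting vectors would then be passed to the classifier.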

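Perbandingan Akurasi dan Waktu Proses Algoritma K-NN dan SVM dalam Analisis Sentimen Twitter [Comparison of the Accuracy and Processing Time of the K-NN and SVM Algorithms in Twitter Sentiment Analysis]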

Jurnal Informatika ◽

10.31311/ji.v6i2.5129 ◽

2019 ◽

Vol 6 (2) ◽

pp. 226-235

Author(s):

Muhammad Rangga Aziz Nasution ◽

Mardhiya Hayaty

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Unsupervised Learning ◽

Supervised Learning ◽

Cross Validation ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Fold Cross Validation

Machine learning, a branch of computer science, has become a trend in recent times. Machine learning works by using data and algorithms to build models from the patterns in a data set. It also studies how the resulting model can predict outputs based on existing patterns. Two types of machine learning methods can be used for sentiment analysis: supervised learning and unsupervised learning. This study compares two supervised classification algorithms, K-Nearest Neighbor and Support Vector Machine, by building a model with each algorithm on sentiment text. The comparison is carried out to determine which algorithm is better in terms of accuracy and processing time. The accuracy results show that the Support Vector Machine method is superior, with 89.70% without K-Fold Cross Validation and 88.76% with K-Fold Cross Validation. In terms of processing time, the K-Nearest Neighbor method is superior, with a processing time of 0.0160 s without K-Fold Cross Validation and 0.1505 s with K-Fold Cross Validation.
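As a rough illustration of the K-Nearest Neighbor side of this comparison, the sketch below classifies a point by majority vote and times the call. The 2-D feature vectors and labels are hypothetical stand-ins for vectorized tweets, and Euclidean distance is an assumption; the paper's features and distance measure are not given in the abstract.

```python
import math
import time
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors standing in for vectorized tweets.
train_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
                (1.0, 0.9), (0.9, 1.1), (1.2, 1.0)]
train_labels = ["negative", "negative", "negative",
                "positive", "positive", "positive"]

start = time.perf_counter()
label = knn_predict(train_points, train_labels, (0.95, 1.0))
elapsed_seconds = time.perf_counter() - start
```

K-NN has no training phase, so its measured time is dominated by distance computations at prediction time, which is one reason timing comparisons against SVM depend heavily on dataset size.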

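Prediksi Waktu Kedatangan Pelanggan Servis Kendaraan Bermotor Berdasarkan Data Historis menggunakan Support Vector Machine [Predicting the Arrival Time of Motor Vehicle Service Customers Based on Historical Data Using a Support Vector Machine]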

Jurnal Edukasi dan Penelitian Informatika (JEPIN) ◽

10.26418/jp.v7i1.42964 ◽

2021 ◽

Vol 7 (1) ◽

pp. 25

Author(s):

Benni Agung Nugroho ◽

Andika Kurnia Adi Pradana ◽

Ellya Nurfarida

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Cross Validation ◽

Support Vector ◽

Fold Cross Validation

Vehicle dealers need to maintain good relationships with customers so that the dealer's core business can continue and grow. One strategy is to predict when a customer will next visit for vehicle service (maintenance or repair) based on an analysis of the customer's visit history. With a prediction of the day of the customer's future arrival, the dealer can remind customers when their vehicle is due for service. A support vector machine (SVM) is a machine learning model that uses a hyperplane and support vectors to optimally separate classes in a dimensional space, which makes it suitable for the problem of predicting customer arrival time. SVM is implemented to predict when customers will return for vehicle repair or maintenance. The results show that, with an appropriate choice of method, SVM can predict customer arrival time with an accuracy of up to 92.5% based on K-Fold cross-validation on the training data, and an average of 97.33% for precision, accuracy, and recall on the test data.

Research of Machine Learning algorithms using K-fold cross validation

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1043.0886s19 ◽

2019 ◽

Vol 8 (6S) ◽

pp. 215-218

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Research Area ◽

Machine Learning Algorithms ◽

Support Vector ◽

Breast Cancer Dataset ◽

Cancer Dataset ◽

Validation Data ◽

Machine Learning Classification ◽

Fold Cross Validation

In machine learning, classification is one of the most important research areas. Classification assigns a given input to a known category. In this paper, different machine learning algorithms, namely logistic regression (LR), decision tree (DT), support vector machine (SVM), and K-nearest neighbors (KNN), were implemented on the UCI breast cancer dataset with preprocessing. The models were trained and tested with k-fold cross-validation. The accuracy and run time of each classifier were evaluated using a Python implementation.
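For readers unfamiliar with k-fold cross-validation, the following minimal sketch shows how sample indices can be partitioned into k train/test splits. It is not the paper's code; the function name is hypothetical, and no shuffling or stratification is applied.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop

folds = list(k_fold_indices(10, 5))
```

Each sample appears in exactly one test fold, so every model is evaluated on data it was not trained on, and the k scores are typically averaged.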

Special Issue on Using Machine Learning Algorithms in the Prediction of Kyphosis Disease: A Comparative Study

Applied Sciences ◽

10.3390/app9163322 ◽

2019 ◽

Vol 9 (16) ◽

pp. 3322 ◽

Cited By ~ 2

Author(s):

Stephen Dankwa ◽

Wenfeng Zheng

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Cross Validation ◽

Machine Learning Algorithms ◽

Support Vector ◽

Grid Search ◽

Baseline Model ◽

Vector Machines ◽

Ann Models ◽

Fold Cross Validation

Machine learning (ML) is the technology that allows a computer system to learn from the environment through iterative processes and to improve itself from experience. Recently, machine learning has gained massive attention across numerous fields, and it makes it easy to model data extremely well without requiring strong assumptions about the modeled system. The rise of machine learning has proven to describe data better by providing both engineering solutions and an important benchmark. In this work, we applied three machine learning algorithms, Random Forest (RF), Support Vector Machines (SVM), and Artificial Neural Network (ANN), to predict kyphosis disease from biomedical data. At the initial stage of the experiments, we performed 5- and 10-fold cross-validation using Logistic Regression as a baseline model to compare with our ML models without performing grid search. We then evaluated the models and compared their performance based on 5- and 10-fold cross-validation after running grid search on the ML models. For the Support Vector Machines, we experimented with three kernels (Linear, Radial Basis Function (RBF), and Polynomial). After running grid search, we observed overall model accuracies of 79%–85% and 77%–86% based on 5- and 10-fold cross-validation, respectively. Under both 5- and 10-fold cross-validation, the RF, SVM-RBF, and ANN models achieved accuracies above 80%. The RF, SVM-RBF, and ANN models outperformed the baseline model based on 10-fold cross-validation with grid search. Overall, in terms of accuracy, the ANN model outperformed all the other ML models, achieving 85.19% and 86.42% based on 5- and 10-fold cross-validation. We propose that the RF, SVM-RBF, and ANN models be used to detect and predict kyphosis disease after a patient has undergone surgery. We suggest that machine learning be adopted as an essential tool across the widest possible range of biomedical questions.
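Grid search, as used in this study, amounts to evaluating every combination of candidate hyperparameter values and keeping the best-scoring one. The sketch below shows that loop in isolation. The parameter names C and gamma, their candidate values, and the scoring function are all hypothetical; in a real experiment the score would be the mean cross-validated accuracy of a model trained with those parameters.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_cv_accuracy(params):
    """Hypothetical stand-in for a mean cross-validated accuracy."""
    return 0.9 - 0.01 * abs(params["C"] - 10) - 0.05 * abs(params["gamma"] - 0.1)

# Hypothetical candidate values for an RBF-kernel SVM.
grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}
best_params, best_score = grid_search(grid, toy_cv_accuracy)
```

Because the cost grows with the product of the grid sizes, grids are usually kept coarse and refined around the best region.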

Computation of High-Performance Concrete Compressive Strength Using Standalone and Ensembled Machine Learning Techniques

Materials ◽

10.3390/ma14227034 ◽

2021 ◽

Vol 14 (22) ◽

pp. 7034

Author(s):

Yue Xu ◽

Waqas Ahmad ◽

Ayaz Ahmad ◽

Krzysztof Adam Ostrowski ◽

Marta Dudek ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Support Vector Regression ◽

High Performance ◽

Cross Validation ◽

High Performance Concrete ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Techniques ◽

Fold Cross Validation

The current trend in modern research revolves around novel techniques that can predict the characteristics of materials without consuming time, effort, and experimental costs. The adaptation of machine learning techniques to compute the various properties of materials is gaining more attention. This study aims to use both standalone and ensemble machine learning techniques to forecast the 28-day compressive strength of high-performance concrete. One standalone technique (support vector regression (SVR)) and two ensemble techniques (AdaBoost and random forest) were applied for this purpose. To validate the performance of each technique, coefficient of determination (R²), statistical, and k-fold cross-validation checks were used. Additionally, the contribution of input parameters towards the prediction of results was determined by applying sensitivity analysis. It was proven that all the techniques employed showed improved performance in predicting the outcomes. The random forest model was the most accurate, with an R² value of 0.93, compared to the support vector regression and AdaBoost models, with R² values of 0.83 and 0.90, respectively. In addition, statistical and k-fold cross-validation checks validated the random forest model as the best performer based on lower error values. However, the prediction performance of the support vector regression and AdaBoost models was also within an acceptable range. This shows that novel machine learning techniques can be used to predict the mechanical properties of high-performance concrete.
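The coefficient of determination used to compare the models can be computed as one minus the ratio of residual to total sum of squares. The sketch below uses invented compressive strength values purely for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical 28-day compressive strengths in MPa, for illustration only.
measured_mpa = [30.0, 40.0, 50.0, 60.0]
predicted_mpa = [32.0, 38.0, 52.0, 58.0]

r2 = r_squared(measured_mpa, predicted_mpa)
```

A value near 1 means the model explains most of the variance in the measured strengths; a value near 0 means it does little better than predicting the mean.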

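K-MEANS SEBAGAI EKSTRAKTOR CIRI PADA KLASIFIKASI DATA DENGAN ALGORITMA SUPPORT VECTOR MACHINE (SVM) [K-Means as a Feature Extractor in Data Classification with the Support Vector Machine (SVM) Algorithm]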

Simetris Jurnal Teknik Mesin Elektro dan Ilmu Komputer ◽

10.24176/simet.v9i2.2433 ◽

2018 ◽

Vol 9 (2) ◽

pp. 889-896

Author(s):

Nurul Chamidah

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Heart Disease ◽

Membership Function ◽

Cross Validation ◽

Fuzzy Membership ◽

Fuzzy Membership Function ◽

Support Vector ◽

Fold Cross Validation

High feature dimensionality is a computational problem in data classification, so a feature extraction process is needed to reduce the dimensionality by retaining only the important information from the features. This study uses the K-Means algorithm to extract features by finding hidden patterns in each class, which are then reconstructed with a fuzzy membership function to obtain new patterns. The new patterns are used as abstract features and divided into training data and test data. Training is performed with the Support Vector Machine (SVM) algorithm to obtain a classification model. The resulting SVM classification model is then tested on the test data to obtain classification performance in terms of accuracy and computation time. With 5-fold cross validation, this method gives good accuracy on the Liver, Breast Cancer, and Heart Disease datasets obtained from the UCI Machine Learning Repository. This study shows the ability of K-Means to extract features from a dataset. The results show that K-Means as a feature extractor can reduce computation time.

Machine learning algorithm for improving performance on 3 AQ-screening classification

Communications in Science and Technology ◽

10.21924/cst.4.2.2019.118 ◽

2019 ◽

Vol 4 (2) ◽

pp. 44-49

Author(s):

Taftazani Ghazi Pratama ◽

Rudy Hartanto ◽

Noor Akhmad Setiawan

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Learning Algorithm ◽

Autism Spectrum ◽

Support Vector ◽

Autism Spectrum Quotient ◽

Machine Learning Algorithm ◽

Study Support ◽

Artificial Neural Network Ann ◽

Fold Cross Validation

Autism Spectrum Disorder (ASD) classification using machine learning can help parents, caregivers, psychiatrists, and patients obtain early detection results for ASD. The datasets used in this study are the autism-spectrum quotient for children, adolescents, and adults, namely AQ-child, AQ-adolescent, and AQ-adult. This study aims to improve on the sensitivity and specificity of previous studies so that ASD classification results are better, as indicated by reduced misclassification. The algorithms applied in this study are support vector machine (SVM), random forest (RF), and artificial neural network (ANN). Evaluation using 10-fold cross validation showed that RF produced the higher sensitivity on AQ-adult, at 87.89%. The improvement in specificity on AQ-adolescent was best achieved with SVM, at 86.33%.
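Sensitivity and specificity, the two measures this study targets, follow directly from the counts of a binary confusion matrix. The sketch below assumes labels of 1 for ASD and 0 for non-ASD; the label vectors are invented for illustration.

```python
def sensitivity_specificity(actual, predicted, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    return sensitivity, specificity

# Hypothetical screening labels: 1 = ASD, 0 = non-ASD.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

sensitivity, specificity = sensitivity_specificity(actual, predicted)
```

In a screening setting, a missed case (lower sensitivity) and a false alarm (lower specificity) carry different costs, which is why the two are reported separately rather than folded into accuracy.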

Hospital Facebook Reviews Analysis Using a Machine Learning Sentiment Analyzer and Quality Classifier

Healthcare ◽

10.3390/healthcare9121679 ◽

2021 ◽

Vol 9 (12) ◽

pp. 1679

Author(s):

Afiq Izzudin A. Rahim ◽

Mohd Ismail Ibrahim ◽

Sook-Ling Chua ◽

Kamarul Imran Musa

Keyword(s):

Machine Learning ◽

Social Media ◽

Sentiment Analysis ◽

Cross Validation ◽

Healthcare Providers ◽

Hospital Quality ◽

Public Hospitals ◽

Learning System ◽

Support Vector ◽

Fold Cross Validation

While experts have recognised the significance and necessity of social media integration in healthcare, no systematic method has been devised in Malaysia or Southeast Asia to include social media input in the hospital quality improvement process. The goal of this work is to explain how to develop a machine learning system for classifying Facebook reviews of public hospitals in Malaysia by using service quality (SERVQUAL) dimensions and sentiment analysis. We developed a Machine Learning Quality Classifier (MLQC) based on the SERVQUAL model and a Machine Learning Sentiment Analyzer (MLSA) by manually annotating multiple batches of randomly chosen reviews. Logistic regression (LR), naive Bayes (NB), support vector machine (SVM), and other methods were used to train the classifiers. The performance of each classifier was tested using 5-fold cross validation. For topic classification, the average F1-score was between 0.687 and 0.757 for all models. In a 5-fold cross validation of each SERVQUAL dimension and in sentiment analysis, SVM consistently outperformed other methods. The study demonstrates how to use supervised learning to automatically identify SERVQUAL domains and sentiments from patient experiences on a hospital's Facebook page. Malaysian healthcare providers can gather and assess data on patient care using this content analysis technology to improve hospital quality of care.
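An average F1-score like the one reported above can be sketched as follows. The abstract does not say how the average was taken, so macro-averaging (an unweighted mean of per-class F1) is assumed here, and the labels are invented sentiment classes rather than the study's data.

```python
def macro_f1(actual, predicted):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(actual) | set(predicted))
    scores = []
    for label in labels:
        pairs = list(zip(actual, predicted))
        tp = sum(a == label and p == label for a, p in pairs)
        fp = sum(a != label and p == label for a, p in pairs)
        fn = sum(a == label and p != label for a, p in pairs)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)

# Hypothetical review labels, for illustration only.
actual = ["pos", "pos", "neg", "neg", "neu", "neu"]
predicted = ["pos", "neg", "neg", "neg", "neu", "pos"]

score = macro_f1(actual, predicted)
```

Macro-averaging gives every class equal weight, so a rarely used category counts as much as a common one.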

Combination of Support Vector Machine and K-Fold cross-validation for prediction of long-term degradation of the compressive strength of marine concrete

International Journal of Computational Physics Series ◽

10.29167/a1i1p120-130 ◽

2018 ◽

Vol 1 (1) ◽

pp. 120-130 ◽

Cited By ~ 1

Author(s):

Chunxiang Qian ◽

Wence Kang ◽

Hao Ling ◽

Hua Dong ◽

Chengyao Liang ◽

...

Keyword(s):

Support Vector Machine ◽

Environmental Factors ◽

Cross Validation ◽

Concrete Strength ◽

Simulation Method ◽

Support Vector ◽

Svm Model ◽

Artificial Neural Network Ann ◽

Influence Degree ◽

Fold Cross Validation

A Support Vector Machine (SVM) model optimized by K-Fold cross-validation was built to predict and evaluate the degradation of concrete strength in a complicated marine environment. Several other mathematical models, such as Artificial Neural Network (ANN) and Decision Tree (DT), were also built and compared with the SVM to determine which could make the most accurate predictions. Both the material factors and the environmental factors that influence the results were considered. The material factors mainly involved the original concrete strength and the amount of cement replaced by fly ash and slag. The environmental factors consisted of the concentrations of Mg²⁺, SO₄²⁻, and Cl⁻, the temperature, and the exposure time. The prediction results indicated that the optimized SVM model performed better than the other models in predicting concrete strength. Based on the SVM model, a simulation method of variable limitation was used to determine the sensitivity of the various factors and their degree of influence on the degradation of concrete strength.
