Quantitative identification of coal texture using the support vector machine with geophysical logging data: A case study using medium-rank coal from the Panjiang, Guizhou, China

2020 ◽  
Vol 8 (4) ◽  
pp. T753-T762
Author(s):  
Zhenghui Xiao ◽  
Wei Jiang ◽  
Bin Sun ◽  
Yunjiang Cao ◽  
Lei Jiang ◽  
...  

Coal texture is important for predicting coal seam permeability and selecting favorable blocks for coalbed methane (CBM) exploration. Drilled cores and mining seam observations are the most direct and effective methods of identifying coal texture; however, they are expensive and cannot be used in unexplored coal seams. Geophysical logging has become a common method of coal texture identification, particularly during the CBM mining stage. However, quantitative methods for identifying coal texture based on geophysical logging data require further study. The support vector machine (SVM), a machine-learning method, has received great interest due to its remarkable generalization performance, and it has been used to quantitatively identify hard and soft coal using geophysical logging data. In this study, four well-logging curves, the acoustic time difference (AC), caliper log (CAL), density (DEN), and natural gamma (GR), were used for coal texture analysis. Hard coal (undeformed and cataclastic coal) exhibited higher DEN and GR, and lower CAL and AC, than soft coal. The accuracy rate of coal texture identification was highest (97%) when the linear kernel function was applied, and the maximum training accuracy rate was achieved when the penalty parameter value of the linear kernel increased to 1. The results of verification with a newly cored CBM exploration well indicated that the SVM-based identification method was effective for coal texture analysis. With the increasing availability of data, this method can be used to distinguish hard and soft coal in a coal-bearing basin under numerous sample learning conditions.
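The role of the penalty parameter mentioned above can be illustrated with the standard soft-margin objective of a linear SVM, 0.5·||w||² + C·Σ max(0, 1 − y(w·x + b)). The sketch below is not the study's implementation: the function names, the weight vector, and the two toy samples are hypothetical and chosen only to show how C scales the cost of margin violations.

```python
from typing import Sequence


def hinge_loss(w: Sequence[float], b: float, x: Sequence[float], y: int) -> float:
    """Hinge loss of one sample; y must be +1 or -1."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return max(0.0, 1.0 - margin)


def soft_margin_objective(w, b, X, Y, C: float) -> float:
    """Standard soft-margin objective: regularizer plus C times total hinge loss."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    penalty = C * sum(hinge_loss(w, b, x, y) for x, y in zip(X, Y))
    return regularizer + penalty


# Hypothetical toy values (not logging data): one sample outside the margin,
# one sample inside it.
toy_w, toy_b = [1.0, 0.0], 0.0
toy_X = [[2.0, 0.0], [-0.5, 0.0]]
toy_Y = [1, -1]

for C in (0.01, 0.1, 1.0):
    print(C, soft_margin_objective(toy_w, toy_b, toy_X, toy_Y, C))
```

A larger C makes each margin violation more expensive, so the optimizer tolerates fewer of them; a smaller C favors a wider margin at the cost of more violations.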

Risks ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 52
Author(s):  
Santosh Kumar Shrivastav ◽  
P. Janaki Ramudu

Banks play a vital role in strengthening the financial system of a country; hence, their survival is decisive for the stability of national economies. Therefore, analyzing the survival probability of banks is an essential and continuing research activity. However, the available literature indicates that research on quantifying bank stress is limited in countries like India, where there have been fewer failed banks. The literature also indicates a lack of scientific and quantitative approaches that can be used to predict bank survival and failure probabilities. Against this backdrop, the present study attempts to establish a bankruptcy prediction model using a machine learning approach and to compute and compare the financial stress that banks face. The study uses data on failed and surviving private and public sector banks in India for the period January 2000 through December 2017. The explanatory features of bank failure are chosen using a two-step feature selection technique. In the first step, a relief algorithm is used for primary screening of useful features; in the second step, the important features are fed into a support vector machine to create a forecasting model. The threshold values of the features that separate failed banks from surviving banks are calculated from the decision boundary of the support vector machine with a linear kernel. The results reveal, inter alia, that the support vector machine with a linear kernel shows 92.86% forecasting accuracy, while the support vector machine with a radial basis function kernel shows 71.43% accuracy. The study helps to carry out comparative analyses of the financial stress of banks and has significant implications for the decisions of various stakeholders such as shareholders, bank management, analysts, and policymakers.
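One way to read a per-feature threshold off a linear decision boundary w·x + b = 0 is to hold every other feature at a reference value and solve for the remaining one. The abstract does not state how the study computed its thresholds, so the sketch below is an assumption; `feature_threshold`, the weights, and the reference point are hypothetical.

```python
from typing import Sequence


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed value of the linear decision function w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def feature_threshold(w: Sequence[float], b: float,
                      reference: Sequence[float], j: int) -> float:
    """Value of feature j at which w.x + b = 0 when all other features
    are held at their reference values (an assumed convention)."""
    if w[j] == 0:
        raise ValueError("feature j does not influence the decision boundary")
    rest = sum(wi * xi for i, (wi, xi) in enumerate(zip(w, reference)) if i != j)
    return -(b + rest) / w[j]


# Hypothetical weights and reference point, for illustration only.
w, b = [2.0, -1.0], -4.0
reference = [1.0, 3.0]
t0 = feature_threshold(w, b, reference, j=0)
print(t0, decision_value(w, b, [t0, reference[1]]))
```

With standardized features the reference point is often the feature mean (zero), in which case the threshold reduces to -b / w[j].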


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Gaoteng Yuan ◽  
Yihui Liu ◽  
Wei Huang ◽  
Bing Hu

Purpose. The objective of this study is to investigate the use of texture analysis (TA) of enhanced magnetic resonance imaging (MRI) scans and machine learning methods for distinguishing different grades of breast invasive ductal carcinoma (IDC). Preoperative prediction of the grade of IDC can provide a reference for different clinical treatments, so it has important practical value in the clinic. Methods. First, a breast cancer segmentation model based on the discrete wavelet transform (DWT) and the K-means algorithm is proposed. Second, TA is performed, and Gabor wavelet analysis is used to extract the texture features of an MRI tumor. Then, according to the distance relationship between the features, key features are sorted and feature subsets are selected. Finally, the feature subset is classified using a support vector machine, whose parameters are adjusted to achieve the best classification effect. Results. By selecting key features for classification prediction, the classification accuracy of the model can reach 81.33%. Under 3-, 4-, and 5-fold cross-validation, the prediction accuracy of the support vector machine model ranges from 77.79% to 81.94%. Conclusion. The pathological grading of IDC can be predicted and evaluated by texture analysis and feature extraction of breast tumors. This method can provide valuable information for doctors' clinical diagnosis. With further development, the model demonstrates high potential for practical clinical use.
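The 3-, 4-, and 5-fold cross-validation reported above partitions the samples into k folds and uses each fold once as the test set. The following is a minimal sketch of that partitioning using only the standard library; the function name, the seed, and the sample count of 20 are hypothetical, and the study's own splitting procedure is not described in the abstract.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs covering all samples once as test."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k nearly equal-sized folds
    splits = []
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(i for f, fold in enumerate(folds) if f != held_out for i in fold)
        splits.append((train, test))
    return splits


for k in (3, 4, 5):
    sizes = [len(test) for _, test in k_fold_indices(20, k)]
    print(k, sizes)
```

A classifier would be trained on each `train` list and scored on the matching `test` list, and the k scores averaged to give one cross-validated accuracy per value of k.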


2020 ◽  
Vol 7 (1) ◽  
pp. 53
Author(s):  
Derisma Derisma ◽  
Fajri Febrian

Abstract: Breast cancer is a type of cancer commonly found in women. In Indonesia, breast cancer ranks first among inpatients across all hospitals. This study aimed to perform a computation-based diagnosis of breast cancer that reports a patient's cancer status based on the accuracy of the algorithm. The study used Orange (Python) and the Wisconsin Breast Cancer dataset to model breast cancer classification. The data mining methods applied were Neural Network, Support Vector Machine, and Naive Bayes. The best classification algorithm was Kernel SVM, with an accuracy rate of 98.9%, and the lowest was Naive Bayes, with an accuracy rate of 96.1%.
Keywords: breast cancer, neural network, support vector machine, naive bayes


Author(s):  
Daniel Febrian Sengkey ◽  
Agustinus Jacobus ◽  
Fabian Johanes Manoppo

Support vector machine (SVM) is a well-known method for supervised learning in sentiment analysis, and there are many studies on the use of SVM for classifying sentiment in lecturer evaluations. SVM has various parameters that can be tuned and kernels that can be chosen to improve classifier accuracy. However, not all options have been explored. Therefore, in this study we compared four SVM kernels (radial, linear, polynomial, and sigmoid) to discover how each kernel influences the accuracy of the classifier. To make a proper assessment, we used our labeled dataset of students' evaluations of their lecturers. The dataset was split into two parts, one for training the classifier and the other for testing the model. In addition, we used several different training:testing ratios. The split ratios ranged from 0.5 to 0.95 in increments of 0.05. Because the dataset was split randomly, the splitting-training-testing process was repeated 1,000 times for each kernel and split ratio. Therefore, at the end of the experiment, we obtained 40,000 accuracy values. We then applied statistical methods to see whether the differences were significant. Based on the statistical test, we found that in this particular case, the linear kernel has significantly higher accuracy than the other kernels. However, there is a tradeoff: the results vary more as a higher proportion of the data is used for training.
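The experiment design above (4 kernels × 10 split ratios × 1,000 random repetitions = 40,000 accuracy values) can be sketched as a small harness. This is an illustration under stated assumptions, not the authors' code: `run_experiment` and the `evaluate` callback are hypothetical, the split ratio is assumed to be the training fraction, and the actual SVM training is left to whatever `evaluate` wraps.

```python
import random
from typing import Callable, Dict, List, Tuple

KERNELS = ("radial", "linear", "polynomial", "sigmoid")
# Split ratios 0.50, 0.55, ..., 0.95 (assumed to be the training fraction).
RATIOS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def run_experiment(
    n_samples: int,
    evaluate: Callable[[str, List[int], List[int]], float],
    repeats: int = 1000,
    seed: int = 0,
) -> Dict[Tuple[str, float], List[float]]:
    """Repeat random split -> train -> test for every kernel and split ratio.

    `evaluate(kernel, train_idx, test_idx)` is a hypothetical callback that
    trains a classifier with the given kernel and returns its test accuracy.
    """
    rng = random.Random(seed)
    results: Dict[Tuple[str, float], List[float]] = {}
    for kernel in KERNELS:
        for ratio in RATIOS:
            cut = round(ratio * n_samples)       # size of the training part
            scores = []
            for _ in range(repeats):
                idx = list(range(n_samples))
                rng.shuffle(idx)                 # new random split each repetition
                scores.append(evaluate(kernel, idx[:cut], idx[cut:]))
            results[(kernel, ratio)] = scores
    return results


# Total number of accuracy values produced by the full design.
print(len(KERNELS) * len(RATIOS) * 1000)
```

Keeping the scores grouped by (kernel, ratio) makes it straightforward to run a significance test across kernels afterwards and to compare the spread of accuracies at each split ratio.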


2012 ◽  
Vol 5 (1) ◽  
pp. 33 ◽  
Author(s):  
Rama Adhitia ◽  
Ayu Purwarianti

This paper examines a solution to the problem of automatically scoring essay answers by combining the support vector machine (SVM), as an automatic text classification technique, with LSA, as a way of handling synonymy and polysemy among index terms. Unlike conventional essay scoring systems, which use index terms as features, the proposed scoring process uses generic features, which allow an essay scoring model to be tested on a variety of different questions. With these generic features, there is no need to retrain the model when scoring essay answers to several different questions. The features are the percentage of keyword occurrences, the similarity between the essay answer and the reference answer, the percentage of key idea occurrences, the percentage of wrong idea occurrences, and the percentage of keyword synonym occurrences. The test results also show that the proposed method has a higher scoring accuracy than other methods, such as SVM or LSA, that use index terms as machine learning features.


Author(s):  
Nor Ain Maisarah Samsudin et al.

This study statistically investigated the pattern of students' academic performance before and after the move to online learning under the Movement Control Order (MCO) during the pandemic outbreak, and modelled students' academic performance using Support Vector Machine (SVM) classification. The data sample was taken from undergraduate students of the Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris (UPSI). Students' Grade Point Averages (GPA) were obtained to develop a model of academic performance during the Covid-19 outbreak. The prediction model was used to predict the academic performance of university students when online classes were conducted. The SVM algorithm has two important parameters, C (the misclassification tolerance parameter) and epsilon, which must be identified before further analysis. The parameters were applied to four types of kernel: linear, radial basis function, polynomial, and sigmoid. The best accuracy achieved by the SVM was 73.68%, using the linear kernel, and the worst accuracy, 67.99%, was obtained from the sigmoid kernel, with the misclassification tolerance parameter C set to 128 and epsilon set to 0.6.

