A Markov chain-based feature extraction method for classification and identification of cancerous DNA sequences

Bioimpacts ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 87-99
Author(s):  
Amin Khodaei ◽  
Mohammad-Reza Feizi-Derakhshi ◽  
Behzad Mozaffari-Tazehkand

Introduction: In recent decades, the rising incidence of cancer has become a major concern for most societies. Because cancer has genetic origins, studying the disease requires examining its underlying genetic structure. Methods: In this research, cancer data are analyzed based on DNA sequences. The transition probabilities between nucleotide pairs in DNA sequences exhibit the Markov property. This property motivates reducing the feature dimension of DNA sequences to overcome the high computational overhead of gene analysis, and the proposed method builds its feature mapping on it. This mapping decreases the feature dimension while conserving the basic properties needed to discriminate cancerous from non-cancerous genes. Results: The results showed that a non-linear support vector machine (SVM) classifier with RBF and polynomial kernel functions can discriminate the selected cancerous samples from non-cancerous ones. Experimental results based on 10-fold cross-validation and accuracy metrics verified that the proposed method has low computational overhead and high accuracy. Conclusion: The proposed algorithm was successfully tested on case studies from related research. In general, the combination of the proposed Markov-based feature reduction and a non-linear SVM classifier can be considered one of the best methods for discriminating cancerous from non-cancerous genes.
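As a rough illustration of the kind of mapping this abstract describes, the sketch below turns a DNA sequence of any length into a fixed vector of 16 transition probabilities. It assumes a first-order Markov chain over single nucleotides; the abstract does not give the exact construction, so the function name `markov_features` and the choice of chain order are hypothetical and not the authors' implementation.

```python
from itertools import product

NUCLEOTIDES = "ACGT"

def markov_features(sequence):
    """Return 16 first-order transition probabilities P(next | current).

    Assumption: a first-order chain over A, C, G, T. Characters outside
    this alphabet (for example N) are skipped.
    """
    seq = sequence.upper()
    counts = {a: {b: 0 for b in NUCLEOTIDES} for a in NUCLEOTIDES}
    # Count every adjacent pair (current, next) in the sequence.
    for cur, nxt in zip(seq, seq[1:]):
        if cur in counts and nxt in counts[cur]:
            counts[cur][nxt] += 1
    features = []
    # Normalise each row so it sums to 1 (or stays 0 if the row is empty).
    for a, b in product(NUCLEOTIDES, repeat=2):
        row_total = sum(counts[a].values())
        features.append(counts[a][b] / row_total if row_total else 0.0)
    return features

features = markov_features("ACGTACGGTA")
print(len(features))  # fixed length, independent of sequence length
```

Whatever the sequence length, the classifier only sees 16 numbers, which is where the reduction in computational overhead would come from under this assumption.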

Author(s):  
B. Yekkehkhany ◽  
A. Safari ◽  
S. Homayouni ◽  
M. Hasanlou

In this paper, a framework based on Support Vector Machines (SVM) is developed for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imagery. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates than single-date data. Several kernel functions are employed and compared in this study for mapping the input space to a higher-dimensional Hilbert space. These kernel functions include linear, polynomial, and Radial Basis Function (RBF) kernels. The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of the H/A/α decomposition method are used in classification. The experimental tests show that an SVM classifier with an RBF kernel for three dates of data increases the Overall Accuracy (OA) by up to 3% in comparison to a linear kernel function, and by up to 1% in comparison to a 3rd-degree polynomial kernel function.
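For readers unfamiliar with the three kernel families compared here, the following minimal sketch writes them out in their common textbook forms. The parameter names `degree`, `gamma`, and `coef0` and their default values are assumptions for illustration; the abstract does not report the settings used in the study.

```python
import math

def dot(x, y):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    # (gamma * <x, y> + coef0) ** degree; degree=3 mirrors the 3rd-degree case.
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    # exp(-gamma * ||x - y||^2); equals 1 when x == y.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = [1.0, 2.0], [2.0, 0.0]
print(linear_kernel(x, y), polynomial_kernel(x, y), rbf_kernel(x, y))
```

Each function returns the implicit inner product in the mapped feature space, so an SVM never has to construct that space explicitly.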


2021 ◽  
pp. 33-42
Author(s):  
Zehai Xu ◽  
Haiyan Song ◽  
Zhiming Wu ◽  
Zefu Xu ◽  
Shifang Wang

The blurring of crop images acquired by agricultural Unmanned Aerial Vehicles (UAVs), caused by sudden operator inputs, atmospheric disturbance, and many other factors, affects subsequent crop identification, information extraction, and yield estimation. To address this problem, a newly proposed combined deblurring algorithm based on re-weighted graph total variation (RGTV) and an L0-regularized prior, along with two other representative deblurring algorithms, was applied to restore blurry crop images acquired during UAV flight. The restoration performance was measured by subjective visual assessment and objective evaluation indexes. Crop shape-related and texture-related feature parameters were then extracted, and a Support Vector Machine (SVM) classifier with four common kernel functions was implemented for crop classification in order to extract crop information. The deblurring results showed that the proposed algorithm performed better in suppressing the ringing effect and preserving fine image details, and achieved higher objective evaluation indexes than the other two deblurring algorithms. The comparative analysis of the classification kernel functions showed that the polynomial kernel, with an average recognition rate of 94.83%, was the most suitable for crop classification and recognition. The research will help to further popularize crop monitoring based on UAV low-altitude remote sensing.


Author(s):  
Suhas S ◽  
Dr. C. R. Venugopal

This paper presents an enhanced system for classifying MR images that combines several kernels with a support vector machine, together with the design and development of a content-based image retrieval (CBIR) system. Content-based image retrieval is the process of finding relevant images in a large image database using visual queries. The growth of medical imaging has produced large image collections. An Oriented Rician Noise Reduction Anisotropic Diffusion filter is used for image denoising. A modified hybrid Otsu algorithm is used for image segmentation. The texture features are extracted using the GLCM method. A genetic algorithm with joint entropy is adopted for feature selection. Classification is done by a support vector machine with various kernels, and the performance is validated. A classification accuracy of 98.83% is obtained using SVM with the GRBF kernel. Various features have been extracted, and these features are used to classify MR images into five different categories. The performance of the MC-SVM classifier is compared across different kernel functions. From the analysis and performance measures such as classification accuracy, it is inferred that brain and spinal cord MRI classification is better done using MC-SVM with the Gaussian RBF kernel function than with linear or polynomial kernel functions. The proposed system can provide the best classification performance, with high accuracy and a low error rate.
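The GLCM texture step mentioned in this abstract can be sketched as follows. This is a minimal gray-level co-occurrence matrix for one pixel offset plus the standard contrast feature; the tiny 3-level image, the horizontal offset, and the function names `glcm` and `contrast` are illustrative assumptions, not details taken from the paper.

```python
def glcm(image, levels, dr=0, dc=1):
    """Count co-occurrences of gray levels at offset (dr, dc).

    matrix[i][j] is the number of pixel pairs where the first pixel has
    level i and the pixel at the given offset has level j.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix

def contrast(matrix):
    """Contrast feature: sum of p(i, j) * (i - j)^2 over the normalised GLCM."""
    total = sum(sum(row) for row in matrix)
    n = len(matrix)
    return sum(matrix[i][j] * (i - j) ** 2
               for i in range(n) for j in range(n)) / total

image = [[0, 0, 1],
         [1, 2, 2],
         [0, 1, 2]]
m = glcm(image, levels=3)
print(m, contrast(m))
```

Other texture features commonly derived from a GLCM, such as energy or homogeneity, follow the same pattern of a weighted sum over the normalised matrix.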


2012 ◽  
Vol 22 (03) ◽  
pp. 1250011 ◽  
Author(s):  
U. RAJENDRA ACHARYA ◽  
S. VINITHA SREE ◽  
SUBHAGATA CHATTOPADHYAY ◽  
JASJIT S. SURI

Electroencephalogram (EEG) signals, which record the electrical activity in the brain, are useful for assessing the mental state of a person. Since these signals are nonlinear and non-stationary in nature, it is very difficult to extract useful information from them using conventional statistical and frequency-domain methods. Hence, applying nonlinear time series analysis to EEG signals could be useful for studying the dynamical nature and variability of brain signals. In this paper, we propose a Computer Aided Diagnostic (CAD) technique for the automated identification of normal and alcoholic EEG signals using nonlinear features. We first extract nonlinear features such as Approximate Entropy (ApEn), Largest Lyapunov Exponent (LLE), Sample Entropy (SampEn), and four other Higher Order Spectra (HOS) features, and then use them to train Support Vector Machine (SVM) classifiers with different kernel functions: 1st-, 2nd-, and 3rd-order polynomials and a radial basis function (RBF) kernel. Our results indicate that these nonlinear measures are good discriminators of normal and alcoholic EEG signals. The SVM classifier with a polynomial kernel of order 1 could distinguish the two classes with an accuracy of 91.7%, sensitivity of 90%, and specificity of 93.3%. As a pre-analysis step, the EEG signals were tested for nonlinearity using surrogate data analysis, and we found a significant difference between the LLE measure of the actual data and that of the surrogate data.
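The three figures reported in this abstract are standard confusion-matrix ratios, and the short sketch below shows how they are computed. The counts passed in are hypothetical: the abstract does not give the confusion matrix, and these values were chosen only so the arithmetic lands near the reported percentages.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Hypothetical counts, NOT taken from the paper.
acc, sens, spec = binary_metrics(tp=27, fn=3, tn=28, fp=2)
print(round(acc, 3), round(sens, 3), round(spec, 3))
```

Sensitivity and specificity are reported alongside accuracy because accuracy alone hides which class the errors fall on.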


2020 ◽  
Author(s):  
Nalika Ulapane ◽  
Karthick Thiyagarajan ◽  
Sarath Kodagoda

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.


2013 ◽  
Vol 2013 ◽  
pp. 1-7 ◽  
Author(s):  
Rakesh Patra ◽  
Sujan Kumar Saha

Support vector machine (SVM) is one of the popular machine learning techniques used in various text processing tasks, including named entity recognition (NER). The performance of the SVM classifier largely depends on the appropriateness of the kernel function. In the last few years, a number of task-specific kernel functions have been proposed and used in various text processing tasks, for example, the string kernel, graph kernel, and tree kernel. So far, very little effort has been devoted to developing NER task-specific kernels. In the literature, we found that the tree kernel has been used in the NER task only for entity boundary detection or reannotation. The conventional tree kernel is unable to execute the complete NER task on its own. In this paper, we propose a kernel function, motivated by the tree kernel, which is able to perform the complete NER task. To examine the effectiveness of the proposed kernel, we applied it to the openly available JNLPBA 2004 data. Our kernel executes the complete NER task and achieves reasonable accuracy.


2020 ◽  
Vol 8 (6) ◽  
pp. 2613-2618

Skin cancer is the most prevalent of the dangerous cancers found in human beings. It takes various forms, the most sporadic of which is melanoma. Identifying melanoma at an early phase helps in curing it. Intensive skin exposure to UV radiation is the principal cause of melanoma. In this article, we use an SVM classifier together with several feature extraction techniques (LDP [Local Directional Patterns], LBP [Local Binary Patterns], and Convolutional Neural Networks [CNN]) to study melanoma skin images. The suggested algorithms rate well when compared with other recognition schemes. LBP and LDP provide the means to extract features; the extracted features are then classified by the SVM (Support Vector Machine) classifier. For many of the melanoma skin image classifications using these algorithms, the accuracy is above roughly 80%, and LBP combined with an SVM classifier using a polynomial kernel was the most effective of the three feature extraction approaches. Using this algorithm-classifier combination, doctors can detect and diagnose melanoma skin lesions from images at an early stage, thereby helping to save lives.
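To make the LBP feature concrete, the sketch below computes the basic 8-neighbour Local Binary Pattern code for the centre pixel of a 3x3 patch. The clockwise bit ordering starting at the top-left neighbour and the `>=` comparison are common conventions assumed here; the abstract does not state which LBP variant was used.

```python
def lbp_code(patch):
    """Basic LBP code of the centre pixel of a 3x3 patch (list of 3 rows)."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch))
```

In a typical pipeline, the codes of all pixels in an image region are collected into a histogram, and that histogram is the feature vector passed to the classifier.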


2018 ◽  
Vol 8 (12) ◽  
pp. 2574 ◽  
Author(s):  
Qinghua Mao ◽  
Hongwei Ma ◽  
Xuhui Zhang ◽  
Guangming Zhang

The Skewness Decision Tree Support Vector Machine (SDTSVM) algorithm is widely known as a supervised learning model for multi-class classification problems. However, the classification accuracy of the SDTSVM algorithm depends on the proper selection of its parameters and the classification order. Therefore, an improved SDTSVM (ISDTSVM) algorithm is proposed in order to improve the classification accuracy of steel cord conveyor belt defects. In the proposed model, the classification order is determined by the sum of the Euclidean distances between multi-class sample centers, and the parameters are optimized by the inertia weight Particle Swarm Optimization (PSO) algorithm. To verify the effectiveness of the ISDTSVM algorithm with different feature spaces, experiments were conducted on multiple UCI (University of California Irvine) data sets and on steel cord conveyor belt defects, using the proposed ISDTSVM algorithm and the conventional SDTSVM algorithm. The average classification accuracies of five-fold cross-validation were obtained for two kinds of kernel functions. For the Vowel, Zoo, and Wine data sets of the UCI data sets, as well as the steel cord conveyor belt defects, the ISDTSVM algorithm improved the classification accuracy by 3%, 3%, 1%, and 4% respectively, compared to the SDTSVM algorithm. The classification accuracy of the radial basis function kernel was higher than that of the polynomial kernel. The results indicated that the proposed ISDTSVM algorithm improved the classification accuracy significantly, compared to the conventional SDTSVM algorithm.
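The class-ordering idea in this abstract can be sketched as follows: compute each class center, sum its Euclidean distances to all other centers, and rank the classes by that sum. Ranking in descending order, so the most isolated class is separated first, is an assumption made here; the abstract does not state the direction, and the function names are hypothetical.

```python
import math

def class_centers(samples, labels):
    """Mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def classification_order(samples, labels):
    """Classes sorted by summed distance from their center to all other centers."""
    centers = class_centers(samples, labels)
    totals = {
        y: sum(math.dist(c, other) for z, other in centers.items() if z != y)
        for y, c in centers.items()
    }
    # Assumption: the largest total distance goes first.
    return sorted(totals, key=totals.get, reverse=True)

samples = [[0.0, 0.0], [0.0, 2.0], [1.0, 0.0], [1.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = ["a", "a", "b", "b", "c", "c"]
print(classification_order(samples, labels))
```

In a decision-tree SVM, this ordering decides which class each binary SVM splits off at successive tree nodes, which is why a poor order can propagate errors down the tree.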


2014 ◽  
Vol 543-547 ◽  
pp. 1659-1662
Author(s):  
Juan Du ◽  
Wen Long Zhang ◽  
Meng Meng Xie

The kernel is the key technology of SVM; it affects the learning ability and generalization ability of the support vector machine. For the specific application of harmful text information recognition, this paper combines traditional kernel functions to construct a new combination kernel that models independent harmful vocabulary and co-occurring vocabulary, and then evaluates the linear kernel, homogeneous polynomial kernel, non-homogeneous polynomial kernel, and combination kernel in sample experiments. The experimental results showed that the combination kernel performed considerably better than the other kernel functions for harmful text information filtering. In particular, the Recall value achieved satisfactory results.


Symmetry ◽  
2020 ◽  
Vol 12 (4) ◽  
pp. 667
Author(s):  
Wismaji Sadewo ◽  
Zuherman Rustam ◽  
Hamidah Hamidah ◽  
Alifah Roudhoh Chusmarsyah

Early detection of pancreatic cancer is difficult, and thus many cases of pancreatic cancer are diagnosed late. When pancreatic cancer is detected, the cancer is usually well developed. Machine learning is an approach that is part of artificial intelligence and can detect pancreatic cancer early. This paper proposes a machine learning approach with the twin support vector machine (TWSVM) method as a new approach to detecting pancreatic cancer early. TWSVM aims to find two symmetry planes such that each plane is close to one data class and as far as possible from the other data class. TWSVM is fast in building a model and generalizes well. However, TWSVM requires kernel functions to operate in the feature space. The kernel functions commonly used are the linear kernel, polynomial kernel, and radial basis function (RBF) kernel. This paper uses the TWSVM method with these kernels and compares them to find the best kernel for detecting pancreatic cancer early. The TWSVM model with each kernel is evaluated using 10-fold cross-validation. The results show that kernel-based TWSVM is able to detect pancreatic cancer with good performance. The best kernel obtained is the RBF kernel, which produces an accuracy of 98%, a sensitivity of 97%, a specificity of 100%, and a running time of around 1.3408 s.
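The 10-fold cross-validation used for evaluation here can be sketched as a simple index split. The version below assumes contiguous folds without shuffling or stratification, and the function name `k_fold_indices` is hypothetical; the abstract does not describe how the folds were drawn.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k (train_indices, test_indices) pairs."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds

folds = k_fold_indices(25, k=10)
for train_idx, test_idx in folds:
    # A model would be trained on train_idx and scored on test_idx here.
    pass
print(len(folds))
```

Every sample is used for testing exactly once, and the reported metric is usually the average over the k held-out folds.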

