A Hybrid Feature Selection and Extraction Methods for Sleep Apnea Detection Using Bio-Signals

People with sleep apnea (SA) are at increased risk of having stroke and cardiovascular diseases. Polysomnography (PSG) is used to detect SA. This paper conducts feature selection from PSG signals and uses a support vector machine (SVM) to detect SA. To analyze SA, the Physionet Apnea Database was used to obtain various features. Electrocardiography (ECG), oxygen saturation (SaO2), airflow, abdominal, and thoracic signals were used to provide various frequency-, time-domain and non-linear features (n = 87). To analyse the significance of these features, firstly, two evaluation measures, the rank-sum method and the analysis of variance (ANOVA) were used to evaluate the significance of the features. These features were then classified according to their significance. Finally, different class feature sets were presented as inputs for an SVM classifier to detect the onset of SA. The hill-climbing feature selection algorithm and the k-fold cross-validation method were applied to evaluate each classification performance. Through the experiments, we discovered that the best feature set (including the top-five significant features) obtained the best classification performance. Furthermore, we plotted receiver operating characteristic (ROC) curves to examine the performance of the SVM, and the results showed the SVM with Linear kernel (regularization parameter = 1) outperformed other classifiers (area under curve = 95.23%, sensitivity = 94.29%, specificity = 96.17%). The results confirm that feature subsets based on multiple bio-signals have the potential to identify patients with SA. The use of a smaller subset avoids dimensionality problems and reduces the computational load.

Download Full-text

ENHANCED SVM CLASSIFIER FOR BREAST CANCER DIAGNOSIS

International Journal of Engineering Technologies and Management Research ◽

10.29121/ijetmr.v5.i3.2018.178 ◽

2020 ◽

Vol 5 (3) ◽

pp. 67-74 ◽

Cited By ~ 1

Author(s):

M. Kavitha ◽

G. Lavanya ◽

J. Janani ◽

Balaji. J

Keyword(s):

Breast Cancer ◽

Feature Selection ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Classification Performance ◽

Supervised Machine Learning ◽

Basis Set ◽

Support Vector ◽

Svm Classifier ◽

Breast Image

Breast cancer is the leading disease to cause death especially in women. In this paper, a frame work based algorithm for the classification of cancerous/non-cancerous data is developed using application of supervised machine learning. In feature selection, we derive basis set in the kernel space and then we extend the margin based feature selection algorithm. We are trying to explore several feature selection, extraction techniques and combine the optimal feature subsets with various learning classification methods such as KNN, PNN and Support Vector Machine (SVM) classifiers. The best classification performance for breast cancer diagnosis is attained equal to 99.17% between radius and compact features using SVM classifier. And also derive the features of a breast image in the WBCD dataset.

Download Full-text

Binary Spectrum Feature for Improved Classiﬁer Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

sarath kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

<div>Classiﬁcation has become a vital task in modern machine learning and Artiﬁcial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classiﬁcation. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classiﬁer performance. In this paper, we consider the case of a given supervised learning classiﬁcation task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classiﬁcation performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classiﬁcation accuracy of a Support Vector Machine (SVM) classiﬁer increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.</div><div><br></div>

Download Full-text

Identification of Chronic Hypersensitivity Pneumonitis Biomarkers with Machine Learning and Differential Co-expression Analysis

Current Gene Therapy ◽

10.2174/1566523220666201208093325 ◽

2020 ◽

Vol 20 ◽

Author(s):

Hongwei Zhang ◽

Steven Wang ◽

Tao Huang

Keyword(s):

Feature Selection ◽

Expression Analysis ◽

Hypersensitivity Pneumonitis ◽

Enrichment Analysis ◽

Functional Enrichment ◽

Great Promise ◽

Support Vector ◽

Svm Classifier ◽

Clinical Tool ◽

Chronic Hypersensitivity Pneumonitis

Aims: We would like to identify the biomarkers for chronic hypersensitivity pneumonitis (CHP) and facilitate the precise gene therapy of CHP. Background: Chronic hypersensitivity pneumonitis (CHP) is an interstitial lung disease caused by hypersensitive reactions to inhaled antigens. Clinically, the tasks of differentiating between CHP and other interstitial lungs diseases, especially idiopathic pulmonary fibrosis (IPF), were challenging. Objective: In this study, we analyzed the public available gene expression profile of 82 CHP patients, 103 IPF patients, and 103 control samples to identify the CHP biomarkers. Method: The CHP biomarkers were selected with advanced feature selection methods: Monte Carlo Feature Selection (MCFS) and Incremental Feature Selection (IFS). A Support Vector Machine (SVM) classifier was built. Then, we analyzed these CHP biomarkers through functional enrichment analysis and differential co-expression analysis. Result: There were 674 identified CHP biomarkers. The co-expression network of these biomarkers in CHP included more negative regulations and the network structure of CHP was quite different from the network of IPF and control. Conclusion: The SVM classifier may serve as an important clinical tool to address the challenging task of differentiating between CHP and IPF. Many of the biomarker genes on the differential co-expression network showed great promise in revealing the underlying mechanisms of CHP.

Download Full-text

A fuzzy gaussian rank aggregation ensemble feature selection method for microarray data

International Journal of Knowledge-based and Intelligent Engineering Systems ◽

10.3233/kes-190134 ◽

2021 ◽

Vol 24 (4) ◽

pp. 289-301

Author(s):

B. Venkatesh ◽

J. Anuradha

Keyword(s):

Feature Selection ◽

Microarray Data ◽

Classification Accuracy ◽

Performance Metrics ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Binary Particle Swarm Optimization ◽

Selection Methods

In Microarray Data, it is complicated to achieve more classification accuracy due to the presence of high dimensions, irrelevant and noisy data. And also It had more gene expression data and fewer samples. To increase the classification accuracy and the processing speed of the model, an optimal number of features need to extract, this can be achieved by applying the feature selection method. In this paper, we propose a hybrid ensemble feature selection method. The proposed method has two phases, filter and wrapper phase in filter phase ensemble technique is used for aggregating the feature ranks of the Relief, minimum redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods. This paper uses the Fuzzy Gaussian membership function ordering for aggregating the ranks. In wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) is used for selecting the optimal features, and the RBF Kernel-based Support Vector Machine (SVM) classifier is used as an evaluator. The performance of the proposed model are compared with state of art feature selection methods using five benchmark datasets. For evaluation various performance metrics such as Accuracy, Recall, Precision, and F1-Score are used. Furthermore, the experimental results show that the performance of the proposed method outperforms the other feature selection methods.

Download Full-text

A Structural Analysis Based Feature Extraction Method for OCR System For Myanmar Printed Document Images

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2012010102 ◽

2012 ◽

Vol 2 (1) ◽

pp. 16-41 ◽

Cited By ~ 1

Author(s):

Htwe Pa Pa Win ◽

Phyo Thu Thu Khine ◽

Khin Nwe Ni Tun

Keyword(s):

Feature Extraction ◽

Structural Analysis ◽

Character Recognition ◽

Optical Character Recognition ◽

Extraction Method ◽

Recognition Performance ◽

Extraction Methods ◽

Support Vector ◽

Svm Classifier ◽

Feature Extraction Method

This paper proposes a new feature extraction method for off-line recognition of Myanmar printed documents. One of the most important factors to achieve high recognition performance in Optical Character Recognition (OCR) system is the selection of the feature extraction methods. Different types of existing OCR systems used various feature extraction methods because of the diversity of the scripts’ natures. One major contribution of the work in this paper is the design of logically rigorous coding based features. To show the effectiveness of the proposed method, this paper assumed the documents are successfully segmented into characters and extracted features from these isolated Myanmar characters. These features are extracted using structural analysis of the Myanmar scripts. The experimental results have been carried out using the Support Vector Machine (SVM) classifier and compare the pervious proposed feature extraction method.

Download Full-text

Design of Text Categorization System Based on SVM

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.532-533.1191 ◽

2012 ◽

Vol 532-533 ◽

pp. 1191-1195 ◽

Cited By ~ 1

Author(s):

Zhen Yan Liu ◽

Wei Ping Wang ◽

Yong Wang

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Text Categorization ◽

Feature Selection Method ◽

Extraction Methods ◽

Support Vector ◽

Text Representation ◽

Text Feature ◽

Categorization System ◽

Classifier Training

This paper introduces the design of a text categorization system based on Support Vector Machine (SVM). It analyzes the high dimensional characteristic of text data, the reason why SVM is suitable for text categorization. According to system data flow this system is constructed. This system consists of three subsystems which are text representation, classifier training and text classification. The core of this system is the classifier training, but text representation directly influences the currency of classifier and the performance of the system. Text feature vector space can be built by different kinds of feature selection and feature extraction methods. No research can indicate which one is the best method, so many feature selection and feature extraction methods are all developed in this system. For a specific classification task every feature selection method and every feature extraction method will be tested, and then a set of the best methods will be adopted.

Download Full-text

Shape-restricted support vector machine (SR-SVM): a SVM classifier taking supplementary shape information of input

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202155 ◽

2021 ◽

Vol 40 (1) ◽

pp. 1481-1494

Author(s):

Geng Deng ◽

Yaoguo Xie ◽

Xindong Wang ◽

Qiang Fu

Keyword(s):

Support Vector Machine ◽

Classification Performance ◽

Research Literature ◽

Support Vector ◽

Svm Classifier ◽

Classification Problems ◽

Active Set ◽

Shape Information ◽

Convex Optimization Problem ◽

Shape Restrictions

Many classification problems contain shape information from input features, such as monotonic, convex, and concave. In this research, we propose a new classifier, called Shape-Restricted Support Vector Machine (SR-SVM), which takes the component-wise shape information to enhance classification accuracy. There exists vast research literature on monotonic classification covering monotonic or ordinal shapes. Our proposed classifier extends to handle convex and concave types of features, and combinations of these types. While standard SVM uses linear separating hyperplanes, our novel SR-SVM essentially constructs non-parametric and nonlinear separating planes subject to component-wise shape restrictions. We formulate SR-SVM classifier as a convex optimization problem and solve it using an active-set algorithm. The approach applies basis function expansions on the input and effectively utilizes the standard SVM solver. We illustrate our methodology using simulation and real world examples, and show that SR-SVM improves the classification performance with additional shape information of input.

Download Full-text

Feature Selection Method Based on Mutual Information and Support Vector Machine

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s021800142150021x ◽

2021 ◽

pp. 2150021

Author(s):

Gang Liu ◽

Chunlei Yang ◽

Sen Liu ◽

Chunbao Xiao ◽

Bin Song

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Mutual Information ◽

Classification Accuracy ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Standard Data ◽

Feature Dimension

A feature selection method based on mutual information and support vector machine (SVM) is proposed in order to eliminate redundant feature and improve classification accuracy. First, local correlation between features and overall correlation is calculated by mutual information. The correlation reflects the information inclusion relationship between features, so the features are evaluated and redundant features are eliminated with analyzing the correlation. Subsequently, the concept of mean impact value (MIV) is defined and the influence degree of input variables on output variables for SVM network based on MIV is calculated. The importance weights of the features described with MIV are sorted by descending order. Finally, the SVM classifier is used to implement feature selection according to the classification accuracy of feature combination which takes MIV order of feature as a reference. The simulation experiments are carried out with three standard data sets of UCI, and the results show that this method can not only effectively reduce the feature dimension and high classification accuracy, but also ensure good robustness.

Download Full-text

Prediction of Diabetic Nephropathy from the Relationship between Fatigue, Sleep and Quality of Life

Applied Sciences ◽

10.3390/app10093282 ◽

2020 ◽

Vol 10 (9) ◽

pp. 3282

Author(s):

Angela Shin-Yu Lien ◽

Yi-Der Jiang ◽

Jia-Ling Tsai ◽

Jawl-Shan Hwang ◽

Wei-Chao Lin

Keyword(s):

Quality Of Life ◽

Feature Selection ◽

Diabetic Nephropathy ◽

Sleep Quality ◽

Support Vector ◽

Svm Classifier ◽

Poor Sleep ◽

Poor Sleep Quality ◽

The Relationship

Fatigue and poor sleep quality are the most common clinical complaints of people with diabetes mellitus (DM). These complaints are early signs of DM and are closely related to diabetic control and the presence of complications, which lead to a decline in the quality of life. Therefore, an accurate measurement of the relationship between fatigue, sleep status, and the complication of DM nephropathy could lead to a specific definition of fatigue and an appropriate medical treatment. This study recruited 307 people with Type 2 diabetes from two medical centers in Northern Taiwan through a questionnaire survey and a retrospective investigation of medical records. In an attempt to identify the related factors and accurately predict diabetic nephropathy, we applied hybrid research methods, integrated biostatistics, and feature selection methods in data mining and machine learning to compare and verify the results. Consequently, the results demonstrated that patients with diabetic nephropathy have a higher fatigue level and Charlson comorbidity index (CCI) score than without neuropathy, the presence of neuropathy leads to poor sleep quality, lower quality of life, and poor metabolism. Furthermore, by considering feature selection in selecting representative features or variables, we achieved consistence results with a support vector machine (SVM) classifier and merely ten representative factors and a prediction accuracy as high as 74% in predicting the presence of diabetic nephropathy.

Download Full-text

The Evaluation of Accuracy Performance in an Enhanced Embedded Feature Selection for Unstructured Text Classification

Iraqi Journal of Science ◽

10.24996/ijs.2020.61.12.28 ◽

2020 ◽

pp. 3397-3407

Author(s):

Nur Syafiqah Mohd Nafis ◽

Suryanti Awang

Keyword(s):

Feature Selection ◽

Text Classification ◽

Training Dataset ◽

Recursive Feature Elimination ◽

High Dimensional ◽

Significant Feature ◽

Support Vector ◽

Svm Classifier ◽

Text Documents ◽

Text Document

Text documents are unstructured and high dimensional. Effective feature selection is required to select the most important and significant feature from the sparse feature space. Thus, this paper proposed an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured and high dimensional text classificationhis technique has the ability to measure the feature’s importance in a high-dimensional text document. In addition, it aims to increase the efficiency of the feature selection. Hence, obtaining a promising text classification accuracy. TF-IDF act as a filter approach which measures features importance of the text documents at the first stage. SVM-RFE utilized a backward feature elimination scheme to recursively remove insignificant features from the filtered feature subsets at the second stage. This research executes sets of experiments using a text document retrieved from a benchmark repository comprising a collection of Twitter posts. Pre-processing processes are applied to extract relevant features. After that, the pre-processed features are divided into training and testing datasets. Next, feature selection is implemented on the training dataset by calculating the TF-IDF score for each feature. SVM-RFE is applied for feature ranking as the next feature selection step. Only top-rank features will be selected for text classification using the SVM classifier. Based on the experiments, it shows that the proposed technique able to achieve 98% accuracy that outperformed other existing techniques. In conclusion, the proposed technique able to select the significant features in the unstructured and high dimensional text document.

Download Full-text