Logistic Regression based Feature Selection and Two-Stage Detection for EEG based Motor Imagery Classification

2021 ◽  
Vol 14 (1) ◽  
pp. 134-146
Author(s):  
Adi Wijaya ◽  
◽  
Teguh Adji ◽  
Noor Setiawan ◽  
◽  
...  

Electroencephalogram (EEG) based motor imagery (MI) classification requires efficient feature extraction and consistent accuracy for reliable brain-computer interface (BCI) systems. Achieving consistent accuracy in EEG MI classification remains a major challenge because EEG signals are subject dependent. To address this problem, we propose a feature selection scheme based on Logistic Regression (LRFS) and two-stage detection (TSD) in a channel instantiation approach. In the TSD scheme, Linear Discriminant Analysis was used in the first-stage detection, while Gradient Boosted Tree and k-Nearest Neighbor were used in the second-stage detection. To evaluate the proposed method, two publicly available datasets, BCI competition III-Dataset IVa and BCI competition IV-Dataset 2a, were used. Experimental results show that the proposed method yielded high accuracy on both datasets, 95.21% and 94.83%, respectively. These results indicate that the proposed method achieves consistent accuracy and is promising for reliable BCI systems.

2018 ◽  
Vol 9 (2) ◽  
pp. 48-71 ◽  
Author(s):  
Khadidja Belattar ◽  
Sihem Mostefai ◽  
Amer Draa

Feature selection is an important pre-processing technique in the pattern recognition domain. This article proposes a hybridization between Genetic Algorithm (GA) and the Linear Discriminant Analysis (LDA) for solving the feature selection problem in Content-Based Image Retrieval (CBIR) applied to dermatological images. In the first step, we preprocess and segment the input image, then we derive color and texture features characterizing healthy skin and the segmented skin lesion. At this stage, a binary GA is used to evolve chromosome subsets whose fitness is evaluated by a Logistic Regression classifier. The optimal identified features are then used to feed LDA for a CBIR system, based on a K-Nearest Neighbor classification. To assess the proposed approach, the authors have opted for a K-fold cross validation method on a database of 1097 images of melanomas and other skin lesions. As a result, the authors obtained a reduced number of features and an improved CBDIR system compared to PCA, LDA and ICA methods.


2019 ◽  
Vol 2 (3) ◽  
pp. 250-263 ◽  
Author(s):  
Peter Boedeker ◽  
Nathan T. Kearns

In psychology, researchers are often interested in the predictive classification of individuals. Various models exist for such a purpose, but which model is considered a best practice is conditional on attributes of the data. Under certain conditions, linear discriminant analysis (LDA) has been shown to perform better than other predictive methods, such as logistic regression, multinomial logistic regression, random forests, support-vector machines, and the K-nearest neighbor algorithm. The purpose of this Tutorial is to provide researchers who already have a basic level of statistical training with a general overview of LDA and an example of its implementation and interpretation. Decisions that must be made when conducting an LDA (e.g., prior specification, choice of cross-validation procedures) and methods of evaluating case classification (posterior probability, typicality probability) and overall classification (hit rate, Huberty’s I index) are discussed. LDA for prediction is described from a modern Bayesian perspective, as opposed to its original derivation. A step-by-step example of implementing and interpreting LDA results is provided. All analyses were conducted in R, and the script is provided; the data are available online.
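As a brief illustration of the quantities this abstract names, the block below writes out the Bayesian form of LDA classification and the two overall classification measures. The notation is an assumption made here, not taken from the tutorial: \pi_g is the prior probability of group g, \mu_g its mean vector, \Sigma the pooled covariance matrix, H_o the observed hit rate, and H_e the hit rate expected by chance.

```latex
% Posterior probability of group g for an observation x,
% assuming multivariate normal groups with a common covariance matrix
P(g \mid x) = \frac{\pi_g \, f_g(x)}{\sum_{h} \pi_h \, f_h(x)},
\qquad
f_g(x) = \frac{1}{(2\pi)^{p/2} \lvert \Sigma \rvert^{1/2}}
         \exp\!\left( -\tfrac{1}{2} (x - \mu_g)^{\top} \Sigma^{-1} (x - \mu_g) \right)

% Equivalent linear classification score: assign x to the group with the largest value
\delta_g(x) = x^{\top} \Sigma^{-1} \mu_g
            - \tfrac{1}{2} \mu_g^{\top} \Sigma^{-1} \mu_g
            + \ln \pi_g

% Overall classification: observed hit rate and Huberty's I (improvement over chance)
H_o = \frac{\text{number of correctly classified cases}}{N},
\qquad
I = \frac{H_o - H_e}{1 - H_e}
```

Because the priors enter only through the \ln \pi_g term, changing the prior specification shifts the decision boundaries without changing their orientation, which is one reason the choice of priors is listed among the decisions to be made.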


Author(s):  
Mien Van ◽  
Hee-Jun Kang

This paper presents an automatic fault diagnosis method for different rolling element bearing faults using a dual-tree complex wavelet transform, empirical mode decomposition, and a novel two-stage feature selection technique. In this method, dual-tree complex wavelet transform and empirical mode decomposition were used to preprocess the original vibration signal to obtain more accurate fault characteristic information. Then, features in the time domain were extracted from each of the original signals, the coefficients of the dual-tree complex wavelet transform, and some useful intrinsic mode functions to generate a rich combined feature set. Next, a two-stage feature selection algorithm was proposed to generate the smallest set of features that leads to superior classification accuracy. In the first stage of the two-stage feature selection, we found the candidate feature set using the distance evaluation technique and a k-nearest neighbor classifier. In the second stage, a genetic algorithm-based k-nearest neighbor classifier was designed to obtain the best combination of features from the candidate feature set with respect to the classification accuracy and the number of feature inputs. Finally, the selected features were used as the input to a k-nearest neighbor classifier to evaluate the system diagnosis performance. The experimental results obtained from real bearing vibration signals demonstrated that the method combining dual-tree complex wavelet transform, empirical mode decomposition, and the two-stage feature selection technique is effective in both feature extraction and feature selection, and that it also increases classification accuracy.
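The two-stage idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the distance score is a simplified between-class over within-class ratio rather than the full distance evaluation technique, the second stage uses an exhaustive subset search as a stand-in for the genetic algorithm, and all function names and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def distance_scores(X, y):
    """Simplified distance score per feature: between-class / within-class spread."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        groups = [[row[j] for row, label in zip(X, y) if label == c] for c in classes]
        centers = [mean(g) for g in groups]
        within = mean([mean([abs(v - c) for v in g]) for g, c in zip(groups, centers)])
        between = mean([abs(a - b) for a, b in combinations(centers, 2)])
        scores.append(between / (within + 1e-12))  # small constant avoids division by zero
    return scores


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = [label for _, label in ranked[:k]]
    return max(sorted(set(votes)), key=votes.count)


def loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of k-NN restricted to the given feature indices."""
    hits = 0
    for i in range(len(X)):
        train_X = [[row[f] for f in features] for n, row in enumerate(X) if n != i]
        train_y = [label for n, label in enumerate(y) if n != i]
        query = [X[i][f] for f in features]
        hits += knn_predict(train_X, train_y, query, k) == y[i]
    return hits / len(X)


def two_stage_select(X, y, n_candidates=2, k=3):
    """Stage 1: keep the top-scoring features. Stage 2: search subsets of them.

    Subsets are visited from smallest to largest and only a strictly better
    accuracy replaces the current best, so ties favour fewer features.
    """
    scores = distance_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    candidates = ranked[:n_candidates]
    best = None
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            acc = loo_accuracy(X, y, subset, k)
            if best is None or acc > best[0]:
                best = (acc, subset)
    return best


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise,
# feature 2 separates them less sharply.
X = [
    [1.0, 5.0, 0.2], [1.2, 1.0, 0.1], [0.9, 3.0, 0.3], [1.1, 4.0, 0.2],
    [3.0, 2.0, 0.8], [3.2, 5.0, 0.9], [2.9, 1.0, 0.7], [3.1, 3.0, 0.8],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

accuracy, selected = two_stage_select(X, y)
print(accuracy, selected)
```

An exhaustive search is only practical for a handful of candidates; a genetic algorithm, as used in the paper, becomes the natural choice once the candidate set is too large to enumerate.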


Author(s):  
Yuita Arum Sari ◽  
Anggi Gustiningsih Hapsani ◽  
Sigit Adinugroho ◽  
Lukman Hakim ◽  
Siti Mutrofin

Preprocessing is essential for achieving good segmentation, since it affects the feature extraction process. Melanomas have various shapes, and the features extracted from their images are used for early-stage detection. Because melanoma is a dangerous disease, early detection is required to prevent the cancer from progressing to further phases. In this paper, we propose a new framework to detect cancer in skin images using color feature extraction and feature selection. The default color space of skin images is RGB; brightness is then added to distinguish the normal and darkened areas of the skin. After that, an average filter and histogram equalization are applied to obtain color intensities that can separate normal skin from suspicious skin. Otsu thresholding is then used for melanoma segmentation. A total of 147 features are extracted from the segmented images. These features are reduced using three types of feature selection algorithms: Linear Discriminant Analysis (LDA), Correlation based Feature Selection (CFS), and Relief. All selected features are classified using k-Nearest Neighbor (k-NN). Relief proved to be the best of the feature selection methods, and the optimal k value is 7 under 10-fold cross-validation, with accuracies of 0.835 and 0.845 without and with feature selection, respectively. The results indicate that the framework is applicable to early skin cancer detection.
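The segmentation step relies on Otsu thresholding, which picks the gray level that maximizes the variance between the two resulting pixel classes. The sketch below is a minimal pure-Python version of that one step, assuming the image has already been reduced to a flat list of integer intensities in the range 0 to 255; the function name and the sample pixel values are hypothetical, and the brightness, filtering, and equalization steps from the abstract are not reproduced.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the lower class; pixels > t form the upper class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    count_low = 0  # number of pixels in the lower class so far
    sum_low = 0    # sum of intensities in the lower class so far
    for t in range(levels):
        count_low += hist[t]
        sum_low += t * hist[t]
        if count_low == 0:
            continue  # lower class still empty
        count_high = total - count_low
        if count_high == 0:
            break     # upper class empty from here on
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        # Between-class variance, up to the constant factor 1 / total**2
        var_between = count_low * count_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical example: three dark pixels and three bright pixels.
pixels = [10, 12, 11, 200, 202, 198]
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True marks the brighter class
print(t, mask)
```

Whether the lesion corresponds to the darker or the brighter class depends on the preprocessing applied beforehand, so the mask may need to be inverted in practice.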


2019 ◽  
Vol 20 (5) ◽  
pp. 488-500 ◽  
Author(s):  
Yan Hu ◽  
Yi Lu ◽  
Shuo Wang ◽  
Mengying Zhang ◽  
Xiaosheng Qu ◽  
...  

Background: Globally, the number of cancer patients and deaths continues to increase every year, and cancer has therefore become one of the world's leading causes of morbidity and mortality. In recent years, the study of anticancer drugs has become one of the most popular medical topics. Objective: In this review, in order to study the application of machine learning in predicting anticancer drug activity, some machine learning approaches such as Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbor (kNN), and Naïve Bayes (NB) were selected, and examples of their applications in anticancer drug design are listed. Results: Machine learning contributes a great deal to anticancer drug design and helps researchers by saving time and cost. However, it can only be an assisting tool for drug design. Conclusion: This paper introduces the application of machine learning approaches in anticancer drug design. Many examples of success in identification and prediction in the area of anticancer drug activity prediction are discussed, and anticancer drug research is still in active progress. Moreover, the merits of some web servers related to anticancer drugs are mentioned.


2015 ◽  
Vol 83 ◽  
pp. 81-91 ◽  
Author(s):  
Aiguo Wang ◽  
Ning An ◽  
Guilin Chen ◽  
Lian Li ◽  
Gil Alterovitz

2020 ◽  
Author(s):  
Qi Zhang ◽  
Shan Li ◽  
Bin Yu ◽  
Yang Li ◽  
Yandan Zhang ◽  
...  

Proteins play a significant part in life processes such as cell growth, development, and reproduction. Exploring protein subcellular localization (SCL) is a direct way to better understand the function of proteins in cells. Studies have found that more and more proteins belong to multiple subcellular locations, and these proteins are called multi-label proteins. They not only play a key role in cell life activities, but also play an indispensable role in medicine and drug development. This article presents a new prediction model, MpsLDA-ProSVM, to predict the SCL of multi-label proteins. Firstly, the physical and chemical information, evolution information, sequence information and annotation information of protein sequences are fused. Then, for the first time, a weighted multi-label linear discriminant analysis framework based on entropy weight form (wMLDAe) is used to refine and purify the features and reduce the difficulty of learning. Finally, the optimal feature subset is input into the multi-label learning with label-specific features (LIFT) and multi-label k-nearest neighbor (ML-KNN) algorithms to obtain a synthetic ranking of relevant labels, and the Prediction and Relevance Ordering based SVM (ProSVM) classifier is then used to predict the SCLs. This method can rank and classify related labels at the same time, which greatly improves the efficiency of the model. Tested by the jackknife method, the overall actual accuracy (OAA) values on the virus, plant, Gram-positive bacteria and Gram-negative bacteria datasets are 98.06%, 98.97%, 99.81% and 98.49%, which are 0.56%-9.16%, 5.37%-30.87%, 3.51%-6.91% and 3.99%-8.59% higher than those of other advanced methods, respectively. The source codes and datasets are available at https://github.com/QUST-AIBBDRC/MpsLDA-ProSVM/.

