Fisher score

Author(s):  
Touria Hamim ◽  
Faouzia Benabbou ◽  
Nawal Sael

The student profile has become an important component of education systems. Many system objectives, such as e-recommendation, e-orientation, e-recruitment, and dropout prediction, rely on the profile for decision support. Machine learning plays an important role in this context, and several studies have applied it to classification, prediction, or clustering. In this paper, the authors present a comparative study of boosting algorithms that have been used successfully in many fields and for many purposes. In addition, the authors apply the feature selection methods Fisher Score and Information Gain combined with Recursive Feature Elimination to improve preprocessing and model performance. Using a multi-label dataset to predict the class of student performance in mathematics, the results show that the Light Gradient Boosting Machine (LightGBM) algorithm achieved the best performance among the boosting algorithms when Information Gain was used with Recursive Feature Elimination.


2022 ◽  
Vol 2022 ◽  
pp. 1-12
Author(s):  
Yuan Tang ◽  
Zining Zhao ◽  
Shaorong Zhang ◽  
Zhi Li ◽  
Yun Mo ◽  
...  

Feature extraction and selection are important parts of motor imagery electroencephalogram (EEG) decoding and have long been both a focus and a difficulty of brain-computer interface (BCI) system research. To improve the accuracy of EEG decoding and reduce model training time, new feature extraction and selection methods are proposed in this paper. First, a new spatial-frequency feature extraction method is proposed. The original EEG signal is preprocessed, and then the common spatial pattern (CSP) is used for spatial filtering and dimensionality reduction. Finally, the filter bank method is used to decompose the spatially filtered signals into multiple frequency subbands, and the logarithmic band power feature of each frequency subband is extracted. Second, to select the subject-specific spatial-frequency features, a hybrid feature selection method based on the Fisher score and support vector machine (SVM) is proposed. The Fisher score of each feature is calculated, then a series of threshold parameters is set to generate different feature subsets, and finally, SVM and cross-validation are used to select the optimal feature subset. The effectiveness of the proposed method is validated using two sets of publicly available BCI competition data and a set of self-collected data. The total average accuracy achieved by the proposed method on the three data sets is 82.39%, which is 2.99% higher than the CSP method. The experimental results show that the proposed method classifies better than the existing methods and also requires less time for feature extraction and feature selection.
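The threshold-based selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy data, and the threshold values are all hypothetical, and a leave-one-out nearest-centroid classifier stands in for the SVM with cross-validation so that the sketch needs only the Python standard library.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter over within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = fmean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            between += len(vals) * (fmean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly tight classes.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def loo_centroid_accuracy(X, y, features):
    """Stand-in evaluator (NOT an SVM): leave-one-out nearest-centroid accuracy."""
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k, (row, label) in enumerate(zip(X, y)):
            if k == i:
                continue  # hold out sample i
            acc = sums.setdefault(label, [0.0] * len(features))
            for d, j in enumerate(features):
                acc[d] += row[j]
            counts[label] = counts.get(label, 0) + 1
        held_out = [X[i][j] for j in features]
        pred = min(
            sums,
            key=lambda c: sum(
                (v - s / counts[c]) ** 2 for v, s in zip(held_out, sums[c])
            ),
        )
        correct += pred == y[i]
    return correct / len(X)


def select_best_subset(X, y, thresholds, evaluate=loo_centroid_accuracy):
    """Build one subset per threshold and keep the one with the best accuracy."""
    scores = fisher_scores(X, y)
    best = None
    for t in thresholds:
        feats = [j for j, s in enumerate(scores) if s >= t]
        if not feats:
            continue  # threshold too strict, nothing selected
        acc = evaluate(X, y, feats)
        if best is None or acc > best[1]:
            best = (feats, acc, t)
    return best


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[1.0, 5.0], [1.2, 1.0], [0.9, 3.0], [1.1, 4.0],
     [3.0, 4.5], [3.2, 1.5], [2.9, 3.5], [3.1, 2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

feats, acc, threshold = select_best_subset(X, y, thresholds=[0.01, 1.0])
print(feats, acc, threshold)
```

The `evaluate` argument is left pluggable so that an SVM with k-fold cross-validation, as the abstract describes, could be substituted for the stand-in classifier.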


2021 ◽  
Vol 1207 (1) ◽  
pp. 012008
Author(s):  
Yiyuan Gao ◽  
Wenliao Du ◽  
Xiaoyun Gong ◽  
Dejie Yu

To extract the non-stationary and non-linear fault features of mechanical vibration signals more effectively, a novel fault diagnosis method for rotating machinery is proposed that combines time-domain, frequency-domain, and graph-domain features. Unlike conventional time-domain and frequency-domain features, the graph-domain features generated from horizontal visibility graphs can extract fault information hidden in the graph topology. Because too many features lead to information redundancy, the Fisher score algorithm is applied to select several sensitive features, which are then fed into a support vector machine to diagnose faults in rotating machinery. Experimental results indicate that features extracted from all three domains yield higher diagnosis accuracy than features extracted from any single domain or any pair of domains.
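For reference, one commonly used definition of the Fisher score of a single feature is given below. It is the standard textbook form and not necessarily the exact variant used by the authors; the symbols are introduced here for illustration only.

```latex
F_j = \frac{\sum_{k=1}^{c} n_k \,\bigl(\mu_{j,k} - \mu_j\bigr)^2}
           {\sum_{k=1}^{c} n_k \,\sigma_{j,k}^2}
```

Here $c$ is the number of classes, $n_k$ is the number of samples in class $k$, $\mu_{j,k}$ and $\sigma_{j,k}^2$ are the mean and variance of feature $j$ within class $k$, and $\mu_j$ is the mean of feature $j$ over all samples. A feature scores highly when its class means are far apart relative to its spread within each class.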


2021 ◽  
Vol 11 (2) ◽  
pp. 73-80
Author(s):  
Sharin Hazlin Huspi ◽  
Chong Ke Ting

Kidney failure affects the whole body and can lead to a series of serious illnesses and even death. Machine learning plays an important role in disease classification, offering high accuracy and shorter processing time than clinical lab tests. The Chronic Kidney Disease (CKD) clinical dataset has 24 attributes, which is considered too many. To improve classification performance, filter feature selection methods are used to reduce the feature dimensions, and an ensemble step then takes the union of the features selected by each filter. The filter feature selection methods implemented in this research are Information Gain (IG), Chi-Square, ReliefF, and Fisher Score. A Genetic Algorithm (GA) is used to select the best subset from the ensemble result of the filter feature selection. Random Forest (RF), XGBoost, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Naïve Bayes (NB) classification techniques were used to diagnose CKD. The selected feature subsets differ and are specialised for each classifier. Removing irrelevant features through filter feature selection reduces the burden and computational cost of the genetic algorithm, which can then perform better and select the best subset, improving classifier performance with fewer attributes. The proposed combination of a genetic algorithm with the union of filter feature selections improves the performance of the classification algorithms: RF, XGBoost, KNN, and SVM reach 100% accuracy and NB reaches 99.17%. The proposed method improves classifier performance while using fewer features than previous work.
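The union step over several filter rankings can be illustrated with a short sketch. It assumes each filter has already produced a score per feature and that each filter contributes its top-k features; the feature names, the scores, and the value of k are hypothetical, and the genetic algorithm stage is not shown.

```python
def top_k(scores, k):
    """Names of the k highest-scoring features for one filter."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def union_of_filters(filter_scores, k):
    """Union of the top-k features chosen by every filter method."""
    selected = set()
    for scores in filter_scores.values():
        selected |= top_k(scores, k)
    return sorted(selected)


# Hypothetical per-filter scores (higher means more relevant).
filter_scores = {
    "information_gain": {"f1": 0.9, "f2": 0.1, "f3": 0.5, "f4": 0.2},
    "fisher_score": {"f1": 1.2, "f2": 0.3, "f3": 0.1, "f4": 2.0},
}

candidates = union_of_filters(filter_scores, k=2)
print(candidates)  # ['f1', 'f3', 'f4']
```

The resulting candidate list is smaller than the full attribute set, which is what would shrink the search space handed to a subsequent subset search.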


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Xiaoxi Zhang ◽  
Haishuang Tang ◽  
Qiao Zuo ◽  
Gaici Xue ◽  
Guoli Duan ◽  
...  

Background: Early treatment for patients with aneurysmal subarachnoid hemorrhage (aSAH) could significantly reduce the risk of re-bleeding and improve clinical outcomes. We assessed the time intervals between the initial hemorrhage, admission, and endovascular treatment and identified the risk factors contributing to delay. Methods: Between February 2017 and December 2019, 422 consecutive aSAH patients treated in a high-volume hospital were collected and reviewed. Risk factors contributing to admission delay and treatment delay were analyzed with univariate and multivariate analysis. Results: One hundred twenty-two patients (28.9%) were admitted to the high-volume hospital on the day of symptom onset, and 386 (91.5%) received endovascular treatment on the day of admission. The multivariate analysis found that younger age (P = 0.022, OR = 0.981, 95% CI 0.964–0.997) and good Fisher score (P = 0.002, OR = 0.420, 95% CI 0.245–0.721) were independent risk factors for admission delay. No factor was found to be related to treatment delay. Multivariate analysis (OR (95% CI)) showed that higher age 1.027 (1.004–1.050), poorer Fisher score 3.496 (1.993–6.135), larger aneurysm size 1.112 (1.017–1.216), and shorter interval from onset to admission 1.845 (1.018–3.344) were independent risk factors for poorer clinical outcome. Conclusion: Treatment delay was mainly caused by pre-hospital delay, including delayed admission and delayed transfer. Our experience showed that a cerebrovascular team could provide early treatment for aSAH patients. Younger age and good Fisher score were significantly related to admission delay. However, admission delay was also significantly correlated with better clinical outcome.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Moumita Pramanik ◽  
Ratika Pradhan ◽  
Parvati Nandy ◽  
Saeed Mian Qaisar ◽  
Akash Kumar Bhoi

This article presents a machine learning approach for Parkinson’s disease detection. Multiple candidate acoustic signal features of Parkinson’s and control subjects are extracted. A collaborative feature bank is created through correlated feature selection, Fisher score feature selection, and mutual information-based feature selection schemes. A detection model on top of the feature bank has been developed using the traditional Naïve Bayes classifier, which proved state of the art. The Naïve Bayes detector on the collaborative acoustic features detects the presence of Parkinson’s with an accuracy of 78.97% and a precision of 0.926 under hold-out cross-validation. The collaborative feature bank with Naïve Bayes produced distinct results compared with many other recently proposed approaches. The simplicity of Naïve Bayes makes the system robust and effective throughout the detection process.


CONVERTER ◽  
2021 ◽  
pp. 829-836
Author(s):  
Yanjun Zhang

This study constructed a corpus by collecting samples of applications in exercise health information. Based on 8 extracted feature values, the study used the Fisher score algorithm to score each feature value and then formed 8 groups of feature values according to the score ranking. Automatic classification was performed with cross-validation using three methods: support vector machines, neural networks, and the Naive Bayes model. By analyzing the experimental results, the study derived the optimal feature combination and the optimal classification algorithm.
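The abstract does not say how the 8 groups are built from the score ranking. One plausible reading, assumed here, is that they are nested top-k groups: the best-scoring item alone, then the best two, and so on. The sketch below illustrates that reading with hypothetical scores and a hypothetical function name.

```python
def nested_groups(scores):
    """Rank items by score (highest first) and return the top-1, top-2, ... groups."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return [ranked[:k] for k in range(1, len(ranked) + 1)]


# Hypothetical Fisher scores for three items (the study used eight).
scores = [0.4, 2.1, 0.9]
groups = nested_groups(scores)
print(groups)  # [[1], [1, 2], [1, 2, 0]]
```

With n scored items this yields n candidate groups, which matches the count reported in the abstract; each group would then be evaluated with each classifier.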


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Linyan Li

This work aimed to investigate image feature recognition and clinical nursing of children’s rheumatoid arthritis (CRA)-related lung injury using the maximum correlation minimum redundancy algorithm of machine learning. In this study, 18 children with CRA in the hospital were selected as the rheumatoid group to explore the nursing method, and 18 healthy children were selected as the control group. The maximum correlation minimum redundancy algorithm was compared with the information gain algorithm and the Fisher score algorithm and applied to computed tomography (CT) images of the 18 CRA children. The classification accuracy of the algorithm in this study (94.52%) was higher than that of the information gain algorithm (88.64%) and the Fisher score algorithm (81.24%). The CT alveolitis score of children in the rheumatoid group (2.35 ± 0.72 points) was markedly higher than that of the control group (1.21 ± 0.24 points) (t = 2.147, P < 0.05). The nitric oxide level of children in the rheumatoid group (14.00 ppb) was considerably higher than that of the control group (10.00 ppb) (P < 0.05). CRA can cause a decline in lung function in children, while the level of nitric oxide exhaled by children can be used to assess the activity of RA. In addition, adopting active nursing methods can help children recover.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Atousa Ataei ◽  
Niloufar Seyed Majidi ◽  
Javad Zahiri ◽  
Mehrdad Rostami ◽  
S. Shahriar Arab ◽  
...  

Most current cancer treatment approaches are invasive and carry a broad spectrum of side effects. Furthermore, cancer drug resistance, known as chemoresistance, is a major obstacle during treatment. This study aims to predict the resistance of several cancer cell lines to the drug Cisplatin. In this paper, the NCBI GEO database was used to obtain data; the harvested data were then normalized and their batch effects were corrected with the ComBat software. To select appropriate features for machine learning, feature selection/reduction was performed based on the Fisher Score method. Six different machine learning algorithms were then used to detect Cisplatin-resistant and Cisplatin-sensitive samples in cancer cell lines. Moreover, Differentially Expressed Genes (DEGs) between all the sensitive and resistant samples were harvested. The selected genes were enriched in biological pathways using the Enrichr database. Topological analysis was then performed on the constructed networks using Cytoscape software. Finally, the biological description of the output genes from these analyses was investigated through a literature review. Among the six classifiers trained to distinguish Cisplatin-resistant samples from sensitive ones, the KNN and Naïve Bayes algorithms were proposed as the most suitable according to several calculated measures. Furthermore, the results of the systems biology analysis identified several potential chemoresistance genes, among which PTGER3, YWHAH, CTNNB1, ANKRD50, EDNRB, ACSL6, IFNG, and CTNNB1 are topologically more important than the others. These predictions pave the way for further experimental research.

