Unsupervised Minimum Redundancy Maximum Relevance Feature Selection for Predictive Maintenance

Author(s):  
Valentin Hamaide ◽  
François Glineur

Identifying and selecting optimal prognostic health indicators in the context of predictive maintenance is essential to obtain a good model and make accurate predictions. Several metrics have been proposed in the past decade to quantify the relevance of those prognostic parameters. Other works have used the well-known minimum redundancy maximum relevance (mRMR) algorithm to select features that are both relevant and non-redundant. However, the relevance criterion is based on labelled machine malfunctions, which are not always available in real-life scenarios. In this paper, we develop a prognostic mRMR feature selection method, an adaptation of the conventional mRMR algorithm to situations where class labels are a priori unknown, which we call unsupervised feature selection. In addition, this paper proposes new metrics for computing relevance and compares different methods to estimate redundancy between features. We show that using unsupervised feature selection and adapting relevance metrics with the dynamic time warping algorithm help increase the effectiveness of the selection of health indicators for a rotating machine case study.
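The greedy selection behind mRMR can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the label-free relevance score used here (a simple monotonicity measure) and the redundancy measure (absolute Pearson correlation) are stand-ins chosen for the example, and the names `pearson`, `monotonicity`, and `mrmr_select` are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

def monotonicity(signal):
    """Assumed label-free relevance: |#rises - #falls| / (n - 1)."""
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    rises = sum(d > 0 for d in diffs)
    falls = sum(d < 0 for d in diffs)
    return abs(rises - falls) / len(diffs)

def mrmr_select(features, k, relevance=monotonicity):
    """Greedy mRMR, difference form: at each step add the feature with the
    largest relevance minus mean |correlation| to the features already chosen."""
    rel = [relevance(f) for f in features]
    # Start from the single most relevant feature.
    selected = [max(range(len(features)), key=rel.__getitem__)]
    while len(selected) < k:
        def score(j):
            red = sum(abs(pearson(features[j], features[s])) for s in selected)
            return rel[j] - red / len(selected)
        candidates = [j for j in range(len(features)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected
```

Each feature is passed as a plain list of readings over time. A feature that is a rescaled copy of one already selected scores close to zero on the second step, so a less relevant but non-redundant feature is preferred over it.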

Author(s):  
RONG LIU ◽  
ROBERT RALLO ◽  
YORAM COHEN

An unsupervised feature selection method is proposed for the analysis of high-dimensional datasets. The least squares error (LSE) of approximating the complete dataset via a reduced feature subset is proposed as the quality measure for feature selection. Guided by the minimization of the LSE, a kernel least squares forward selection algorithm (KLS-FS) is developed that is capable of both linear and non-linear feature selection. An incremental LSE computation is designed to accelerate the selection process and thereby enhance the scalability of KLS-FS to high-dimensional datasets. The superiority of the proposed feature selection algorithm, in terms of preserving principal data structures, learning performance in classification and clustering applications, and robustness, is demonstrated using various real-life datasets of different sizes and dimensions.
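The idea of scoring a feature subset by how well it reconstructs the full dataset can be illustrated with a short sketch. It covers only the linear case and refits the least squares problem from scratch at every step; the kernel extension and the incremental LSE update described in the abstract are not reproduced here. The function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def reconstruction_lse(X, subset):
    """Squared error of the best linear reconstruction of all columns of X
    from the columns listed in `subset`."""
    Xs = X[:, subset]
    # Solve min_W ||X - Xs W||_F^2 for all target columns at once.
    W, *_ = np.linalg.lstsq(Xs, X, rcond=None)
    return float(np.sum((X - Xs @ W) ** 2))

def forward_select(X, k):
    """Greedily add the feature that lowers the reconstruction error most."""
    selected = []
    for _ in range(k):
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        best = min(candidates, key=lambda j: reconstruction_lse(X, selected + [j]))
        selected.append(best)
    return selected
```

If the columns of `X` span only a two-dimensional space, two selected features already reconstruct the whole matrix with (numerically) zero error, which is the sense in which the reduced subset keeps the structure of the data.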


2020 ◽  
Author(s):  
Yasin Kaymaz ◽  
Florian Ganglberger ◽  
Ming Tang ◽  
Francesc Fernandez-Albert ◽  
Nathan Lawless ◽  
...  

Abstract: The emergence of single-cell RNA sequencing (scRNA-seq) has led to an explosion in novel methods to study biological variation among individual cells, and to classify cells into functional and biologically meaningful categories. Here, we present a new cell type projection tool, HieRFIT (Hierarchical Random Forest for Information Transfer), based on hierarchical random forests. HieRFIT uses a priori information about cell type relationships to improve classification accuracy, taking as input a hierarchical tree structure representing the class relationships, along with the reference data. We use an ensemble approach combining multiple random forest models, organized in a hierarchical decision tree structure. We show that our hierarchical classification approach improves accuracy and reduces incorrect predictions, especially for inter-dataset tasks, which reflect real-life applications. We use a scoring scheme that adjusts probability distributions for candidate class labels and resolves uncertainties while avoiding the assignment of cells to incorrect types by labeling cells at internal nodes of the hierarchy when necessary. Using HieRFIT, we re-analyzed publicly available scRNA-seq datasets, showing its effectiveness in cell type cross-projections with inter/intra-species examples. HieRFIT is implemented as an R package and is available at https://github.com/yasinkaymaz/HieRFIT/releases/tag/v1.0.0.
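Labeling at an internal node when the finer distinction is uncertain can be sketched generically. The sketch below is written in Python rather than R and is not HieRFIT's scoring scheme: the per-node probabilities stand in for the output of a local classifier at each node, and the fixed `threshold`, the function name, and the example tree are all assumptions made for illustration.

```python
def assign_label(tree, child_probs, root="root", threshold=0.5):
    """Walk down the class tree while the local classifier is confident.

    tree        -- maps each internal node to its list of children
    child_probs -- maps each internal node to {child: probability}, standing in
                   for the output of that node's local classifier (hypothetical)
    Returns the deepest node reached; this is an internal node when no child
    clears `threshold`.
    """
    node = root
    while node in tree:
        probs = child_probs[node]
        best = max(tree[node], key=lambda c: probs.get(c, 0.0))
        if probs.get(best, 0.0) < threshold:
            break  # too uncertain: keep the coarser, internal label
        node = best
    return node
```

With a tree such as `{"root": ["immune", "stromal"], "immune": ["T cell", "B cell"]}`, a cell whose T cell and B cell probabilities are nearly equal is returned as "immune" rather than being forced into one of the two leaf types.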


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Spyridoula Vazou ◽  
Collin A. Webster ◽  
Gregory Stewart ◽  
Priscila Candal ◽  
Cate A. Egan ◽  
...  

Abstract
Background/Objective: Movement integration (MI) involves infusing physical activity into normal classroom time. A wide range of MI interventions have succeeded in increasing children’s participation in physical activity. However, no previous research has attempted to unpack the various MI intervention approaches. Therefore, this study aimed to systematically review, qualitatively analyze, and develop a typology of MI interventions conducted in primary/elementary school settings.
Subjects/Methods: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed to identify published MI interventions. Irrelevant records were removed first by title, then by abstract, and finally by full text, resulting in 72 studies being retained for qualitative analysis. A deductive approach, using previous MI research as an a priori analytic framework, was used alongside inductive techniques to analyze the data.
Results: Four types of MI interventions were identified and labeled based on their design: student-driven, teacher-driven, researcher-teacher collaboration, and researcher-driven. Each type was further refined based on the MI strategies (movement breaks, active lessons, other: opening activity, transitions, reward, awareness), the level of intrapersonal and institutional support (training, resources), and the delivery (dose, intensity, type, fidelity). Nearly half of the interventions were researcher-driven, which may undermine the sustainability of MI as a routine practice by teachers in schools. An imbalance is evident among the MI strategies, with transitions, opening and awareness activities, and rewards having received limited study. Delivery should be further examined with a strong focus on reporting fidelity.
Conclusions: There are distinct approaches that are most often employed to promote the use of MI, and these approaches may often lack a minimum standard for reporting MI intervention details. This typology may help translate the evidence into practice in real-life settings and support better understanding and study of MI interventions.


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3627
Author(s):  
Bo Jin ◽  
Chunling Fu ◽  
Yong Jin ◽  
Wei Yang ◽  
Shengbin Li ◽  
...  

Identifying the key genes related to tumors from gene expression data with a large number of features is important for the accurate classification of tumors and for making special treatment decisions. In recent years, unsupervised feature selection algorithms have attracted considerable attention in the field of gene selection as they can find the most discriminating subsets of genes, namely the potential information in biological data. Recent research also shows that maintaining the important structure of data is necessary for gene selection. However, most current feature selection methods capture only the local structure of the original data while ignoring the importance of its global structure. We believe that the global structure and local structure of the original data are equally important, and so the selected genes should maintain the essential structure of the original data as far as possible. In this paper, we propose a new, adaptive, unsupervised feature selection scheme which not only reconstructs high-dimensional data into a low-dimensional space under the constraint of feature distance invariance but also employs the ℓ2,1-norm to give a matrix the ability to perform gene selection, embedding it into the local manifold structure-learning framework. Moreover, an effective algorithm is developed to solve the optimization problem based on the proposed scheme. Comparative experiments with some classical schemes on real tumor datasets demonstrate the effectiveness of the proposed method.
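For readers unfamiliar with the ℓ2,1-norm: for a matrix W it is the sum of the Euclidean norms of the rows of W, so penalizing it pushes entire rows toward zero. When each row corresponds to one feature (here, one gene), the surviving rows indicate the selected features. The sketch below only computes the norm and ranks rows by their norm; it does not implement the optimization scheme proposed in the paper, and the function names are hypothetical.

```python
from math import sqrt

def l21_norm(W):
    """Sum over rows of each row's Euclidean norm."""
    return sum(sqrt(sum(w * w for w in row)) for row in W)

def rank_features(W, k):
    """Indices of the k rows of W with the largest Euclidean norm."""
    norms = [sqrt(sum(w * w for w in row)) for row in W]
    return sorted(range(len(W)), key=lambda i: -norms[i])[:k]
```

For `W = [[3, 4], [0, 0], [0, 1]]` the row norms are 5, 0 and 1, so the norm is 6 and the two highest-ranked features are rows 0 and 2.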


2021 ◽  
Author(s):  
Yan Min ◽  
Mao Ye ◽  
Liang Tian ◽  
Yulin Jian ◽  
Ce Zhu ◽  
...  
