Unsupervised Minimum Redundancy Maximum Relevance Feature Selection for Predictive Maintenance

Identifying and selecting optimal prognostic health indicators in the context of predictive maintenance is essential to obtain a good model and make accurate predictions. Several metrics have been proposed in the past decade to quantify the relevance of those prognostic parameters. Other works have used the well-known minimum redundancy maximum relevance (mRMR) algorithm to select features that are both relevant and non-redundant. However, the relevance criterion is based on labelled machine malfunctions, which are not always available in real-life scenarios. In this paper, we develop prognostic mRMR feature selection, an adaptation of the conventional mRMR algorithm to situations where class labels are unknown a priori, which we call unsupervised feature selection. In addition, this paper proposes new metrics for computing the relevance and compares different methods to estimate redundancy between features. We show that using unsupervised feature selection, together with adapting relevance metrics with the dynamic time warping algorithm, helps increase the effectiveness of the selection of health indicators for a rotating machine case study.
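The greedy mRMR idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the relevance score used here (monotonicity of each feature over time) and the redundancy measure (absolute Pearson correlation) are assumptions chosen because they need no class labels, and the function names are hypothetical. The dynamic time warping adaptation mentioned in the abstract is not reproduced.

```python
import numpy as np

def monotonicity(x):
    """Label-free relevance score in [0, 1]: |#increases - #decreases| / (n - 1)."""
    d = np.diff(x)
    return abs(int(np.sum(d > 0)) - int(np.sum(d < 0))) / (len(x) - 1)

def unsupervised_mrmr(X, k):
    """Greedy mRMR: X has shape (time steps, features); returns k feature indices."""
    n_features = X.shape[1]
    relevance = np.array([monotonicity(X[:, j]) for j in range(n_features)])
    # Assumed redundancy measure: absolute Pearson correlation between features.
    redundancy = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        remaining = [j for j in range(n_features) if j not in selected]
        # Score = relevance minus mean redundancy with already-selected features.
        scores = [relevance[j] - redundancy[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected
```

With this scoring, a feature that is a rescaled copy of an already-selected one is penalised by its near-perfect correlation, so a less correlated trend is preferred even when both are equally monotonic.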

UNSUPERVISED FEATURE SELECTION USING INCREMENTAL LEAST SQUARES

International Journal of Information Technology & Decision Making ◽

10.1142/s0219622011004671 ◽

2011 ◽

Vol 10 (06) ◽

pp. 967-987 ◽

Cited By ~ 14

Author(s):

RONG LIU ◽

ROBERT RALLO ◽

YORAM COHEN

Keyword(s):

Feature Selection ◽

Least Squares ◽

Selection Process ◽

Real Life ◽

Feature Selection Method ◽

Least Square ◽

Feature Subset ◽

Selection Algorithm ◽

Forward Selection ◽

Unsupervised Feature Selection

An unsupervised feature selection method is proposed for analysis of datasets of high dimensionality. The least square error (LSE) of approximating the complete dataset via a reduced feature subset is proposed as the quality measure for feature selection. Guided by the minimization of the LSE, a kernel least squares forward selection algorithm (KLS-FS) is developed that is capable of both linear and non-linear feature selection. An incremental LSE computation is designed to accelerate the selection process and, therefore, enhances the scalability of KLS-FS to high-dimensional datasets. The superiority of the proposed feature selection algorithm, in terms of keeping principal data structures, learning performances in classification and clustering applications, and robustness, is demonstrated using various real-life datasets of different sizes and dimensions.
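As a rough illustration of selecting features by least-squares reconstruction error, the sketch below greedily adds the column that most reduces the error of approximating the whole dataset from the chosen subset. It is a simplification under stated assumptions: it is linear only (no kernel), it refits from scratch at every step rather than using the incremental LSE update the abstract describes, and `ls_forward_selection` is a hypothetical name, not the KLS-FS API.

```python
import numpy as np

def ls_forward_selection(X, k):
    """Greedily pick k columns of X whose span best reconstructs all of X."""
    X = X - X.mean(axis=0)  # centre so the fit needs no intercept term
    selected = []
    best_err = np.inf
    for _ in range(k):
        best_j, best_err = None, np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            S = X[:, selected + [j]]
            # Least-squares fit of every column of X on the candidate subset.
            coef, *_ = np.linalg.lstsq(S, X, rcond=None)
            err = float(np.sum((X - S @ coef) ** 2))
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
    # best_err is the reconstruction error of the final selected subset.
    return selected, best_err
```

Refitting for every candidate costs one least-squares solve per remaining feature per step, which is the cost an incremental error update is meant to avoid on high-dimensional data.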

HieRFIT: Hierarchical Random Forest for Information Transfer

10.1101/2020.09.16.300822 ◽

2020 ◽

Author(s):

Yasin Kaymaz ◽

Florian Ganglberger ◽

Ming Tang ◽

Francesc Fernandez-Albert ◽

Nathan Lawless ◽

...

Keyword(s):

Random Forest ◽

Information Transfer ◽

Probability Distributions ◽

A Priori ◽

Real Life ◽

Hierarchical Classification ◽

R Package ◽

Tree Structure ◽

Cell Type ◽

Class Labels

The emergence of single-cell RNA sequencing (scRNA-seq) has led to an explosion in novel methods to study biological variation among individual cells, and to classify cells into functional and biologically meaningful categories. Here, we present a new cell type projection tool, HieRFIT (Hierarchical Random Forest for Information Transfer), based on hierarchical random forests. HieRFIT uses a priori information about cell type relationships to improve classification accuracy, taking as input a hierarchical tree structure representing the class relationships, along with the reference data. We use an ensemble approach combining multiple random forest models, organized in a hierarchical decision tree structure. We show that our hierarchical classification approach improves accuracy and reduces incorrect predictions, especially for inter-dataset tasks, which reflect real-life applications. We use a scoring scheme that adjusts probability distributions for candidate class labels and resolves uncertainties while avoiding the assignment of cells to incorrect types by labeling cells at internal nodes of the hierarchy when necessary. Using HieRFIT, we re-analyzed publicly available scRNA-seq datasets, showing its effectiveness in cell type cross-projections with inter/intra-species examples. HieRFIT is implemented as an R package and is available at https://github.com/yasinkaymaz/HieRFIT/releases/tag/v1.0.0.
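The idea of labeling a cell at an internal node when the finer-grained call is uncertain can be sketched generically. The code below is not HieRFIT's scoring scheme (HieRFIT is an R package built on random forests). It is a plain top-down descent with a fixed, hypothetical confidence threshold, and the tree, the probabilities, and the name `assign_label` are all assumptions made for illustration.

```python
def assign_label(tree, probs, node="root", threshold=0.6):
    """Descend while one child is confident enough; otherwise stop at this node.

    tree maps a node name to its list of children (leaves are absent or empty).
    probs maps each non-root node to an assumed probability from a per-node classifier.
    """
    children = tree.get(node, [])
    if not children:
        return node  # reached a leaf cell type
    best = max(children, key=lambda c: probs[c])
    if probs[best] < threshold:
        return node  # uncertain: label the cell at the internal node
    return assign_label(tree, probs, best, threshold)


# Hypothetical hierarchy and probabilities, for illustration only.
example_tree = {"root": ["T cell", "B cell"], "T cell": ["CD4 T", "CD8 T"]}
example_probs = {"T cell": 0.9, "B cell": 0.1, "CD4 T": 0.55, "CD8 T": 0.45}
```

In this example the descent reaches "T cell" confidently but cannot separate the two subtypes at the chosen threshold, so the cell keeps the coarser label rather than a possibly wrong fine-grained one.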

Block Model Guided Unsupervised Feature Selection

Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining ◽

10.1145/3394486.3403173 ◽

2020 ◽

Author(s):

Zilong Bai ◽

Hoa Nguyen ◽

Ian Davidson

Keyword(s):

Feature Selection ◽

Block Model ◽

Unsupervised Feature Selection

Unsupervised Feature Selection With Extended OLSDA via Embedding Nonnegative Manifold Structure

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2020.3045053 ◽

2021 ◽

pp. 1-7

Author(s):

Rui Zhang ◽

Hongyuan Zhang ◽

Xuelong Li ◽

Sheng Yang

Keyword(s):

Feature Selection ◽

Unsupervised Feature Selection ◽

Manifold Structure

Unsupervised Feature Selection using Pseudo Label Approximation

2021 13th International Conference on Machine Learning and Computing ◽

10.1145/3457682.3457758 ◽

2021 ◽

Author(s):

Ren Deng ◽

Ye Liu ◽

Liyan Luo ◽

DongJing Chen ◽

Xijie Li

Keyword(s):

Feature Selection ◽

Unsupervised Feature Selection

A Systematic Review and Qualitative Synthesis Resulting in a Typology of Elementary Classroom Movement Integration Interventions

Sports Medicine - Open ◽

10.1186/s40798-019-0218-8 ◽

2020 ◽

Vol 6 (1) ◽

Cited By ~ 2

Author(s):

Spyridoula Vazou ◽

Collin A. Webster ◽

Gregory Stewart ◽

Priscila Candal ◽

Cate A. Egan ◽

...

Keyword(s):

Physical Activity ◽

Teacher Collaboration ◽

A Priori ◽

Real Life ◽

Dose Intensity ◽

Routine Practice ◽

Qualitative Synthesis ◽

Wide Range ◽

Meta Analyses ◽

Movement Integration

Background/Objective: Movement integration (MI) involves infusing physical activity into normal classroom time. A wide range of MI interventions have succeeded in increasing children’s participation in physical activity. However, no previous research has attempted to unpack the various MI intervention approaches. Therefore, this study aimed to systematically review, qualitatively analyze, and develop a typology of MI interventions conducted in primary/elementary school settings. Subjects/Methods: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed to identify published MI interventions. Irrelevant records were removed first by title, then by abstract, and finally by full texts of articles, resulting in 72 studies being retained for qualitative analysis. A deductive approach, using previous MI research as an a priori analytic framework, was used alongside inductive techniques to analyze the data. Results: Four types of MI interventions were identified and labeled based on their design: student-driven, teacher-driven, researcher-teacher collaboration, and researcher-driven. Each type was further refined based on the MI strategies (movement breaks, active lessons, other: opening activity, transitions, reward, awareness), the level of intrapersonal and institutional support (training, resources), and the delivery (dose, intensity, type, fidelity). Nearly half of the interventions were researcher-driven, which may undermine the sustainability of MI as a routine practice by teachers in schools. An imbalance is evident among the MI strategies: transitions, opening and awareness activities, and rewards have received limited study. Delivery should be further examined with a strong focus on reporting fidelity. Conclusions: There are distinct approaches that are most often employed to promote the use of MI, and these approaches may often lack a minimum standard for reporting MI intervention details. This typology may be useful to effectively translate the evidence into practice in real-life settings to better understand and study MI interventions.

Cross-view Locality Preserved Diversity and Consensus Learning for Multi-view Unsupervised Feature Selection

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2020.3048678 ◽

2021 ◽

pp. 1-1

Author(s):

Chang Tang ◽

Xiao Zheng ◽

Xinwang Liu ◽

Wei Zhang ◽

Jing Zhang ◽

...

Keyword(s):

Feature Selection ◽

Unsupervised Feature Selection

An Adaptive Unsupervised Feature Selection Algorithm Based on MDS for Tumor Gene Data Classification

Sensors ◽

10.3390/s21113627 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3627

Author(s):

Bo Jin ◽

Chunling Fu ◽

Yong Jin ◽

Wei Yang ◽

Shengbin Li ◽

...

Keyword(s):

Feature Selection ◽

Local Structure ◽

Gene Selection ◽

Dimensional Space ◽

Original Data ◽

Global Structure ◽

Biological Data ◽

Special Treatment ◽

Selection Scheme ◽

Unsupervised Feature Selection

Identifying the key genes related to tumors from gene expression data with a large number of features is important for the accurate classification of tumors and to make special treatment decisions. In recent years, unsupervised feature selection algorithms have attracted considerable attention in the field of gene selection as they can find the most discriminating subsets of genes, namely the potential information in biological data. Recent research also shows that maintaining the important structure of data is necessary for gene selection. However, most current feature selection methods merely capture the local structure of the original data while ignoring the importance of the global structure of the original data. We believe that the global structure and local structure of the original data are equally important, and so the selected genes should maintain the essential structure of the original data as far as possible. In this paper, we propose a new, adaptive, unsupervised feature selection scheme which not only reconstructs high-dimensional data into a low-dimensional space with the constraint of feature distance invariance but also employs ℓ2,1-norm to enable a matrix with the ability to perform gene selection embedding into the local manifold structure-learning framework. Moreover, an effective algorithm is developed to solve the optimization problem based on the proposed scheme. Comparative experiments with some classical schemes on real tumor datasets demonstrate the effectiveness of the proposed method.
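For readers unfamiliar with the ℓ2,1-norm mentioned above, the following sketch shows how it is computed and why it is commonly used for feature selection. It assumes a matrix W with one row per feature (gene), which is a common convention but is not stated in the abstract. The function names are hypothetical, and the optimization scheme of the paper is not reproduced.

```python
import numpy as np

def l21_norm(W):
    """l2,1-norm: sum over rows (features) of the Euclidean norm of each row."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def rank_features(W):
    """Order feature indices by decreasing row norm of W."""
    return np.argsort(-np.linalg.norm(W, axis=1)).tolist()
```

Penalising this norm tends to drive entire rows of W towards zero, so features whose rows keep a large norm after optimization are the ones retained.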
