feature group Latest Research Papers

Categorical Exploratory Data Analysis: From Multiclass Classification and Response Manifold Analytics Perspectives of Baseball Pitching Dynamics

Entropy ◽

10.3390/e23070792 ◽

2021 ◽

Vol 23 (7) ◽

pp. 792

Author(s):

Fushing Hsieh ◽

Elizabeth P. Chou

Keyword(s):

Data Analysis ◽

Exploratory Data Analysis ◽

Distance Measure ◽

Point Clouds ◽

Multiclass Classification ◽

Multiple Response ◽

Feature Group ◽

Information Contents ◽

Exploratory Data ◽

A Chain

All features of any data type are universally equipped with categorical nature revealed through histograms. A contingency table framed by two histograms affords directional and mutual associations based on rescaled conditional Shannon entropies for any feature-pair. The heatmap of the mutual association matrix of all features becomes a roadmap showing which features are highly associative with which features. We develop our data analysis paradigm called categorical exploratory data analysis (CEDA) with this heatmap as a foundation. CEDA is demonstrated to provide new resolutions for two topics: multiclass classification (MCC) with one single categorical response variable and response manifold analytics (RMA) with multiple response variables. We compute visible and explainable information contents with multiscale and heterogeneous deterministic and stochastic structures in both topics. MCC involves all feature-group specific mixing geometries of labeled high-dimensional point-clouds. Upon each identified feature-group, we devise an indirect distance measure, a robust label embedding tree (LET), and a series of tree-based binary competitions to discover and present asymmetric mixing geometries. Then, a chain of complementary feature-groups offers a collection of mixing geometric pattern-categories with multiple perspective views. RMA studies a system’s regulating principles via multiple dimensional manifolds jointly constituted by targeted multiple response features and selected major covariate features. This manifold is marked with categorical localities reflecting major effects. Diverse minor effects are checked and identified across all localities for heterogeneity. Both MCC and RMA information contents are computed for data’s information content with predictive inferences as by-products. We illustrate CEDA developments via Iris data and demonstrate its applications on data taken from the PITCHf/x database.
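The association scores described in this abstract can be illustrated with a small sketch. The abstract does not give the exact rescaling, so the code below assumes the directional score is the conditional entropy H(Y|X) divided by H(Y), and that the mutual score is the average of the two directions. The function names are hypothetical, and numeric features would first have to be binned into histogram categories.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in nats) of a list of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def directional_association(xs, ys):
    """Return H(Y|X) / H(Y) for two equally long sequences of category labels.

    Values near 0 mean X nearly determines Y; values near 1 mean X says
    almost nothing about Y. This normalisation is an assumption, not
    necessarily the rescaling used in the paper.
    """
    n = len(xs)
    table = Counter(zip(xs, ys))   # contingency table cells
    row_totals = Counter(xs)       # histogram of X
    col_totals = Counter(ys)       # histogram of Y
    h_y = entropy(list(col_totals.values()))
    if h_y == 0:
        return 0.0                 # Y is constant, nothing left to explain
    h_y_given_x = 0.0
    for x, row_n in row_totals.items():
        row = [table[(x, y)] for y in col_totals]
        h_y_given_x += (row_n / n) * entropy(row)
    return h_y_given_x / h_y


def mutual_association(xs, ys):
    """Symmetrised score: average of the two directional ratios (assumed)."""
    return 0.5 * (directional_association(xs, ys)
                  + directional_association(ys, xs))
```

Computing `mutual_association` for every feature pair would give a symmetric matrix that could be drawn as a heatmap of the kind the abstract mentions.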

Predicting the prognosis of hepatocellular carcinoma with the treatment of transcatheter arterial chemoembolization combined with microwave ablation using pretreatment MR imaging texture features

Abdominal Radiology ◽

10.1007/s00261-020-02891-y ◽

2021 ◽

Author(s):

Jun Liu ◽

Yigang Pei ◽

Yu Zhang ◽

Yifan Wu ◽

Fuquan Liu ◽

...

Keyword(s):

Microwave Ablation ◽

Proportional Hazards ◽

Transcatheter Arterial Chemoembolization ◽

Texture Features ◽

Cox Proportional Hazards ◽

Free Survival ◽

Feature Group ◽

Kaplan Meier ◽

Optimal Feature ◽

Arterial Chemoembolization

Objective: To investigate the prognostic value of baseline magnetic resonance imaging (MRI) texture analysis of hepatocellular carcinoma (HCC) treated with transcatheter arterial chemoembolization (TACE) and microwave ablation (MWA). Methods: In this retrospective study, MRI was performed on 102 patients with HCC before they received TACE combined with MWA. For each MRI sequence, the best 10 texture features were screened as a feature group by MaZda software using the mutual information coefficient (MI), nonlinear discriminant analysis (NDA) and other methods. The optimal feature group, with the lowest misdiagnosis rate between two groups dichotomized by 3-year survival, was obtained on one MRI sequence and was used to optimize the significant texture features with the optimal cutoff values. A Cox proportional hazards model was built from the significant texture features and clinical variables to determine the independent predictors of overall survival (OS). The predictive performance of the model was further evaluated by the area under the ROC curve (AUC). Kaplan–Meier analyses and log-rank tests were performed for disease-free survival (DFS) and local recurrence-free survival (LRFS). Results: The optimal feature group, with the lowest misdiagnosis rate of 8.82%, was obtained on T2WI using MI combined with NDA feature analysis. In the Cox proportional hazards regression models, the independent prognostic factors associated with OS were albumin (P = 0.047), BCLC stage (P = 0.001), Correlat(1,−1)T2 (P = 0.01) and SumEntrp(3,0)T2 (P = 0.015), and the multivariate model achieved an AUC of 0.876 (95% CI 0.803–0.949). Kaplan–Meier analyses further demonstrated that BCLC (P < 0.001), Correlat(1,−1)T2 (P = 0.023) and SumEntrp(3,0)T2 (P < 0.001) were associated with DFS, and that BCLC (P = 0.007) was related to LRFS. Conclusions: MR imaging texture features may be used to predict the prognosis of HCC treated with TACE combined with MWA.
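For readers unfamiliar with the Kaplan–Meier analysis named in this abstract, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the study's code, and the function name and input layout are assumptions made for illustration. In a setting like the one described, one curve would be computed per group (for example, patients above and below a texture-feature cutoff) and the curves compared with a log-rank test, which is not shown here.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time for each patient
    events:    1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival_probability), one entry per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Both events and censored cases leave the risk set.
        at_risk -= removed
    return curve
```

Censored patients lower the number at risk without producing a step in the curve, which is what distinguishes this estimator from a simple fraction of survivors.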

Feature Group Importance for Automated Essay Scoring

Lecture Notes in Computer Science - Multi-disciplinary Trends in Artificial Intelligence ◽

10.1007/978-3-030-80253-0_6 ◽

2021 ◽

pp. 58-70

Author(s):

Jih Soong Tan ◽

Ian K. T. Tan

Keyword(s):

Automated Essay Scoring ◽

Feature Group ◽

Essay Scoring

Flow Regime Identification in Vertical Upward Gas–Liquid Flow Using an Optical Sensor With Linear and Quadratic Discriminant Analysis

Journal of Fluids Engineering ◽

10.1115/1.4048613 ◽

2020 ◽

Vol 143 (2) ◽

Author(s):

Kwame Sarkodie ◽

Andrew Fergusson-Rees

Keyword(s):

Discriminant Analysis ◽

Flow Regime ◽

Liquid Flow ◽

Optical Sensor ◽

Flow Regimes ◽

Sensor Response ◽

Quadratic Discriminant Analysis ◽

Feature Group ◽

Flow Regime Identification ◽

Gas Liquid Flow

The accurate identification of gas–liquid flow regimes in pipes remains a challenge for the chemical process industries. This paper proposes a method for flow regime identification that combines responses from a nonintrusive optical sensor with linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) for vertical upward gas–liquid flow of air and water. A total of 165 flow conditions make up the dataset, collected from an experimental air–water flow loop with a transparent test section (TS) of 27.3 mm inner diameter and 5 m length. Selected features extracted from the sensor response are categorized into feature group 1 (average sensor response and standard deviation) and feature group 2, which also includes percentage counts of the calibrated responses for water and air. The selected features are used to train, cross-validate, and test four model cases (LDA1, LDA2, QDA1, and QDA2). The LDA models produce higher average test classification accuracies (both 95%) than the QDA models (80% QDA1 and 45% QDA2) due to misclassification associated with the slug and churn flow regimes. Results suggest that the LDA1 model case is the most stable, with the lowest average performance loss, and is therefore considered superior for flow regime identification. In future studies, a larger dataset may improve the stability and accuracy of the QDA models, and an extension of the conditions and parameters would be a useful test of applicability.
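The two feature groups in this abstract can be sketched as simple summaries of a sensor trace. The abstract does not say how the percentage counts are obtained, so the code below assumes a sample counts as "water" or "air" when it lies within a tolerance of the corresponding calibrated level. The function names, the tolerance parameter, and the use of the population standard deviation are all hypothetical choices.

```python
from statistics import mean, pstdev


def feature_group_1(signal):
    """Average sensor response and its (population) standard deviation."""
    return [mean(signal), pstdev(signal)]


def feature_group_2(signal, water_level, air_level, tol):
    """Feature group 1 plus percentage counts near the calibrated levels.

    water_level, air_level: calibrated sensor responses (assumed known)
    tol: half-width of the band treated as a match (an assumption)
    """
    n = len(signal)
    pct_water = 100 * sum(abs(v - water_level) <= tol for v in signal) / n
    pct_air = 100 * sum(abs(v - air_level) <= tol for v in signal) / n
    return feature_group_1(signal) + [pct_water, pct_air]
```

Each flow condition would yield one such feature vector, which could then be passed to an LDA or QDA classifier.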

Relieff Matching Feature Selection for Emotion Recognition Based on EEG Signal

10.21203/rs.3.rs-80078/v1 ◽

2020 ◽

Author(s):

Xiaodan Zhang ◽

Tao Li ◽

Yichong She ◽

Rui Zhao ◽

Jinxiang Du ◽

...

Keyword(s):

Feature Selection ◽

Emotion Recognition ◽

Wavelet Coefficient ◽

Wavelet Packet ◽

Single Subject ◽

Feature Group ◽

Group Data ◽

Global Threshold ◽

Reconstructed Signal ◽

Individual Specificity

This paper proposes ReliefF Matching Feature Selection (RMFS), which addresses the mismatch between individual specificity and a global threshold in emotion recognition. First, the EEG was decomposed into six emotion-related bands by wavelet packet, and EMD was then employed to extract 10 categories of features from the wavelet coefficients and the IMF components of the reconstructed signal. Second, an optimization formula for the feature group weight was proposed, based on the feature sets selected by ReliefF. It yields the weights of the different test features, which identify the globally optimal matching feature group and the corresponding matching channel, so that redundant information is eliminated and the problem of individual specificity is solved. Finally, SVM was employed to classify the test feature group data and obtain the emotion recognition results. The experimental results show that the average correct rates of RMFS for two-category classification of valence and arousal are 93.28% and 93.32%, and that the four-category rates are higher than 83%. The efficiency for a single subject using RMFS is improved by 42.65%, which is better than the traditional ReliefF algorithm.
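To make the ReliefF step concrete, here is a minimal sketch of the simpler original Relief weighting, not ReliefF itself and not the RMFS method from the paper. ReliefF extends this idea with k nearest neighbours and multi-class handling. The version below is deterministic (it visits every sample instead of drawing random ones) and assumes features are already scaled to [0, 1]; the function name is hypothetical.

```python
def relief_weights(X, y):
    """Basic Relief feature weights for a small labelled dataset.

    X: list of feature vectors with values scaled to [0, 1]
    y: list of class labels
    Each sample is compared with its nearest same-class neighbour (hit)
    and its nearest other-class neighbour (miss) under Manhattan distance.
    A feature gains weight when it differs on the miss and loses weight
    when it differs on the hit.
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # sample has no usable neighbour pair
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            diff_miss = abs(X[i][f] - X[miss][f])
            diff_hit = abs(X[i][f] - X[hit][f])
            weights[f] += (diff_miss - diff_hit) / n
    return weights
```

Features with large positive weights separate the classes well; a feature selection step would keep those and discard features whose weights are near zero or negative.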

A Framework for Feature Selection to Exploit Feature Group Structures

Advances in Knowledge Discovery and Data Mining - Lecture Notes in Computer Science ◽

10.1007/978-3-030-47426-3_61 ◽

2020 ◽

pp. 792-804

Author(s):

Kushani Perera ◽

Jeffrey Chan ◽

Shanika Karunasekera

Keyword(s):

Feature Selection ◽

Feature Group ◽

Group Structures

Automatic feature group combination selection method based on GA for the functional regions clustering in DBS

Computer Methods and Programs in Biomedicine ◽

10.1016/j.cmpb.2019.105091 ◽

2020 ◽

Vol 183 ◽

pp. 105091 ◽

Cited By ~ 2

Author(s):

Lei Cao ◽

Jie Li ◽

Yuanyuan Zhou ◽

Yunhui Liu ◽

Hao Liu

Keyword(s):

Selection Method ◽

Functional Regions ◽

Feature Group ◽

Combination Selection

Research on Cognitive Matching of Biological Morphological Features and Images for Profiling Design

E3S Web of Conferences ◽

10.1051/e3sconf/202017901015 ◽

2020 ◽

Vol 179 ◽

pp. 01015

Author(s):

Bin Zhou ◽

Li Lin

Keyword(s):

Response Time ◽

Product Form ◽

Time Data ◽

Bionic Design ◽

Feature Group ◽

Biological Form ◽

Compatibility Group ◽

Form Features ◽

Cognitive Measurement ◽

Shape Characteristics

To obtain high-quality bionic design schemes for product form, this paper explores the matching relationship between biological form features and their images, as perceived by users at the level of implicit cognition, providing an objective basis for the effective selection of ideographic biological form features in the bionic design of product form. An eye-movement experiment was used to screen the biomorphic feature group that attracted attention. A questionnaire survey and cluster analysis were used to obtain the main image phrases of the morphological feature group. The two sets of collected materials were combined with implicit cognitive measurement (IAT) to obtain the response time data of the subjects in the classification task. After the overall validity of the data was verified following Greenwald's method, the response times for the combinations of features and images in the compatibility group were sorted to derive design guidance. Taking the white-shouldered eagle as an example, the experimental data showed high validity by t-test, and the implicit effect value of the compatibility group was 0.68. According to the analysis of the data, the main image that best matches the head shape characteristics of the white-shouldered eagle is “Ferocious”, the main image that best matches the wing shape characteristics is “Lightsome”, and there is no difference in implicit cognitive attitude between men and women. Designers can take these results as a design reference to improve the effectiveness of the design output. This study can provide more objective suggestions for the bionic design of related product shapes.

Latent Feature Group Learning for High-Dimensional Data Clustering

Information ◽

10.3390/info10060208 ◽

2019 ◽

Vol 10 (6) ◽

pp. 208 ◽

Cited By ~ 1

Author(s):

Wenting Wang ◽

Yulin He ◽

Liheng Ma ◽

Joshua Zhexue Huang

Keyword(s):

Data Clustering ◽

Distance Measure ◽

Fitness Function ◽

Group Learning ◽

High Dimensional Data ◽

Nonnegative Matrix ◽

High Dimensional ◽

Factorization Problem ◽

Feature Group ◽

Feature Grouping

In this paper, we propose a latent feature group learning (LFGL) algorithm to discover the feature grouping structures and subspace clusters of high-dimensional data. The feature grouping structures, which are learned in an analytical way, can enhance the accuracy and efficiency of high-dimensional data clustering. In the LFGL algorithm, a Darwinian evolutionary process is used to explore the optimal feature grouping structures, which are coded as chromosomes in a genetic algorithm. The feature grouping weighting k-means algorithm is used as the fitness function to evaluate the chromosomes, that is, the feature grouping structures, in each generation of evolution. To better handle the diverse densities of clusters in high-dimensional data, the original feature grouping weighting k-means is revised to use a mass-based dissimilarity measure rather than the Euclidean distance measure, and the feature weights are optimized as a nonnegative matrix factorization problem under an orthogonality constraint on the feature weight matrix. The genetic operations of mutation and crossover are used to generate the new chromosomes for the next generation. In comparison with well-known clustering algorithms, the LFGL algorithm produced encouraging experimental results on real-world datasets, demonstrating its better performance when clustering high-dimensional data.
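The idea of coding a feature grouping as a chromosome can be sketched as follows. The abstract does not specify the encoding, so this example assumes the simplest one: gene i holds the group index of feature i. All function names are hypothetical, and the fitness function (the revised feature grouping weighting k-means in the paper) is deliberately left out.

```python
import random


def random_chromosome(n_features, n_groups, rng):
    """One gene per feature; each gene is that feature's group index."""
    return [rng.randrange(n_groups) for _ in range(n_features)]


def mutate(chrom, n_groups, rate, rng):
    """Reassign each feature to a random group with probability `rate`."""
    return [rng.randrange(n_groups) if rng.random() < rate else g
            for g in chrom]


def crossover(a, b, rng):
    """Single-point crossover producing two children."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def decode(chrom):
    """Turn a chromosome into {group index: [feature indices]}."""
    groups = {}
    for feature, g in enumerate(chrom):
        groups.setdefault(g, []).append(feature)
    return groups
```

A generation loop would score each decoded grouping with the fitness function, keep the best chromosomes, and apply `crossover` and `mutate` to produce the next population.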

An Approach to Discovering Product/Service Improvement Priorities: Using Dynamic Importance-Performance Analysis

Sustainability ◽

10.3390/su10103564 ◽

2018 ◽

Vol 10 (10) ◽

pp. 3564 ◽

Cited By ~ 5

Author(s):

Jiacong Wu ◽

Yu Wang ◽

Ru Zhang ◽

Jing Cai

Keyword(s):

Performance Analysis ◽

Customer Satisfaction ◽

Market Competition ◽

Service Improvement ◽

Target Product ◽

Customer Reviews ◽

Review Mining ◽

Feature Group ◽

Online Customer Reviews ◽

Product Service

The cost budget and resources of a business are limited. To remain sustainably competitive in the market, a business must discover the improvement priorities of its product/service features effectively and allocate its resources appropriately for higher customer satisfaction. Online customer review mining has been attracting increasing attention as a way for businesses to discover product/service improvement priorities from online customer reviews. Despite some prior related studies, existing methods have several limitations, such as simply using the frequency with which product features are mentioned in reviews as an indicator of importance, neglecting market competition, and focusing only on the static importance and performance of the target product/service features. To address those limitations, this study proposes a novel approach to discovering a product/service’s improvement priorities through dynamic importance-performance analysis of online customer reviews. It first clusters similar features into feature groups and calculates the relative performance of the feature groups using sentiment analysis. Next, the importance of each feature group’s performance to overall customer satisfaction is measured by the factor categories based on Kano’s model. The factor categories are determined by the significance values of each feature group in both positive and negative sentiment polarities, derived from the constructed decision tree. Finally, the feature improvement priorities of a target product/service are discovered from the dynamic performance trend and the predicted importance using a dynamic importance-performance analysis. The evaluation results show that the dynamic importance-performance analysis approach proposed in this study is a much better approach for discovering product/service improvement priorities than the product opportunity mining approach proposed in prior studies. This study makes new research contributions to the automatic discovery of product/service improvement priorities from large-scale online customer reviews. The proposed approach can also be used for product/service performance monitoring and customer needs analysis to improve product/service design and marketing campaigns.
