TWO-WAY METRIC LEARNING WITH MAJORITY AND MINORITY SUBSETS FOR CLASSIFICATION OF LARGE EXTREMELY IMBALANCED FACE DATASET

Automated classification of clinical trial eligibility criteria text based on ensemble learning and metric learning

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01492-z ◽

2021 ◽

Vol 21 (S2) ◽

Author(s):

Kun Zeng ◽

Yibin Xu ◽

Ge Lin ◽

Likeng Liang ◽

Tianyong Hao

Keyword(s):

Clinical Trial ◽

Ensemble Learning ◽

Metric Learning ◽

Classification Performance ◽

Ensemble Model ◽

Automated Classification ◽

Eligibility Criteria ◽

Data Imbalance ◽

The Impact

Background: Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text using machine learning methods improves recruitment efficiency and thereby reduces the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data.

Methods: An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models including Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as the loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model so that features are easier to distinguish. Soft Voting is applied to produce the final classification of the ensemble model. The dataset is from standard evaluation Task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories.

Results: Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 under a standard t-test, indicating that our model achieved a significant improvement.

Conclusions: A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that our ensemble model significantly improved classification performance. In addition, metric learning improved the word embedding representation, and the focal loss reduced the impact of data imbalance on model performance.
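The two generic building blocks named in this abstract, focal loss and soft voting, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the default `gamma=2.0`, and the example probabilities are assumptions made for illustration, and the class-weighting term that focal loss often carries is omitted.

```python
import math


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    probs  : predicted class probabilities (assumed to sum to 1)
    target : index of the true class
    gamma  : focusing parameter (assumed default); gamma=0 gives cross-entropy
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def soft_vote(model_probs):
    """Soft voting: average the class probabilities of all base models.

    model_probs : one probability vector per base model
    Returns (predicted class index, averaged probability vector).
    """
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    avg = [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg


# Hypothetical outputs of three base models for one two-class sample.
label, avg = soft_vote([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
print(label, avg)

# A confidently correct prediction is down-weighted relative to cross-entropy.
print(focal_loss([0.9, 0.1], 0), focal_loss([0.9, 0.1], 0, gamma=0.0))
```

Because the factor (1 - p_t)^gamma shrinks the loss of well-classified samples, hard and typically rarer cases contribute relatively more to training, which is why focal loss is commonly used with imbalanced data.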

Distance metric learning and support vector machines for classification of mass spectrometry proteomics data

International Journal of Knowledge Engineering and Soft Data Paradigms ◽

10.1504/ijkesdp.2009.028815 ◽

2009 ◽

Vol 1 (3) ◽

pp. 216 ◽

Cited By ~ 1

Author(s):

Qingzhong Liu ◽

Mengyu Qiao ◽

Andrew H. Sung

Keyword(s):

Mass Spectrometry ◽

Support Vector Machines ◽

Metric Learning ◽

Support Vector ◽

Distance Metric Learning ◽

Distance Metric ◽

Proteomics Data ◽

Vector Machines

Dimensionality Reduction and Classification of Hyperspectral Images Using Ensemble Discriminative Local Metric Learning

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2016.2645703 ◽

2017 ◽

Vol 55 (5) ◽

pp. 2509-2524 ◽

Cited By ~ 59

Author(s):

Yanni Dong ◽

Bo Du ◽

Liangpei Zhang ◽

Lefei Zhang

Keyword(s):

Dimensionality Reduction ◽

Metric Learning ◽

Hyperspectral Images

Semi-Supervised Deep Metric Learning Networks for Classification of Polarimetric SAR Data

Remote Sensing ◽

10.3390/rs12101593 ◽

2020 ◽

Vol 12 (10) ◽

pp. 1593

Author(s):

Hongying Liu ◽

Ruyi Luo ◽

Fanhua Shang ◽

Xuechun Meng ◽

Shuiping Gou ◽

...

Keyword(s):

Real World ◽

Metric Learning ◽

Feature Learning ◽

Linear Mapping ◽

Layer By Layer ◽

Learning Networks ◽

Distance Metric ◽

Linear Projection ◽

Deep Metric Learning

Recently, classification methods based on deep learning have attained sound results for the classification of polarimetric synthetic aperture radar (PolSAR) data. However, they generally require a great deal of labeled data to train their models, which limits their potential real-world applications. This paper proposes a novel semi-supervised deep metric learning network (SSDMLN) for feature learning and classification of PolSAR data. Inspired by distance metric learning, we construct a network that transforms the linear mapping of metric learning into a non-linear projection through layer-by-layer learning. With prior knowledge of the sample categories, the network also learns a distance metric under which all pairs of similarly labeled samples are closer and dissimilar samples have larger relative distances. Moreover, we introduce a new manifold regularization to reduce the distance between neighboring samples, since they are more likely to be homogeneous. Classification is performed by a simple classifier. Several experiments on both synthetic and real-world PolSAR data from different sensors demonstrate the effectiveness of SSDMLN with limited labeled samples and show that SSDMLN is superior to state-of-the-art methods.
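The general shape of such an objective can be sketched as a supervised pairwise term plus a neighborhood penalty. The abstract does not give the SSDMLN loss, so the sketch below swaps in a standard contrastive pair loss and a plain squared-distance regularizer; the function names, `margin`, and `lam` are assumptions.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pair_loss(emb, labels, margin=1.0):
    """Contrastive loss over all labeled pairs.

    Same-label pairs are pulled together; different-label pairs are pushed
    apart until they are at least `margin` away from each other.
    """
    total = 0.0
    for i in range(len(emb)):
        for j in range(i + 1, len(emb)):
            if labels[i] == labels[j]:
                total += sq_dist(emb[i], emb[j])
            else:
                d = math.sqrt(sq_dist(emb[i], emb[j]))
                total += max(0.0, margin - d) ** 2
    return total


def manifold_penalty(emb, neighbor_pairs):
    """Sum of squared distances between neighboring samples.

    Needs no labels, so it can also use unlabeled samples.
    """
    return sum(sq_dist(emb[i], emb[j]) for i, j in neighbor_pairs)


def objective(emb, labels, all_emb, neighbor_pairs, lam=0.1, margin=1.0):
    """Supervised pair loss plus a weighted neighborhood regularizer."""
    return pair_loss(emb, labels, margin) + lam * manifold_penalty(all_emb, neighbor_pairs)


# Hypothetical embeddings: two samples of class 0 and one of class 1.
labeled = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
print(objective(labeled, [0, 0, 1], labeled, [(0, 2)], lam=0.1))
```

In a real network the embeddings would be the outputs of the non-linear layers and the objective would be minimized by gradient descent; here the embeddings are fixed so that only the loss structure is shown.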

Classification of Parkinson speech data by metric learning

2017 International Artificial Intelligence and Data Processing Symposium (IDAP) ◽

10.1109/idap.2017.8090285 ◽

2017 ◽

Author(s):

Mahmut Kaya ◽

Hasan Sakir Bilge

Keyword(s):

Metric Learning ◽

Speech Data

Stamping Tool Conditions Diagnosis: A Deep Metric Learning Approach

Applied Sciences ◽

10.3390/app11156959 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6959

Author(s):

Zaky Dzulfikri ◽

Pin-Wei Su ◽

Chih-Yung Huang

Keyword(s):

Neural Network ◽

Metric Learning ◽

Training Data ◽

Probability Method ◽

Learning Approach ◽

Test Results ◽

Complex Signals ◽

Intelligent Monitoring ◽

Deep Metric Learning

Stamping processes remain crucial in manufacturing; therefore, diagnosing the condition of stamping tools is critical. One of the challenges in diagnosing stamping tool conditions is that, traditionally, the tools need to be visually checked, and the production processes thus need to be halted. With the development of Industry 4.0, intelligent monitoring systems have been developed that use accelerometers and algorithms to classify the wear of stamping tools. Although several deep learning models such as the convolutional neural network (CNN), autoencoder (AE), and recurrent neural network (RNN) have demonstrated promising results for classifying complex signals, including accelerometer signals, the practicality of those methods is restricted by their limited flexibility in adding new classes and their low accuracy when only a few samples per class are available. In this study, we applied deep metric learning (DML) methods to overcome these problems. DML involves extracting meaningful features using feature extraction modules to map inputs into embedding features. We compared the probability method, the contrastive method, and a triplet network to determine which method was most suitable for our case. The experimental results revealed that, compared with other models, a triplet network can be more effectively trained with limited training data. The triplet network demonstrated the best test results of the compared methods on the noisy test data. Finally, when tested on an unseen class, the triplet network and the probability method demonstrated similar results.
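As an illustration of the triplet idea mentioned above, the following sketch shows the standard triplet margin loss and a nearest-embedding classifier. It is not the authors' model: the embeddings would normally come from a trained feature extractor, and the class names `"new"` and `"worn"`, the margin, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss.

    Zero once the negative is at least `margin` farther from the anchor
    than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def nearest_class(query, support):
    """Classify a query embedding by its nearest support embedding.

    support : dict mapping a class label to a list of embeddings. A class
    can be added by inserting a few embeddings, without retraining a
    fixed-size output layer.
    """
    candidates = [(euclidean(query, e), label)
                  for label, embs in support.items() for e in embs]
    return min(candidates, key=lambda t: t[0])[1]


# Hypothetical 2-D embeddings for two tool conditions.
support = {"new": [[0.0, 0.0]], "worn": [[1.0, 1.0]]}
print(nearest_class([0.1, 0.0], support))
print(triplet_loss([0.0, 0.0], [0.0, 2.0], [0.0, 1.0], margin=0.5))
```

Classifying by distance in the embedding space is what lets a metric learning model handle a class that was not present during training, as long as a few reference embeddings for that class are available.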

P‐2.3: The Classification of Panel Defects with a Few Samples Based on Metric Learning

SID Symposium Digest of Technical Papers ◽

10.1002/sdtp.14522 ◽

2021 ◽

Vol 52 (S1) ◽

pp. 467-467

Author(s):

Chunxu Chen ◽

Shengsen Zhang

Keyword(s):

Metric Learning

Deep Metric Learning Network using Proxies for Chromosome Classification and Retrieval in Karyotyping Test

10.1101/2020.05.24.113936 ◽

2020 ◽

Author(s):

Hwejin Jung ◽

Bogyu Park ◽

Sangmun Lee ◽

Seungwoo Hyun ◽

Jinah Lee ◽

...

Keyword(s):

Metric Learning ◽

Careful Analysis ◽

Deep Convolutional Neural Networks ◽

Learning Network ◽

Backbone Network ◽

Network Training ◽

Linear Layer ◽

Chromosome Classification ◽

Fully Connected

In karyotyping, the classification of chromosomes is a tedious, complicated, and time-consuming process. It requires extremely careful analysis of chromosomes by well-trained cytogeneticists. To assist cytogeneticists in karyotyping, we introduce Proxy-ResNeXt-CBAM, a metric learning-based network that uses proxies together with a convolutional block attention module (CBAM) and is designed for chromosome classification. ResNeXt-50 is used as the backbone network. To apply metric learning, the fully connected linear layer of the backbone network (ResNeXt-50) is removed and replaced with CBAM. The similarity between the embeddings, which are the outputs of the metric learning network, and the proxies is measured for network training. Proxy-ResNeXt-CBAM is validated on a public chromosome image dataset, and it achieves an accuracy of 95.86%, a precision of 95.87%, a recall of 95.9%, and an F1 score of 95.79%. Proxy-ResNeXt-CBAM, the metric learning network using proxies, outperforms the baseline networks. In addition, the results of our embedding analysis demonstrate the effectiveness of using proxies in metric learning for optimizing deep convolutional neural networks. As the embedding analysis results show, Proxy-ResNeXt-CBAM obtains a 94.78% Recall@1 in image retrieval, and the embeddings of each chromosome are well clustered according to their similarity.
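A rough sketch of proxy-based training and of the Recall@1 retrieval metric follows. The abstract does not state the exact loss, so a softmax cross-entropy over scaled cosine similarities to one proxy per class is assumed here; `scale`, the function names, and the toy vectors are illustrative only.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def proxy_loss(embedding, proxies, target, scale=10.0):
    """Cross-entropy over scaled similarities between an embedding and proxies.

    proxies : one vector per class (learnable in a real network)
    target  : index of the true class
    """
    logits = [scale * cosine(embedding, p) for p in proxies]
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[target]


def predict(embedding, proxies):
    """Assign the class whose proxy is most similar to the embedding."""
    sims = [cosine(embedding, p) for p in proxies]
    return max(range(len(proxies)), key=sims.__getitem__)


def recall_at_1(embeddings, labels):
    """Fraction of samples whose most similar other sample shares their label."""
    hits = 0
    for i, e in enumerate(embeddings):
        others = [j for j in range(len(embeddings)) if j != i]
        best = max(others, key=lambda j: cosine(e, embeddings[j]))
        hits += labels[best] == labels[i]
    return hits / len(embeddings)


proxies = [[1.0, 0.0], [0.0, 1.0]]
print(predict([0.9, 0.1], proxies), proxy_loss([0.9, 0.1], proxies, target=0))
```

Comparing each embedding against a small set of class proxies, rather than against other samples, keeps the number of comparisons per sample equal to the number of classes.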

Kernel Regression based Sparse Metric Learning for Extensive Classification of Visual Art Images

10.1109/icses52305.2021.9633776 ◽

2021 ◽

Author(s):

D N S Ravi Kumar ◽

G T Sundarrajan ◽

S D Sundarsingh Jebaseelan ◽

M. Pushpavalli ◽

A Rameshbabu ◽

...

Keyword(s):

Visual Art ◽

Metric Learning ◽

Kernel Regression

Adaptive Metric Learning Vector Quantization for Ordinal Classification

Neural Computation ◽

10.1162/neco_a_00358 ◽

2012 ◽

Vol 24 (11) ◽

pp. 2825-2851 ◽

Cited By ~ 12

Author(s):

Shereen Fouad ◽

Peter Tino

Keyword(s):

Vector Quantization ◽

Regression Models ◽

Detrimental Effect ◽

Pattern Analysis ◽

Metric Learning ◽

Computational Cost ◽

Learning Vector Quantization ◽

Nonlinear Classification ◽

Nominal Classification

Many pattern analysis problems require classification of examples into naturally ordered classes. In such cases, nominal classification schemes will ignore the class order relationships, which can have a detrimental effect on classification accuracy. This article introduces two novel ordinal learning vector quantization (LVQ) schemes, with metric learning, specifically designed for classifying data items into ordered classes. In ordinal LVQ, unlike in nominal LVQ, the class order information is used during training in selecting the class prototypes to be adapted, as well as in determining the exact manner in which the prototypes get updated. Prototype-based models in general are more amenable to interpretations and can often be constructed at a smaller computational cost than alternative nonlinear classification models. Experiments demonstrate that the proposed ordinal LVQ formulations compare favorably with their nominal counterparts. Moreover, our methods achieve competitive performance against existing benchmark ordinal regression models.
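The idea of letting class order decide which prototypes are adapted, and how strongly, can be sketched with a deliberately simplified update rule. This is not the algorithm from the article: the tolerance `tol`, the step-size weighting, and the function name are assumptions, and the adaptive metric itself is left out.

```python
def ordinal_lvq_step(prototypes, proto_ranks, x, y_rank, lr=0.1, tol=1):
    """One simplified ordinal-aware prototype update; returns new prototypes.

    prototypes  : list of prototype vectors
    proto_ranks : ordinal class rank of each prototype (0, 1, 2, ...)
    x, y_rank   : training sample and its ordinal class rank
    tol         : prototypes within `tol` ranks of y_rank count as correct
    """
    max_gap = max(abs(r - y_rank) for r in proto_ranks) or 1
    updated = []
    for w, r in zip(prototypes, proto_ranks):
        gap = abs(r - y_rank)
        if gap <= tol:
            # Attract: prototypes of closer ranks move more strongly toward x.
            step = lr * (1.0 - gap / (tol + 1))
            new_w = [wi + step * (xi - wi) for wi, xi in zip(w, x)]
        else:
            # Repel: prototypes of more distant ranks are pushed away harder.
            step = lr * gap / max_gap
            new_w = [wi - step * (xi - wi) for wi, xi in zip(w, x)]
        updated.append(new_w)
    return updated


# Three 1-D prototypes with ranks 0, 1, 3 and a sample of rank 1 at x = 2.0.
print(ordinal_lvq_step([[0.0], [1.0], [4.0]], [0, 1, 3], [2.0], 1, lr=0.5))
```

A nominal LVQ rule would treat the rank-0 and rank-3 prototypes identically as wrong classes; the ordinal variant above treats the rank-0 prototype as nearly correct because it is only one rank away from the sample's class.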
