Classification of Parkinson speech data by metric learning

Abstract

Background: Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text with machine learning methods improves recruitment efficiency and reduces the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data.

Methods: An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models: Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as the loss function to address the data imbalance problem. Metric learning is employed to train the embeddings of each base model so that features are more discriminative. Soft Voting is applied to produce the final classification of the ensemble model. The dataset is from standard evaluation task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories.

Results: Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 with a standard t-test, indicating that the improvement was significant.

Conclusions: A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that the ensemble model significantly improved classification performance. In addition, metric learning improved the word embedding representation, and the focal loss reduced the impact of data imbalance on model performance.
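The two mechanisms named in the abstract, Focal Loss and Soft Voting, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the focusing parameter gamma=2.0, and the omission of any class-weighting term are assumptions made for illustration, since the abstract does not give these details.

```python
import math

def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    probs  : predicted class probabilities for the sample
    target : index of the true class
    gamma  : focusing parameter (assumed value, not from the abstract);
             gamma = 0 reduces to ordinary cross-entropy
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def soft_vote(model_probs):
    """Average class probabilities over models; return (label, averages)."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    avg = [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg

# Hypothetical outputs of three base models for one two-class sample.
label, avg = soft_vote([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
```

With gamma above zero, a confidently correct prediction contributes much less loss than it would under cross-entropy, so training focuses on hard and under-represented examples.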

Classification of Diagnosis of Alzheimer’s Disease Based on Convolutional Layers of VGG16 Model using Speech Data

2020 International Conference on Information and Communication Technology Convergence (ICTC) ◽

10.1109/ictc49870.2020.9289477 ◽

2020 ◽

Author(s):

Minwoo Kim ◽

Hyungjun Kim ◽

Joon S. Lim

Keyword(s):

Alzheimer's Disease ◽

Speech Data

String kernels for the classification of speech data

The Journal of the Acoustical Society of America ◽

10.1121/1.4779275 ◽

2002 ◽

Vol 112 (5) ◽

pp. 2304-2304

Author(s):

John Ch. Goddard Close ◽

Fabiola M. Martinez Licona ◽

Alma E. Martinez Licona ◽

H. Leonardo Rufiner

Keyword(s):

String Kernels ◽

Speech Data

Distance metric learning and support vector machines for classification of mass spectrometry proteomics data

International Journal of Knowledge Engineering and Soft Data Paradigms ◽

10.1504/ijkesdp.2009.028815 ◽

2009 ◽

Vol 1 (3) ◽

pp. 216 ◽

Cited By ~ 1

Author(s):

Qingzhong Liu ◽

Mengyu Qiao ◽

Andrew H. Sung

Keyword(s):

Mass Spectrometry ◽

Support Vector Machines ◽

Metric Learning ◽

Support Vector ◽

Distance Metric Learning ◽

Distance Metric ◽

Proteomics Data ◽

Vector Machines

Dimensionality Reduction and Classification of Hyperspectral Images Using Ensemble Discriminative Local Metric Learning

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2016.2645703 ◽

2017 ◽

Vol 55 (5) ◽

pp. 2509-2524 ◽

Cited By ~ 59

Author(s):

Yanni Dong ◽

Bo Du ◽

Liangpei Zhang ◽

Lefei Zhang

Keyword(s):

Dimensionality Reduction ◽

Metric Learning ◽

Hyperspectral Images

TWO-WAY METRIC LEARNING WITH MAJORITY AND MINORITY SUBSETS FOR CLASSIFICATION OF LARGE EXTREMELY IMBALANCED FACE DATASET

Jordanian Journal of Computers and Information Technology ◽

10.5455/jjcit.71-1626417940 ◽

2021 ◽

pp. 1

Author(s):

Ashu Kaushik ◽

Seba Susan

Keyword(s):

Metric Learning

Semi-Supervised Deep Metric Learning Networks for Classification of Polarimetric SAR Data

Remote Sensing ◽

10.3390/rs12101593 ◽

2020 ◽

Vol 12 (10) ◽

pp. 1593

Author(s):

Hongying Liu ◽

Ruyi Luo ◽

Fanhua Shang ◽

Xuechun Meng ◽

Shuiping Gou ◽

...

Keyword(s):

Real World ◽

Metric Learning ◽

Feature Learning ◽

Linear Mapping ◽

Layer By Layer ◽

Learning Networks ◽

Distance Metric ◽

Linear Projection ◽

Deep Metric Learning

Recently, classification methods based on deep learning have achieved good results on polarimetric synthetic aperture radar (PolSAR) data. However, they generally require a large amount of labeled data for training, which limits their real-world applicability. This paper proposes a novel semi-supervised deep metric learning network (SSDMLN) for feature learning and classification of PolSAR data. Inspired by distance metric learning, we construct a network that transforms the linear mapping of metric learning into a non-linear projection through layer-by-layer learning. Using prior knowledge of the sample categories, the network also learns a distance metric under which pairs of samples with the same label are closer together and pairs with different labels are farther apart. Moreover, we introduce a new manifold regularization that reduces the distance between neighboring samples, since they are more likely to be homogeneous. Classification is then performed by a simple classifier. Experiments on both synthetic and real-world PolSAR data from different sensors demonstrate that SSDMLN is effective with limited labeled samples and superior to state-of-the-art methods.
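The core idea of distance metric learning mentioned above, measuring distance after a learned mapping so that same-label pairs end up close and different-label pairs end up far apart, can be sketched as follows. This is a generic contrastive formulation using a linear map, not the SSDMLN objective, whose exact loss and manifold regularizer are not given in the abstract. The names mapped_distance and contrastive_loss and the margin value are assumptions for illustration.

```python
import math

def mapped_distance(L, x, y):
    """Euclidean distance between L*x and L*y, where L is a list of rows."""
    diff = [a - b for a, b in zip(x, y)]
    proj = [sum(w * d for w, d in zip(row, diff)) for row in L]
    return math.sqrt(sum(v * v for v in proj))

def contrastive_loss(dist, same_label, margin=1.0):
    """Penalize distance for same-label pairs; for different-label pairs,
    penalize only when they are closer than the (assumed) margin."""
    if same_label:
        return dist ** 2
    return max(0.0, margin - dist) ** 2

# Hypothetical 2x2 linear map that stretches the first feature.
L = [[2.0, 0.0], [0.0, 1.0]]
d = mapped_distance(L, [1.0, 0.0], [0.0, 0.0])
```

Minimizing such a loss over L adjusts the map so that the induced distance reflects label similarity. A deep variant would replace the matrix product with a non-linear network.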

Application of a GA/Bayesian Filter-Wrapper Feature Selection Method to Classification of Clinical Depression from Speech Data

Advances in Soft Computing - Soft Computing in Industrial Applications ◽

10.1007/978-3-540-70706-6_11 ◽

2007 ◽

pp. 115-121 ◽

Cited By ~ 4

Author(s):

Juan Torres ◽

Ashraf Saad ◽

Elliot Moore

Keyword(s):

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Clinical Depression ◽

Bayesian Filter ◽

Speech Data ◽

Wrapper Feature Selection

Stamping Tool Conditions Diagnosis: A Deep Metric Learning Approach

Applied Sciences ◽

10.3390/app11156959 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6959

Author(s):

Zaky Dzulfikri ◽

Pin-Wei Su ◽

Chih-Yung Huang

Keyword(s):

Neural Network ◽

Metric Learning ◽

Training Data ◽

Probability Method ◽

Learning Approach ◽

Test Results ◽

Complex Signals ◽

Intelligent Monitoring ◽

Deep Metric Learning

Stamping processes remain crucial in manufacturing; therefore, diagnosing the condition of stamping tools is critical. One of the challenges in diagnosing stamping tool conditions is that, traditionally, the tools must be checked visually, which requires halting production. With the development of Industry 4.0, intelligent monitoring systems that use accelerometers and algorithms have been developed to classify the wear of stamping tools. Although several deep learning models, such as convolutional neural network (CNN), autoencoder (AE), and recurrent neural network (RNN) models, have demonstrated promising results for classifying complex signals including accelerometer signals, the practicality of those methods is restricted by their limited flexibility for adding new classes and their low accuracy when few samples per class are available. In this study, we applied deep metric learning (DML) methods to overcome these problems. DML uses feature extraction modules to map inputs to meaningful embedding features. We compared the probability method, the contrastive method, and a triplet network to determine which method was most suitable for our case. The experimental results revealed that, compared with the other models, a triplet network can be trained more effectively with limited training data. The triplet network also demonstrated the best results of the compared methods on the noisy test data. Finally, when tested on an unseen class, the triplet network and the probability method demonstrated similar results.
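The objective typically used to train a triplet network can be sketched as follows. This is the standard triplet margin loss computed on embeddings that are already available, not the authors' network or training code. The function names and the margin value of 0.2 are assumptions for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is farther from the anchor than the positive
    by at least the margin; otherwise the size of the shortfall."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D embeddings: the same-class sample is near the anchor,
# and the other-class sample is far away.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Because the loss depends only on relative distances between embeddings, a new class can be handled by comparing its embeddings with stored examples instead of retraining a fixed output layer, which is the flexibility the abstract refers to.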

P‐2.3: The Classification of Panel Defects with a Few Samples Based on Metric Learning

SID Symposium Digest of Technical Papers ◽

10.1002/sdtp.14522 ◽

2021 ◽

Vol 52 (S1) ◽

pp. 467-467

Author(s):

Chunxu Chen ◽

Shengsen Zhang

Keyword(s):

Metric Learning
