Classification of Parkinson speech data by metric learning

Author(s):  
Mahmut Kaya ◽  
Hasan Sakir Bilge
2021 ◽  
Vol 21 (S2) ◽  
Author(s):  
Kun Zeng ◽  
Yibin Xu ◽  
Ge Lin ◽  
Likeng Liang ◽  
Tianyong Hao

Abstract
Background: Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text with machine learning methods improves recruitment efficiency and thereby reduces the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data.
Methods: An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models: Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as the loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model so that features of different classes are easier to distinguish. Soft Voting is applied to produce the final classification of the ensemble model. The dataset comes from standard evaluation task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories.
Results: Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 with a standard t-test, indicating that our model achieved a significant improvement.
Conclusions: A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that our ensemble model significantly improved classification performance. In addition, metric learning improved the word embedding representation, and the focal loss reduced the impact of data imbalance on model performance.
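The two ensemble ingredients named in this abstract, focal loss and soft voting, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default `gamma` and `alpha` values, and the three-model probability table are hypothetical, and the focal loss follows the commonly used form -alpha * (1 - p_t)^gamma * log(p_t).

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)^gamma * log(p_t).

    probs is a list of class probabilities, target the index of the true class.
    gamma and alpha defaults are illustrative assumptions, not values from the paper.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

def soft_vote(prob_lists):
    """Average class probabilities over base models; return (label, averaged probs)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg

# Hypothetical outputs of three base models over three classes (not from the paper).
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
label, avg = soft_vote(model_probs)

# A confidently correct sample contributes far less loss than a poorly classified one.
easy = focal_loss([0.9, 0.05, 0.05], target=0)
hard = focal_loss([0.2, 0.5, 0.3], target=0)
print(label, easy < hard)
```

With `gamma=0` the focal loss reduces to ordinary cross-entropy; larger values of `gamma` down-weight well-classified samples more strongly, which is why the loss is often used on imbalanced data.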


2002 ◽  
Vol 112 (5) ◽  
pp. 2304-2304
Author(s):  
John Ch. Goddard Close ◽  
Fabiola M. Martinez Licona ◽  
Alma E. Martinez Licona ◽  
H. Leonardo Rufiner

2020 ◽  
Vol 12 (10) ◽  
pp. 1593
Author(s):  
Hongying Liu ◽  
Ruyi Luo ◽  
Fanhua Shang ◽  
Xuechun Meng ◽  
Shuiping Gou ◽  
...  

Recently, classification methods based on deep learning have achieved strong results on polarimetric synthetic aperture radar (PolSAR) data. However, they generally require a large amount of labeled data for training, which limits their real-world applicability. This paper proposes a novel semi-supervised deep metric learning network (SSDMLN) for feature learning and classification of PolSAR data. Inspired by distance metric learning, we construct a network that replaces the linear mapping of metric learning with a non-linear projection learned layer by layer. Using prior knowledge of the sample categories, the network also learns a distance metric under which pairs of samples with the same label are closer together and samples with different labels are farther apart. Moreover, we introduce a new manifold regularization that reduces the distance between neighboring samples, since they are more likely to be homogeneous. Classification is then performed by a simple classifier. Experiments on both synthetic and real-world PolSAR data from different sensors demonstrate the effectiveness of SSDMLN with limited labeled samples and show that it outperforms state-of-the-art methods.
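The contrast between the linear mapping of classical metric learning and a layer-by-layer non-linear projection can be sketched as follows. This is an illustration under stated assumptions, not the SSDMLN architecture: the helper names, the `tanh` activation, and the example weight matrices are all hypothetical.

```python
import math

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def linear_metric(L, x, y):
    """Classical metric learning: d(x, y) = ||Lx - Ly|| for a learned matrix L."""
    return euclidean(matvec(L, x), matvec(L, y))

def deep_metric(layers, x, y):
    """Deep variant: embed both points through stacked tanh layers, then compare."""
    def embed(v):
        for W in layers:
            v = [math.tanh(t) for t in matvec(W, v)]
        return v
    return euclidean(embed(x), embed(y))

# Hypothetical weights chosen by hand; in practice they would be learned.
L = [[2.0, 0.0], [0.0, 1.0]]
layers = [[[1.0, -1.0], [0.5, 0.5]], [[1.0, 0.0], [0.0, 1.0]]]

x, y = [1.0, 0.0], [0.0, 0.0]
print(linear_metric(L, x, y))     # stretches the first coordinate by a factor of 2
print(deep_metric(layers, x, y))  # distance in the non-linear embedding space
```

Training would adjust `L` or `layers` so that same-class pairs end up close and different-class pairs far apart; the sketch only shows how the distance is computed once the mapping is fixed.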


2021 ◽  
Vol 11 (15) ◽  
pp. 6959
Author(s):  
Zaky Dzulfikri ◽  
Pin-Wei Su ◽  
Chih-Yung Huang

Stamping processes remain crucial in manufacturing; therefore, diagnosing the condition of stamping tools is critical. One challenge is that the tools traditionally need to be checked visually, which requires halting production. With the development of Industry 4.0, intelligent monitoring systems have been developed that use accelerometers and algorithms to classify the wear of stamping tools. Although several deep learning models, such as the convolutional neural network (CNN), autoencoder (AE), and recurrent neural network (RNN), have demonstrated promising results for classifying complex signals including accelerometer signals, their practicality is restricted by limited flexibility in adding new classes and by low accuracy when only a few samples per class are available. In this study, we applied deep metric learning (DML) methods to overcome these problems. DML uses feature extraction modules to map inputs into meaningful embedding features. We compared the probability method, the contrastive method, and a triplet network to determine which was most suitable for our case. The experimental results revealed that, compared with the other models, a triplet network can be trained more effectively with limited training data. The triplet network also achieved the best results of the compared methods on the noisy test data. Finally, when tested on an unseen class, the triplet network and the probability method demonstrated similar results.
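The contrastive and triplet objectives compared in this abstract are usually written in the following textbook forms. The sketch below assumes those standard formulations and a hypothetical margin of 1.0; the paper's exact losses, embeddings, and hyperparameters are not given in the abstract.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def contrastive_loss(a, b, same_class, margin=1.0):
    """Pairwise loss: pull same-class pairs together, push others beyond the margin."""
    d = euclidean(a, b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: the positive should be closer to the anchor than the negative by a margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-D embeddings (illustrative values only).
anchor, near, far = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]

satisfied = triplet_loss(anchor, near, far)  # positive is near, negative is far
violated = triplet_loss(anchor, far, near)   # roles swapped, so the margin is violated
print(satisfied, violated)
```

Because the triplet loss only compares relative distances, a new class can in principle be handled by embedding a few reference samples and classifying by nearest neighbor, without retraining an output layer.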

