imbalanced training data
Recently Published Documents

Total documents: 21 (five years: 9)

H-index: 4 (five years: 1)


Semi-supervised learning for medical image classification using imbalanced training data

Computer Methods and Programs in Biomedicine ◽

10.1016/j.cmpb.2022.106628 ◽

2022 ◽

pp. 106628

Author(s):

Tri Huynh ◽

Aiden Nibali ◽

Zhen He

Keyword(s):

Image Classification ◽

Supervised Learning ◽

Medical Image ◽

Training Data ◽

Medical Image Classification ◽

Imbalanced Training Data


A binary PSO-based ensemble under-sampling model for rebalancing imbalanced training data

The Journal of Supercomputing ◽

10.1007/s11227-021-04177-6 ◽

2021 ◽

Author(s):

Jinyan Li ◽

Yaoyang Wu ◽

Simon Fong ◽

Antonio J. Tallón-Ballesteros ◽

Xin-she Yang ◽

...

Keyword(s):

Training Data ◽

Under Sampling ◽

Imbalanced Training Data

A Few-Shot Learning-Based Siamese Capsule Network for Intrusion Detection with Imbalanced Training Data

Computational Intelligence and Neuroscience ◽

10.1155/2021/7126913 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

Zu-Min Wang ◽

Ji-Yu Tian ◽

Jing Qin ◽

Hui Fang ◽

Li-Ming Chen

Keyword(s):

Intrusion Detection ◽

Detection System ◽

Metric Learning ◽

Training Data ◽

Superior Performance ◽

Sampling Scheme ◽

Network Intrusion Detection ◽

Network Intrusion ◽

Imbalanced Training Data ◽

Dynamic Relationships

Network intrusion detection remains one of the major challenges in cybersecurity. In recent years, many machine-learning-based methods have been designed to capture dynamic and complex intrusion patterns and so improve the performance of intrusion detection systems. However, two issues, imbalanced training data and new unknown attacks, still hinder the development of a reliable network intrusion detection system. In this paper, we propose a novel few-shot learning-based Siamese capsule network to tackle the scarcity of abnormal network traffic training data and enhance the detection of unknown attacks. Specifically, the well-designed deep learning network excels at capturing dynamic relationships across traffic features. In addition, an unsupervised subtype sampling scheme is seamlessly integrated with the Siamese network to improve the detection of network intrusion attacks when the training data are imbalanced. Experimental results demonstrate that, once the sampling scheme is applied, the metric learning framework is better suited than other supervised learning methods to extracting the subtle and distinctive features needed to identify both known and unknown attacks. Compared to state-of-the-art methods, our proposed method achieves superior performance in detecting both types of attacks.


Phonetic Variation Modeling and a Language Model Adaptation for Korean English Code-Switching Speech Recognition

Applied Sciences ◽

10.3390/app11062866 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2866

Author(s):

Damheo Lee ◽

Donghyun Kim ◽

Seung Yun ◽

Sanghun Kim

Keyword(s):

Speech Recognition ◽

Language Model ◽

Reduction Rate ◽

Code Switching ◽

Training Data ◽

Target Domain ◽

Phonetic Variation ◽

Language Model Adaptation ◽

Imbalanced Training Data

In this paper, we propose a new method for code-switching (CS) automatic speech recognition (ASR) in Korean. First, the phonetic variations in English as pronounced by Korean speakers should be considered. Thus, we tried to find a unified pronunciation model based on phonetic knowledge and deep learning. Second, we extracted CS sentences semantically similar to the target domain and then applied language model (LM) adaptation to correct the modeling bias toward Korean caused by the imbalanced training data. In the experiments, the training data were AI Hub (1033 h) for Korean and Librispeech (960 h) for English. As a result, when compared to the baseline, the proposed method improved the error reduction rate (ERR) by up to 11.6% with phonetic variant modeling and by 17.3% when semantically similar sentences were applied to the LM adaptation. If we considered only English words, the word correction rate improved by up to 24.2% compared to that of the baseline. The proposed method seems to be very effective in CS speech recognition.


Dual Loss for Manga Character Recognition with Imbalanced Training Data

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9412282 ◽

2021 ◽

Author(s):

Yonggang Li ◽

Yafeng Zhou ◽

Yongtao Wang ◽

Xiaoran Qin ◽

Zhi Tang

Keyword(s):

Character Recognition ◽

Training Data ◽

Imbalanced Training Data


Modeling of Cu-Au prospectivity in the Carajás mineral province (Brazil) through machine learning: Dealing with imbalanced training data

Ore Geology Reviews ◽

10.1016/j.oregeorev.2020.103611 ◽

2020 ◽

Vol 124 ◽

pp. 103611

Author(s):

Elias Martins Guerra Prado ◽

Carlos Roberto de Souza Filho ◽

Emmanuel John M. Carranza ◽

João Gabriel Motta

Keyword(s):

Machine Learning ◽

Training Data ◽

Carajás Mineral Province ◽

Imbalanced Training Data


Deep Learning Case Study on Imbalanced Training Data for Automatic Bird Identification

Deep Learning: Algorithms and Applications - Studies in Computational Intelligence ◽

10.1007/978-3-030-31760-7_8 ◽

2019 ◽

pp. 231-262

Author(s):

Juha Niemi ◽

Juha T. Tanttu

Keyword(s):

Deep Learning ◽

Training Data ◽

Imbalanced Training Data


Neural Network Classifier-Based OPC With Imbalanced Training Data

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ◽

10.1109/tcad.2018.2824255 ◽

2019 ◽

Vol 38 (5) ◽

pp. 938-948

Author(s):

Suhyeong Choi ◽

Seongbo Shim ◽

Youngsoo Shin

Keyword(s):

Neural Network ◽

Training Data ◽

Neural Network Classifier ◽

Imbalanced Training Data


The Impact of Imbalanced Training Data on Local Matching Learning of Ontologies

Business Information Systems - Lecture Notes in Business Information Processing ◽

10.1007/978-3-030-20485-3_13 ◽

2019 ◽

pp. 162-175

Author(s):

Amir Laadhar ◽

Faiza Ghozzi ◽

Imen Megdiche ◽

Franck Ravat ◽

Olivier Teste ◽

...

Keyword(s):

Training Data ◽

Imbalanced Training Data

The impact of imbalanced training data on machine learning for author name disambiguation

Scientometrics ◽

10.1007/s11192-018-2865-9 ◽

2018 ◽

Vol 117 (1) ◽

pp. 511-526

Author(s):

Jinseok Kim ◽

Jenna Kim

Keyword(s):

Machine Learning ◽

Training Data ◽

Name Disambiguation ◽

Author Name Disambiguation ◽

Imbalanced Training Data