data imbalance
Recently Published Documents


TOTAL DOCUMENTS

152
(FIVE YEARS 114)

H-INDEX

10
(FIVE YEARS 4)

Author(s):  
Tilman Krokotsch ◽  
Mirko Knaak ◽  
Clemens Gühmann

Remaining useful life (RUL) estimation plays a vital role in effectively scheduling maintenance operations. Unfortunately, it suffers from a severe data imbalance: data from machines near their end of life is rare. Additionally, the data produced by a machine can only be labeled after the machine has failed. Both of these points make using data-driven methods for RUL estimation difficult. Semi-Supervised Learning (SSL) can incorporate the unlabeled data produced by machines that have not yet failed into data-driven methods. Previous work on SSL evaluated approaches under unrealistic conditions where the data near failure was still available. Even so, only moderate improvements were made. This paper defines more realistic evaluation conditions and proposes a novel SSL approach based on self-supervised pre-training. The method can outperform two competing approaches from the literature and the supervised baseline on the NASA Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) dataset.
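The labeling constraint described in this abstract can be illustrated with a small sketch. It is not the authors' method or evaluation protocol; the function names, the fleet values, and the near-failure threshold are hypothetical, and the failure times of running machines are only known here because the example is simulated.

```python
def rul_labels(n_cycles):
    """RUL label per time step of a run-to-failure series: cycles left until failure."""
    return [n_cycles - t for t in range(1, n_cycles + 1)]


def split_fleet(failure_times, observed_cycles):
    """Split a simulated fleet into labeled (failed) and unlabeled (still running) machines.

    failure_times: machine -> total cycles at failure (known only in simulation)
    observed_cycles: machine -> cycles recorded so far
    """
    labeled, unlabeled = {}, {}
    for machine, seen in observed_cycles.items():
        if seen >= failure_times[machine]:
            # The machine has failed, so every time step can be labeled.
            labeled[machine] = rul_labels(failure_times[machine])
        else:
            # Still running: the data exists but no RUL label can be assigned yet.
            unlabeled[machine] = seen
    return labeled, unlabeled


def near_failure_fraction(labeled, threshold):
    """Share of labeled time steps whose RUL is at most `threshold` cycles."""
    all_labels = [r for labels in labeled.values() for r in labels]
    return sum(1 for r in all_labels if r <= threshold) / len(all_labels)


# Hypothetical fleet: A and B ran to failure, C is still in operation.
failure_times = {"A": 200, "B": 150, "C": 300}
observed_cycles = {"A": 200, "B": 150, "C": 120}
labeled, unlabeled = split_fleet(failure_times, observed_cycles)
fraction = near_failure_fraction(labeled, threshold=20)
```

In this toy fleet only 42 of the 350 labeled time steps lie within 20 cycles of failure, and machine C contributes 120 time steps with no labels at all, which is the data an SSL approach would try to exploit.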


2022 ◽  
Author(s):  
Erqiang Deng ◽  
Zhiguang Qin ◽  
Dajiang Chen ◽  
Zhen Qin ◽  
Yi Ding ◽  
...  

Deep learning has been widely used in medical image segmentation, although its accuracy is affected by the problems of small sample space, data imbalance, and cross-device differences. To address these issues, an enhancement GAN network is proposed that uses the domain transferring of the generative adversarial network to enhance the original medical images. Specifically, while retaining the transferability of the original GAN network, a new optimizer is added to generate a sample space with a continuous distribution, which can be used as the target domain for transferring the original images. The optimizer back-propagates the labels of the supervised data set through the segmentation network and maps the discrete distribution of the labels to a continuous image distribution, which has a high similarity to the original image but improves the segmentation efficiency. On this basis, the optimized distribution is taken as the target domain, and the generator and discriminator of the GAN network are trained so that the generator can transfer the original image distribution to the target distribution. Extensive experiments are conducted on MRI, CT, and ultrasound data sets. The experimental results show that the proposed method generalizes well in medical image segmentation, even when the data set has a limited sample space and a certain degree of data imbalance.


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Lei Wang ◽  
Qian Li ◽  
Jin Qin

Fault diagnosis and detection have become important in modern production because of the central role of rotating equipment. Artificial neural network pattern recognition methods are widely utilized in rotating equipment fault detection. These methods often need a large quantity of sample data to train the model; however, sample data (especially fault samples) are scarce in engineering. Preliminary work focuses on dimensionality reduction for big data sets using semisupervised methods. The rotary machine’s polar coordinate signal is used to build a GAN network structure. ANN and small samples are utilized to identify DCGAN model flaws. The time-conditional generative adversarial network (TCGAN) is proposed for one-dimensional vibration signal fault identification under data imbalance. Finally, auxiliary samples are gathered under similar conditions, and CNNs learn the characteristics of the target samples. Convolutional neural networks handle the problem of fault identification with small samples in different ways. In high-dimensional data sets with nonlinearities, low fault type recognition rates and a shortage of labeled fault samples may be addressed using kernel semisupervised local Fisher discriminant analysis. The SELF method is used to build the optimum projection transformation matrix from the data set. The KNN classifier then learns the low-dimensional features and identifies the fault type. Because DCGAN training is unstable and the results are inaccurate, an improved deep convolutional generative adversarial network (IDCGAN) is proposed. The tests indicate that the IDCGAN generates more realistic samples and solves the problem of fault identification with small samples. Data augmentation with the time-conditional generative adversarial network lowers fault diagnosis effort and deep learning model complexity. The TCGAN and CNN are combined to provide superior fault detection under data imbalance. Modeling and experiments demonstrate TCGAN’s usefulness and superiority.


2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Yi Liu ◽  
Qi Chang ◽  
Jiaxin Luo ◽  
Lin Li ◽  
Junfeng Man ◽  
...  

Under different transportation protection conditions, the sample data of bogie traction motor bearings of urban rail vehicles are seriously imbalanced, and the fault diagnosis ability and generalization effect are poor, which makes it difficult to evaluate the protection effect of bearings effectively. In this paper, a multimeasure hybrid evaluation model based on compressed sensing is proposed to evaluate the effect of bearing transportation protection under data imbalance. Firstly, bearing vibration signals under different transport protection conditions were compressed and sampled, and the original high-dimensional feature set in the time domain, frequency domain, and time-frequency domain was extracted. Then, a multimeasure mixed feature evaluation model of correlation, distance, and signal was constructed, and the optimal multimeasure combination strategy was selected by using a comprehensive sensitivity score as the evaluation index. Finally, an evaluation model of bearing protection effect based on a unified feature index was constructed from the best feature subset, and the unified indicator was quantified to characterize the protection effect of different protection states. The experimental results show that the model can effectively evaluate bearings under different transport protection conditions.


2021 ◽  
Author(s):  
Bingshu Wang ◽  
Lanfan Jiang ◽  
Wenxing Zhu ◽  
Longkun Guo ◽  
Jianli Chen ◽  
...  

2021 ◽  
Author(s):  
Shenyang Chen ◽  
QingXiong Tan ◽  
JingChen Li ◽  
Yu Li

A signal peptide is a short peptide located at the N-terminus of proteins. It plays an important role in targeting and transferring transmembrane proteins and secreted proteins to their correct positions. Compared with traditional experimental methods to identify and discover signal peptides, computational methods are faster and more efficient, which makes them more practical for the analysis of thousands or even millions of protein sequences, especially for metagenomic data. Therefore, computational tools have recently been proposed to classify signal peptides and predict cleavage site positions, but most of them disregard the extreme data imbalance problem in these tasks. In addition, almost all these methods rely on additional group information of proteins to boost their performance, which, however, may not always be available. To deal with these issues, in this paper, we present Unbiased Organism-agnostic Signal Peptide Network (USPNet), a signal peptide prediction and cleavage site prediction model based on a deep protein language model. We propose to use label distribution-aware margin (LDAM) loss and evolutionary scale modeling (ESM) embedding to handle data imbalance and object-dependence problems. Extensive experimental results demonstrate that the proposed method significantly outperforms all the previous methods in classification performance. An additional study on simulated metagenomic data further indicates that our model is a more universal and robust tool without dependency on additional group information of proteins, with the Matthews correlation coefficient improved by up to 17.5‰. The proposed method will be potentially useful to discover new signal peptides from the abundant metagenomic data.
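The LDAM loss mentioned in this abstract assigns each class a margin that grows as the class gets rarer (proportional to n_j^(-1/4) for a class with n_j training samples) and subtracts it from the true-class logit before the usual cross-entropy. The following is a minimal sketch of that idea in plain Python, not USPNet's implementation; the `max_margin` and `s` defaults and the example class counts are assumptions, since the abstract gives no settings.

```python
import math


def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j^(-1/4), rescaled so the largest equals max_margin."""
    raw = [1.0 / (n ** 0.25) for n in class_counts]
    scale = max_margin / max(raw)
    return [r * scale for r in raw]


def ldam_loss(logits, target, margins, s=1.0):
    """Cross-entropy on logits whose true-class entry is reduced by that class's margin.

    s is an optional logit scale (an assumption here; s=1.0 leaves the logits unscaled).
    """
    adjusted = list(logits)
    adjusted[target] -= margins[target]
    adjusted = [s * z for z in adjusted]
    # Numerically stable log-sum-exp.
    m = max(adjusted)
    log_sum = m + math.log(sum(math.exp(z - m) for z in adjusted))
    return log_sum - adjusted[target]


# Hypothetical two-class problem with a 100:1 imbalance.
margins = ldam_margins([1000, 10])
loss_with_margin = ldam_loss([2.0, 1.0], target=1, margins=margins)
loss_plain = ldam_loss([2.0, 1.0], target=1, margins=[0.0, 0.0])
```

With all margins set to zero the function reduces to ordinary cross-entropy, so the difference between `loss_with_margin` and `loss_plain` is the extra penalty placed on the rare class.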

