A Classification Learning Research based on Discriminative Knowledge-Leverage Transfer

2018 ◽  
Vol 9 (4) ◽  
pp. 52-68 ◽  
Author(s):  
Ding Xiong ◽  
Lu Yan

Current transfer learning models study the source data for future target inferences under a dominant assumption: the whole of the source data should be used to explore the shared knowledge structure. In real-world scenarios, however, the source domain data is collected as a whole, and it is not realistic to assume that all of it is associated with the target domain. This article proposes a generalized empirical risk minimization model (GERM) with discriminative knowledge-leverage (KL). The empirical risk minimization (ERM) principle is extended to the transfer learning setting, and a theoretical upper bound of the generalized ERM is given for practical discriminative transfer learning. The model automatically selects the subset of the source domain data that is actually associated with the target domain, so it remains applicable when only part of the source-domain knowledge is usable, and it avoids the negative transfer caused by leveraging the whole source dataset. Simulation results show that the proposed algorithm outperforms traditional transfer learning algorithms on both simulated and real data sets.
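To make the selective-transfer idea concrete, here is a minimal sketch assuming scikit-learn and integer class labels; `select_source_subset`, `germ_style_fit`, and the confidence-based selection rule are illustrative stand-ins, not the authors' GERM algorithm or its theoretical criterion.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_source_subset(Xs, ys, Xt, yt, keep_ratio=0.5):
    """Keep the source samples that a reference model fitted on the
    (small) target set explains well -- a stand-in for the paper's
    discriminative knowledge-leverage criterion."""
    ref = LogisticRegression(max_iter=1000).fit(Xt, yt)
    # Probability the reference model assigns to each source sample's
    # own label (assumes labels are 0..K-1 integers, all present in yt).
    conf = ref.predict_proba(Xs)[np.arange(len(ys)), ys]
    keep = np.argsort(conf)[-int(keep_ratio * len(ys)):]
    return Xs[keep], ys[keep]

def germ_style_fit(Xs, ys, Xt, yt):
    """Minimize empirical risk on the selected source subset plus the target data."""
    Xs_sel, ys_sel = select_source_subset(Xs, ys, Xt, yt)
    X = np.vstack([Xs_sel, Xt])
    y = np.concatenate([ys_sel, yt])
    return LogisticRegression(max_iter=1000).fit(X, y)
```

Training on only the retained subset plus the target data is what shields the learner from source samples unrelated to the target domain, i.e., from negative transfer.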

2020 ◽  
Vol 10 (21) ◽  
pp. 7768
Author(s):  
Seong Hee Cho ◽  
Seokgoo Kim ◽  
Joo-Ho Choi

In fault diagnosis studies, data deficiency, meaning that the fault data available for training are scarce, is often encountered, and it can greatly degrade diagnosis performance. To address this issue, the transfer learning (TL) approach is employed: a neural network (NN) trained in another (source) domain, where enough fault data are available, is exploited to improve NN performance in the real (target) domain. While there have been similar attempts to apply TL to imbalance problems in the literature, they addressed the sample imbalance between the source and target domains, whereas the present study considers the imbalance between the normal and fault data. To illustrate this, normal and fault datasets are acquired from a linear motion guide, in which the data at high and low speeds represent real operation (target) and maintenance inspection (source), respectively. The effect of data deficiency is studied by reducing the number of fault data in the target domain and comparing the performance of TL, which exploits the knowledge of the source domain, against an ordinary machine learning (ML) approach without it. By examining the accuracy of the fault diagnosis as a function of the imbalance ratio, it is found that the lower bound and interquartile range (IQR) of the accuracy are improved greatly by employing the TL approach. It can therefore be concluded that TL is truly more effective than ordinary ML when there is a large imbalance between the fault and normal data, such as an imbalance ratio smaller than 0.1.
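The pretrain-then-fine-tune form of TL described above can be sketched as follows, assuming PyTorch; the architecture, the random stand-in data, and the choice of which layers to freeze are assumptions for illustration, not the authors' setup.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def make_loader(n, in_dim=32, n_classes=2, batch=16):
    # Random stand-in data; replace with features from the linear motion guide.
    X = torch.randn(n, in_dim)
    y = torch.randint(0, n_classes, (n,))
    return DataLoader(TensorDataset(X, y), batch_size=batch, shuffle=True)

def train(net, loader, epochs, lr):
    # Optimize only the parameters that are not frozen.
    opt = torch.optim.Adam((p for p in net.parameters() if p.requires_grad), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(net(x), y).backward()
            opt.step()
    return net

net = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 2))

# 1) Pretrain on plentiful source-domain (inspection-speed) data.
net = train(net, make_loader(2000), epochs=20, lr=1e-3)
# 2) Freeze the early layers and fine-tune on scarce target-domain
#    (operation-speed) data with a smaller learning rate.
for p in net[:2].parameters():
    p.requires_grad = False
net = train(net, make_loader(100), epochs=20, lr=1e-4)
```

Freezing the early layers preserves the source-domain features while the remaining layers adapt to the few target-domain fault samples.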


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Zhenyu Lu ◽  
Cheng Zheng ◽  
Tingya Yang

Visibility forecasting in offshore areas faces the problems of scarce observational data and complex weather. This paper proposes an intelligent prediction method for offshore visibility based on a temporal convolutional network (TCN) and transfer learning. First, the visibility data sets of the source and target domains are preprocessed to improve data quality. Then, a model based on the temporal convolutional network and transfer learning (TCN_TL) is built to learn the visibility data of the source domain. Finally, after transferring the knowledge learned from the large amount of source-domain data, the model learns the small data set in the target domain. After training, data from the European Centre for Medium-Range Weather Forecasts (ECMWF) meteorological fields were selected to test model performance. The proposed method achieves relatively good results in the visibility forecast of the Qiongzhou Strait. Taking Haikou Station in the spring and winter of 2018 as an example, the forecast error is significantly lower than before transfer learning, and the forecast score is increased by 0.11 for the 0-1 km visibility level within the 24 h forecast period. Compared with the CUACE forecast results, TCN_TL has a smaller forecast error and a TS score higher by 0.16. The results show that, with small data sets, transfer learning improves the prediction performance of the model, and TCN_TL performs better than other deep learning methods and CUACE.
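A minimal dilated causal convolution stack, the standard building block of a TCN, is sketched below in PyTorch; the channel counts, depth, and output head are assumptions, not the paper's TCN_TL architecture. Transfer then amounts to pretraining this network on the source-domain series and fine-tuning it on the small target-domain set.

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """One dilated causal convolution layer: the output at time t depends
    only on inputs at times <= t."""
    def __init__(self, channels, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=self.pad, dilation=dilation)

    def forward(self, x):                 # x: (batch, channels, time)
        out = self.conv(x)
        return out[:, :, :-self.pad]      # trim the future-looking tail

class TinyTCN(nn.Module):
    """A small TCN: stacked causal convolutions with doubling dilation,
    giving an exponentially growing receptive field."""
    def __init__(self, channels=16, levels=4):
        super().__init__()
        self.blocks = nn.Sequential(*[
            nn.Sequential(CausalConv1d(channels, 3, 2 ** i), nn.ReLU())
            for i in range(levels)])
        self.head = nn.Conv1d(channels, 1, 1)  # per-step visibility output

    def forward(self, x):
        return self.head(self.blocks(x))
```

For the transfer step, one would train `TinyTCN` on the source-domain visibility series, then continue training (possibly with early blocks frozen) on the target-domain series, mirroring the TCN_TL procedure described above.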


2016 ◽  
Vol 2016 ◽  
pp. 1-8
Author(s):  
Juan Meng ◽  
Guyu Hu ◽  
Dong Li ◽  
Yanyan Zhang ◽  
Zhisong Pan

Domain adaptation has received much attention as a major form of transfer learning. One issue that must be considered in domain adaptation is the gap between the source domain and the target domain. In order to improve the generalization ability of domain adaptation methods, we propose a framework for domain adaptation that combines source and target data, with a new regularizer that takes generalization bounds into account. This regularization term uses an integral probability metric (IPM) as the distance between the source domain and the target domain and can thus upper-bound the test error of an existing predictor via the generalization bound. Since the computation of the IPM involves only the two distributions, the regularization term is independent of the specific classifier. With popular learning models, the empirical risk minimization is expressed as a general convex optimization problem and can thus be solved effectively by existing tools. Empirical studies on synthetic data for regression and real-world data for classification show the effectiveness of the method.
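As one computable instance of an IPM, the following sketch evaluates the (biased) squared maximum mean discrepancy (MMD) with an RBF kernel between source and target inputs; the paper's exact choice of IPM, kernel, and regularization weight may differ.

```python
import numpy as np

def rbf_mmd2(Xs, Xt, sigma=1.0):
    """Biased squared MMD with an RBF kernel -- one member of the IPM
    family, computable from the two samples alone, independently of any
    classifier."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(Xs, Xs).mean() + k(Xt, Xt).mean() - 2 * k(Xs, Xt).mean()

# Schematic regularized objective: empirical risk on the labeled data plus
# lam times the IPM between the two input distributions, e.g.
#   total_loss = empirical_risk(w, X, y) + lam * rbf_mmd2(Xs, Xt)
```

Because the regularizer depends only on the two input samples, it can be precomputed once and added to the convex empirical-risk objective of any of the popular learning models mentioned above.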


2015 ◽  
Vol 2015 ◽  
pp. 1-8
Author(s):  
Mingchen Yao ◽  
Chao Zhang ◽  
Wei Wu

Many generalization results in learning theory are established under the assumption that samples are independent and identically distributed (i.i.d.). However, numerous learning tasks in practical applications involve time-dependent data. In this paper, we propose a theoretical framework to analyze the generalization performance of the empirical risk minimization (ERM) principle for sequences of time-dependent samples (TDS). In particular, we first present the generalization bound of the ERM principle for TDS. By introducing some auxiliary quantities, we then give a further analysis of the generalization properties and the asymptotic behavior of the ERM principle for TDS.
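Purely to illustrate the shape such results take (the paper's actual bound involves dependence-specific quantities not reproduced here), the ERM principle over a time-ordered sample and a generic bound with an effective sample size might be written as:

```latex
% Schematic only: ERM over time-dependent samples, with the i.i.d. sample
% size replaced by an effective size n_eff that shrinks as dependence grows.
\hat{f} = \arg\min_{f \in \mathcal{F}} \frac{1}{n} \sum_{t=1}^{n} \ell\bigl(f(x_t), y_t\bigr),
\qquad
R(\hat{f}) \le \inf_{f \in \mathcal{F}} R(f)
  + O\!\left(\sqrt{\frac{\operatorname{cap}(\mathcal{F})}{n_{\mathrm{eff}}}}\right).
```

Here cap(F) stands for a capacity measure of the hypothesis class, and n_eff <= n recovers the usual i.i.d. rate when the samples are independent.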


2021 ◽  
Author(s):  
Puyu Wang ◽  
Zhenhuan Yang ◽  
Yunwen Lei ◽  
Yiming Ying ◽  
Hai Zhang

Author(s):  
Zhengling Qi ◽  
Ying Cui ◽  
Yufeng Liu ◽  
Jong-Shi Pang

This paper has two main goals: (a) establish several statistical properties—consistency, asymptotic distributions, and convergence rates—of stationary solutions and values of a class of coupled nonconvex and nonsmooth empirical risk-minimization problems and (b) validate these properties by a noisy amplitude-based phase-retrieval problem, the latter being of much topical interest. Derived from available data via sampling, these empirical risk-minimization problems are the computational workhorse of a population risk model that involves the minimization of an expected value of a random functional. When these minimization problems are nonconvex, the computation of their globally optimal solutions is elusive. Together with the fact that the expectation operator cannot be evaluated for general probability distributions, it becomes necessary to justify whether the stationary solutions of the empirical problems are practical approximations of the stationary solution of the population problem. When these two features, general distribution and nonconvexity, are coupled with nondifferentiability that often renders the problems “non-Clarke regular,” the task of the justification becomes challenging. Our work aims to address such a challenge within an algorithm-free setting. The resulting analysis is, therefore, different from much of the analysis in the recent literature that is based on local search algorithms. Furthermore, supplementing the classical global minimizer-centric analysis, our results offer a promising step to close the gap between computational optimization and asymptotic analysis of coupled, nonconvex, nonsmooth statistical estimation problems, expanding the former with statistical properties of the practically obtained solution and providing the latter with a more practical focus pertaining to computational tractability.
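For concreteness, one standard form of the noisy amplitude-based phase-retrieval ERM, with measurement vectors a_i and observed amplitudes b_i, is shown below; the paper's exact loss may differ in detail.

```latex
% Illustrative amplitude-based phase-retrieval ERM:
\min_{x \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n}
  \bigl( \lvert \langle a_i, x \rangle \rvert - b_i \bigr)^{2}.
```

The inner absolute value makes the objective nonsmooth, and the sign ambiguity of the inner product makes it nonconvex, which is precisely the kind of coupling of nonconvexity and nondifferentiability the abstract describes.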


2016 ◽  
Vol 28 (12) ◽  
pp. 2853-2889 ◽  
Author(s):  
Hanyuan Hang ◽  
Yunlong Feng ◽  
Ingo Steinwart ◽  
Johan A. K. Suykens

This letter investigates the supervised learning problem with observations drawn from certain general stationary stochastic processes. Here by general, we mean that many stationary stochastic processes can be included. We show that when the stochastic processes satisfy a generalized Bernstein-type inequality, a unified treatment on analyzing the learning schemes with various mixing processes can be conducted and a sharp oracle inequality for generic regularized empirical risk minimization schemes can be established. The obtained oracle inequality is then applied to derive convergence rates for several learning schemes such as empirical risk minimization (ERM), least squares support vector machines (LS-SVMs) using given generic kernels, and SVMs using gaussian kernels for both least squares and quantile regression. It turns out that for independent and identically distributed (i.i.d.) processes, our learning rates for ERM recover the optimal rates. For non-i.i.d. processes, including geometrically α-mixing Markov processes, geometrically α-mixing processes with restricted decay, ϕ-mixing processes, and (time-reversed) geometrically C-mixing processes, our learning rates for SVMs with gaussian kernels match, up to some arbitrarily small extra term in the exponent, the optimal rates. For the remaining cases, our rates are at least close to the optimal rates. As a by-product, the assumed generalized Bernstein-type inequality also provides an interpretation of the so-called effective number of observations for various mixing processes.
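Schematically, a generalized Bernstein-type inequality for a bounded function h of the sample takes a form like the following; the constants, the exact exponent, and the definition of n_eff are placeholders, not the paper's statement.

```latex
% Schematic generalized Bernstein-type inequality, for |h| <= M with
% variance proxy sigma^2 (constants c_1, c_2, C are placeholders):
P\left( \left| \frac{1}{n} \sum_{i=1}^{n} h(Z_i) - \mathbb{E}\,h \right|
        \ge \varepsilon \right)
\le C \exp\!\left( - \frac{n_{\mathrm{eff}}\, \varepsilon^{2}}
                          {c_{1}\sigma^{2} + c_{2} M \varepsilon} \right).
```

Here n_eff <= n plays the role of the effective number of observations: it is proportional to n for i.i.d. data and shrinks as the temporal dependence of the process grows, which is the interpretation mentioned at the end of the abstract.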

