Transfer Learning for Wireless Fingerprinting Localization Based on Optimal Transport

Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6994
Author(s):  
Siqi Bai ◽  
Yongjie Luo ◽  
Qun Wan

Wireless fingerprinting localization (FL) systems identify locations by building radio fingerprint maps, aiming to provide satisfactory location solutions in complex environments. However, the radio map changes easily, and the cost of building a new one is high. One research focus is therefore to transfer knowledge from old radio maps to a new one. Feature-based transfer learning methods help by mapping the source fingerprint and the target fingerprint to a common hidden domain and then minimizing the maximum mean discrepancy (MMD) distance between the empirical distributions in the latent domain. In this paper, optimal transport (OT)-based transfer learning is adopted to directly map the fingerprint from the source domain to the target domain by minimizing the Wasserstein distance, so that the data distributions of the two domains can be better matched and the positioning performance in the target domain is improved. Two channel models are used to simulate the transfer scenarios, and a test on public measured data further verifies that OT-based transfer learning has better accuracy and performance when the radio map changes in FL, indicating the importance of the method in this field.
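
As a rough illustration of the OT idea described above, the following minimal numpy sketch runs entropy-regularized (Sinkhorn) transport between synthetic source and target fingerprints and applies a barycentric mapping; it is not the authors' implementation, and all data and parameters here are placeholder choices:

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.05, n_iter=500):
    """Entropy-regularized optimal transport (Sinkhorn iterations).
    Returns a transport plan whose marginals approximate a and b."""
    K = np.exp(-C / (reg * C.max()))      # scale cost for numerical stability
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(50, 4))   # "old radio map" fingerprints
Xt = rng.normal(2.0, 1.0, size=(60, 4))   # "new radio map" fingerprints (shifted)

C = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1)  # squared Euclidean cost
a, b = np.full(50, 1 / 50), np.full(60, 1 / 60)
P = sinkhorn(a, b, C)

# Barycentric mapping: transport each source fingerprint into the target domain
Xs_mapped = (P / P.sum(1, keepdims=True)) @ Xt
```

A localization model fitted on the mapped source fingerprints then sees data whose distribution matches the target domain more closely.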

Symmetry ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 1994
Author(s):  
Ping Li ◽  
Zhiwei Ni ◽  
Xuhui Zhu ◽  
Juan Song ◽  
Wenying Wu

Domain adaptation aims to learn a robust classifier for the target domain using the source domain, even though the two often follow different distributions. To bridge the distribution shift between the two domains, most previous works align their feature distributions through feature transformation; among these, optimal transport for domain adaptation has attracted researchers' interest, as it can exploit the local information of the two domains when mapping the source instances to the target ones by minimizing the Wasserstein distance between their feature distributions. However, it may weaken the feature discriminability of the source domain and thus degrade domain adaptation performance. To address this problem, this paper proposes a two-stage feature-based adaptation approach, referred to as optimal transport with dimensionality reduction (OTDR). In the first stage, we apply dimensionality reduction with intradomain variance maximization but source intraclass compactness minimization, to separate data samples as much as possible and enhance the feature discriminability of the source domain. In the second stage, we leverage an optimal transport-based technique to preserve the local information of the two domains. Notably, the desirable properties of the first stage mitigate the degradation of the source domain's feature discriminability in the second stage. Extensive experiments on several cross-domain image datasets validate that OTDR is superior to its competitors in classification accuracy.
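
The first-stage objective (maximize intradomain variance while compacting source classes) can be pictured as a scatter-matrix eigenproblem. The sketch below is our illustrative reading only: the scatter definitions, the trade-off parameter beta, and the function name are assumptions, not the paper's formulation:

```python
import numpy as np

def stage1_projection(Xs, ys, Xt, dim=2, beta=1.0):
    """Illustrative first stage: directions that maximize total (both-domain)
    variance while compacting each source class, via a symmetric eigenproblem."""
    X = np.vstack([Xs, Xt])
    Xc = X - X.mean(0)
    St = Xc.T @ Xc / len(X)                 # intradomain (total) scatter
    Sw = np.zeros_like(St)                  # source intraclass scatter
    for c in np.unique(ys):
        Xci = Xs[ys == c] - Xs[ys == c].mean(0)
        Sw += Xci.T @ Xci / len(Xs)
    evals, evecs = np.linalg.eigh(St - beta * Sw)
    W = evecs[:, np.argsort(evals)[::-1][:dim]]   # top-`dim` directions
    return Xs @ W, Xt @ W

rng = np.random.default_rng(1)
Xs = rng.normal(size=(40, 10)); ys = rng.integers(0, 2, size=40)
Xt = rng.normal(size=(30, 10))
Zs, Zt = stage1_projection(Xs, ys, Xt, dim=2)
```

The second OTDR stage would then run optimal transport between Zs and Zt in the reduced space.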


2020 ◽  
Vol 52 (1) ◽  
pp. 61-101
Author(s):  
Daniel Lacker

Abstract This work is devoted to a vast extension of Sanov’s theorem, in Laplace principle form, based on alternatives to the classical convex dual pair of relative entropy and cumulant generating functional. The abstract results give rise to a number of probabilistic limit theorems and asymptotics. For instance, widely applicable non-exponential large deviation upper bounds are derived for empirical distributions and averages of independent and identically distributed samples under minimal integrability assumptions, notably accommodating heavy-tailed distributions. Other interesting manifestations of the abstract results include new results on the rate of convergence of empirical measures in Wasserstein distance, uniform large deviation bounds, and variational problems involving optimal transport costs, as well as an application to error estimates for approximate solutions of stochastic optimization problems. The proofs build on the Dupuis–Ellis weak convergence approach to large deviations as well as the duality theory for convex risk measures.
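
For context, the classical Sanov theorem that this work extends can be stated in Laplace-principle form as follows (a standard textbook formulation, not quoted from the paper):

```latex
% Sanov's theorem in Laplace principle form: for i.i.d. samples X_1,...,X_n
% with common law \mu on a Polish space E, the empirical measure
% L_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i} satisfies, for every bounded
% continuous F on the space of probability measures \mathcal{P}(E),
\lim_{n\to\infty} \frac{1}{n}\log \mathbb{E}\!\left[e^{\,n F(L_n)}\right]
  = \sup_{\nu \in \mathcal{P}(E)} \bigl\{ F(\nu) - H(\nu \,\|\, \mu) \bigr\},
% where the relative entropy H(\nu\|\mu) and the cumulant generating
% functional form the classical convex dual pair that the paper replaces.
```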


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Gang Xiang ◽  
Kun Tian

In recent years, deep learning methods, which promote the accuracy and efficiency of fault diagnosis tasks without any extra requirement for artificial feature extraction, have elicited the attention of researchers in the manufacturing industry as well as aerospace. However, two problems significantly deteriorate the performance and generalization of deep fault diagnosis models: data in the source and target domains usually follow different probability distributions because of different working conditions, and labeled (or even unlabeled) data in the target domain are insufficient. To address these problems, we propose a novel Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP)-based deep adversarial transfer learning (WDATL) model in this study, which exploits a domain critic to learn domain-invariant feature representations by minimizing the Wasserstein distance between the source and target feature distributions through adversarial training. Moreover, an improved one-dimensional convolutional neural network (CNN)-based feature extractor, which utilizes exponential linear units (ELU) as activation functions and wide kernels, is designed to automatically extract the latent features of raw time-series input data. The fault classifier trained in one working condition (source domain) with sufficient labeled samples can then be generalized to diagnose data from other working conditions (target domain) with insufficient labeled samples. Experiments on two open datasets demonstrate that our proposed WDATL model outperforms most state-of-the-art approaches on transfer diagnosis tasks under diverse working circumstances.
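
The domain critic described here is typically trained with the standard WGAN-GP objective; one common form, with source/target feature distributions substituted (our notation, not necessarily the paper's), is:

```latex
% D scores feature representations; P_s and P_t are the source and target
% feature distributions; \hat{x} is sampled uniformly along straight lines
% between source and target features; \lambda weights the gradient penalty.
\mathcal{L}_{\text{critic}} =
  \mathbb{E}_{x \sim P_t}\bigl[D(x)\bigr] - \mathbb{E}_{x \sim P_s}\bigl[D(x)\bigr]
  + \lambda\, \mathbb{E}_{\hat{x}}\!\left[
      \bigl( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \bigr)^2
    \right]
% Minimizing this over D (and maximizing the first two terms over the feature
% extractor) drives the Wasserstein distance between P_s and P_t down.
```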


Author(s):  
Nam LeTien ◽  
Amaury Habrard ◽  
Marc Sebban

Optimal transport has received much attention during the past few years to deal with domain adaptation tasks. The goal is to transfer knowledge from a source domain to a target domain by finding a transportation of minimal cost moving the source distribution to the target one. In this paper, we address the challenging task of privacy-preserving domain adaptation by optimal transport. Using the Johnson-Lindenstrauss transform together with some noise, we present the first differentially private optimal transport model and show how it can be directly applied in both unsupervised and semi-supervised domain adaptation scenarios. Our theoretically grounded method allows the optimization of the transportation plan and the Wasserstein distance between the two distributions while protecting the data of both domains. We perform an extensive series of experiments on various benchmarks (VisDA, Office-Home and Office-Caltech datasets) that demonstrate the efficiency of our method compared to non-private strategies.
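
The Johnson-Lindenstrauss step with additive noise can be sketched in a few lines of numpy. The dimensions, the Gaussian projection, and the noise scale below are our placeholder choices for illustration; this is not a calibrated differential-privacy mechanism:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 200, 100, 20                     # samples, ambient dim, projected dim

X = rng.normal(size=(n, d))                # private data
R = rng.normal(size=(d, k)) / np.sqrt(k)   # Gaussian JL projection
sigma = 0.1                                # placeholder noise scale
X_rel = X @ R + sigma * rng.normal(size=(n, k))   # released representation

# JL preserves squared norms (hence pairwise distances) in expectation,
# so OT costs computed on X_rel approximate those on the original X.
ratio = (X_rel ** 2).sum() / (X ** 2).sum()
```

Transport plans and Wasserstein distances are then computed on the released low-dimensional representation rather than on the raw data.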


Author(s):  
Clayton Cooper ◽  
Dongdong Liu ◽  
Jianjing Zhang ◽  
Robert X. Gao

Abstract Machine learning has demonstrated its effectiveness in fault recognition for mechanical systems. However, sufficient data for establishing accurate and reliable fault detection methods are not always available in real-world applications. Transfer learning leverages the knowledge learned from a source domain in order to bypass limitations in data availability and facilitate effective analysis in a target domain. For mechanical fault recognition, existing transfer learning methods mainly focus on transferring knowledge between different operating conditions, which requires training samples corresponding to all desired fault conditions from the target domain in order to realize domain adaptation. However, faulted data in real applications are usually unavailable and impractical to collect. In this paper, a transfer learning-based cross-machine bearing fault recognition method is investigated. This new method performs domain adaptation without faulted data being available in the target domain, and thus alleviates data availability limitations. The effectiveness of the method is demonstrated in a case study in which the bearing diagnostic method is transferred from an electric motor to a wind turbine.


2018 ◽  
Vol 25 (1) ◽  
pp. 55-66 ◽  
Author(s):  
Nelson Feyeux ◽  
Arthur Vidard ◽  
Maëlle Nodet

Abstract. Data assimilation methods usually evaluate observation-model misfits using weighted L2 distances. However, these are not well suited when observed features are present in the model with position error. In this context, the Wasserstein distance stemming from optimal transport theory is more relevant. This paper proposes the adaptation of variational data assimilation to the use of such a measure. It provides a short introduction to optimal transport theory and discusses the importance of a proper choice of scalar product to compute the cost function gradient. It also extends the discussion to the way the descent is performed within the minimization process. These algorithmic changes are tested on a nonlinear shallow-water model, leading to the conclusion that optimal transport-based data assimilation seems promising for capturing position errors in the model trajectory.
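
The position-error argument can be seen in one dimension: for two identical features separated by a position error, the L2 misfit saturates once they stop overlapping, while the Wasserstein distance keeps growing with the error. A self-contained numpy sketch with synthetic Gaussian "features" (W1 computed via the 1-D CDF formula, not the paper's variational scheme):

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]

def bump(center, width=0.5):
    u = np.exp(-0.5 * ((x - center) / width) ** 2)
    return u / (u.sum() * dx)              # normalize to a density on the grid

def wasserstein1(u, v):
    """1-D W1 distance: the integral of |CDF_u - CDF_v| over the grid."""
    return np.sum(np.abs(np.cumsum(u - v) * dx)) * dx

def l2_misfit(u, v):
    return np.sqrt(np.sum((u - v) ** 2) * dx)

# W1 grows linearly with the position error of the feature,
# while the L2 misfit saturates once the bumps no longer overlap.
w_small, w_large = wasserstein1(bump(0), bump(2)), wasserstein1(bump(0), bump(8))
l_small, l_large = l2_misfit(bump(0), bump(2)), l2_misfit(bump(0), bump(8))
```

Here w_large is about four times w_small (matching the fourfold position error), whereas l_large is nearly indistinguishable from l_small.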


Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1148
Author(s):  
Ilok Jung ◽  
Jongin Lim ◽  
Huykang Kim

The number of studies applying machine learning to cyber security has increased over the past few years. These studies, however, face difficulties in real-world use, mainly due to the lack of training data and the limited reusability of created models. While transfer learning seems like a solution to these problems, the number of studies in the field of intrusion detection is still insufficient. Therefore, this study proposes payload feature-based transfer learning as a solution to the lack of training data when applying machine learning to intrusion detection, by using knowledge from an already known domain. Firstly, it expands the range of extracted information from the header to the payload, delivering that information accurately through an effective hybrid feature extraction method. Secondly, this study provides an improved optimization method for the extracted features to create a labeled dataset for a target domain. This proposal was validated on publicly available datasets, using three distinctive scenarios, and the results confirmed its usability in practice: training data created by transfer learning increased accuracy by 30% compared to the non-transfer learning method. In addition, we showed that this approach can help in identifying previously unknown attacks and in reusing models from different domains.


2021 ◽  
Author(s):  
Alex Matskevych ◽  
Adrian Wolny ◽  
Constantin Pape ◽  
Anna Kreshuk

The remarkable performance of Convolutional Neural Networks on image segmentation tasks comes at the cost of a large amount of pixelwise annotated images that have to be segmented for training. In contrast, feature-based learning methods, such as the Random Forest, require little training data, but never reach the segmentation accuracy of CNNs. This work bridges the two approaches in a transfer learning setting. We show that a CNN can be trained to correct the errors of the Random Forest in the source domain and then be applied to correct such errors in the target domain without retraining, as the domain shift between the Random Forest predictions is much smaller than between the raw data. By leveraging a few brushstrokes as annotations in the target domain, the method can deliver segmentations that are sufficiently accurate to act as pseudo-labels for target-domain CNN training. We demonstrate the performance of the method on several datasets with the challenging tasks of mitochondria, membrane and nuclear segmentation. It yields excellent performance compared to microscopy domain adaptation baselines, especially when a significant domain shift is involved.


Biometrika ◽  
2021 ◽  
Author(s):  
Yichao Li ◽  
Wenshuo Wang ◽  
Ke Deng ◽  
Jun S Liu

Abstract Sequential Monte Carlo algorithms have been widely accepted as a powerful computational tool for making inference with dynamical systems. A key step in sequential Monte Carlo is resampling, which plays a role of steering the algorithm towards the future dynamics. Several strategies have been used in practice, including multinomial resampling, residual resampling, optimal resampling, stratified resampling, and optimal transport resampling. In the one-dimensional case, we show that optimal transport resampling is equivalent to stratified resampling on the sorted particles, and they both minimize the resampling variance as well as the expected squared energy distance between the original and resampled empirical distributions. In general d-dimensional cases, if the particles are first sorted using the Hilbert curve, we show that the variance of stratified resampling is O(m^{-(1+2/d)}), an improved rate compared to the previously known best rate O(m^{-(1+1/d)}), where m is the number of resampled particles. We show this improved rate is optimal for ordered stratified resampling schemes, as conjectured in Gerber et al. (2019). We also present an almost sure bound on the Wasserstein distance between the original and Hilbert-curve-resampled empirical distributions. In light of these results, we show that, for dimension d > 1, the mean square error of sequential quasi-Monte Carlo with n particles can be O(n^{-1-4/(d(d+4))}) if Hilbert curve resampling is used and a specific low-discrepancy set is chosen. To our knowledge, this is the first known convergence rate lower than o(n^{-1}).
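
The one-dimensional scheme discussed here (stratified resampling on sorted particles, shown to coincide with optimal transport resampling) can be tried directly; below is a minimal numpy sketch on a synthetic weighted particle system:

```python
import numpy as np

def stratified_resample(x, w, m, rng):
    """Stratified resampling on sorted particles: one uniform draw per
    stratum, u_i = (i + U_i)/m, inverted through the weight CDF."""
    order = np.argsort(x)                  # sort the particles first (1-D case)
    x, w = x[order], w[order]
    cdf = np.cumsum(w)
    cdf[-1] = 1.0                          # guard against rounding error
    u = (np.arange(m) + rng.uniform(size=m)) / m
    return x[np.searchsorted(cdf, u)]

rng = np.random.default_rng(0)
x = rng.normal(size=1000)                  # particle locations
w = rng.uniform(size=1000); w /= w.sum()   # normalized importance weights
xr = stratified_resample(x, w, m=500, rng=rng)
```

Because each stratum contributes exactly one draw, the resampled set tracks the weighted distribution far more tightly than multinomial resampling would.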


2008 ◽  
Vol 104 (11/12) ◽  
Author(s):  
D.R. Walwyn

Despite the importance of labour and overhead costs to both funders and performers of research in South Africa, there is little published information on the remuneration structures for researchers, technicians and research support staff. Moreover, there are widely different pricing practices and perceptions within the public research and higher education institutions, which in some cases do not reflect the underlying costs to the institution or the inherent value of the research. In this article, data from the 2004/5 Research and Development Survey have been used to generate comparative information on the cost of research in various performance sectors. It is shown that this cost is lowest in the higher education institutions and highest in the business sector, although the differences in direct labour and overheads are not as large as might have been expected. The calculated cost of research is then compared with the gazetted rates for engineers, scientists and auditors performing work on behalf of the public sector, which in all cases are higher than those in the research sector. This analysis emphasizes the need within the public research and higher education institutions for the development of a common pricing policy and for an annual salary survey, in order to dispel some of the myths around the relative costs of research, the relative levels of overhead ratios and the apparent disparity in remuneration levels.

