target domain
Recently Published Documents


Total documents: 642 (five years: 530)

H-index: 22 (five years: 11)

2022 ◽  
Vol 16 (4) ◽  
pp. 1-25
Author(s):  
Hanrui Wu ◽  
Michael K. Ng

Multi-source domain adaptation is a challenging topic in transfer learning, especially when the data of each domain are represented by different kinds of features, i.e., Multi-source Heterogeneous Domain Adaptation (MHDA). It is important to take advantage of the knowledge extracted from multiple sources as well as bridge the heterogeneous spaces for handling the MHDA paradigm. This article proposes a novel method named Multiple Graphs and Low-rank Embedding (MGLE), which models the local structure information of multiple domains using multiple graphs and learns the low-rank embedding of the target domain. Then, MGLE augments the learned embedding with the original target data. Specifically, we introduce the modules of both domain discrepancy and domain relevance into the multiple graphs and low-rank embedding learning procedure. Subsequently, we develop an iterative optimization algorithm to solve the resulting problem. We evaluate the effectiveness of the proposed method on several real-world datasets. Promising results show that the performance of MGLE is better than that of the baseline methods in terms of several metrics, such as AUC, MAE, accuracy, precision, F1 score, and MCC, demonstrating the effectiveness of the proposed method.


2022 ◽  
Vol 16 (4) ◽  
pp. 1-30
Author(s):  
Muhammad Abulaish ◽  
Mohd Fazil ◽  
Mohammed J. Zaki

Domain-specific keyword extraction is a vital task in the field of text mining. There are various research tasks, such as spam e-mail classification, abusive language detection, sentiment analysis, and emotion mining, where a set of domain-specific keywords (aka lexicon) is highly effective. Existing works for keyword extraction list all keywords rather than domain-specific keywords from a document corpus. Moreover, most of the existing approaches perform well on formal document corpuses but fail on noisy and informal user-generated content in online social media. In this article, we present a hybrid approach by jointly modeling the local and global contextual semantics of words, utilizing the strength of distributional word representation and contrasting-domain corpus for domain-specific keyword extraction. Starting with a seed set of a few domain-specific keywords, we model the text corpus as a weighted word-graph. In this graph, the initial weight of a node (word) represents its semantic association with the target domain calculated as a linear combination of three semantic association metrics, and the weight of an edge connecting a pair of nodes represents the co-occurrence count of the respective words. Thereafter, a modified PageRank method is applied to the word-graph to identify the most relevant words for expanding the initial set of domain-specific keywords. We evaluate our method over both formal and informal text corpuses (comprising six datasets), and show that it performs significantly better in comparison to state-of-the-art methods. Furthermore, we generalize our approach to handle the language-agnostic case, and show that it outperforms existing language-agnostic approaches.
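
The word-graph ranking described in this abstract can be illustrated with a minimal sketch. It is not the authors' modified PageRank, whose exact update rule is not given here. It assumes an undirected graph, uses the normalized node weights as the teleport (personalization) distribution, and uses co-occurrence counts as edge weights. The function names, the damping factor of 0.85, and the toy words are hypothetical.

```python
def weighted_pagerank(edges, node_prior, damping=0.85, iters=100, tol=1e-10):
    """Personalized PageRank on an undirected weighted word-graph.

    edges: dict mapping (word_u, word_v) -> co-occurrence count.
    node_prior: dict mapping word -> non-negative semantic-association weight.
    Every word that appears in `edges` must also appear in `node_prior`.
    """
    nodes = sorted(node_prior)
    # Symmetric adjacency: each undirected edge is stored in both directions.
    adj = {n: {} for n in nodes}
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w

    total = sum(node_prior.values())
    prior = {n: node_prior[n] / total for n in nodes}  # teleport distribution
    strength = {n: sum(adj[n].values()) for n in nodes}  # weighted degree

    score = dict(prior)
    for _ in range(iters):
        # Score held by isolated words is redistributed according to the prior.
        dangling = sum(score[n] for n in nodes if strength[n] == 0)
        new = {}
        for n in nodes:
            incoming = sum(score[m] * adj[m][n] / strength[m] for m in adj[n])
            new[n] = (1 - damping) * prior[n] + damping * (incoming + dangling * prior[n])
        delta = sum(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


def expand_keywords(score, seeds, k):
    """Return the k highest-scoring words that are not already seed keywords."""
    candidates = [w for w in score if w not in seeds]
    return sorted(candidates, key=lambda w: score[w], reverse=True)[:k]


# Toy example with made-up words and counts (illustration only).
edges = {("spam", "offer"): 3, ("offer", "free"): 2, ("free", "hello"): 1}
prior = {"spam": 1.0, "offer": 0.5, "free": 0.5, "hello": 0.1}
scores = weighted_pagerank(edges, prior)
new_keywords = expand_keywords(scores, seeds={"spam"}, k=2)
```

Using the node weights as the teleport distribution is one simple way to bias the walk toward the target domain; the paper's own modification may differ.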


2022 ◽  
Vol 13 (1) ◽  
pp. 1-14
Author(s):  
Shuteng Niu ◽  
Yushan Jiang ◽  
Bowen Chen ◽  
Jian Wang ◽  
Yongxin Liu ◽  
...  

In the past decades, the volume of information from all kinds of data has increased rapidly. With state-of-the-art performance, machine learning algorithms have been beneficial for information management. However, insufficient supervised training data is still an obstacle in many real-world applications. Therefore, transfer learning (TL) was proposed to address this issue. This article studies an important but not well investigated TL problem termed cross-modality transfer learning (CMTL). This topic is closely related to distant domain transfer learning (DDTL) and negative transfer. In general, conventional TL disciplines assume that the source domain and the target domain are in the same modality. DDTL aims to make efficient transfers even when the domains or the tasks are entirely different. As an extension of DDTL, CMTL aims to make efficient transfers between two different data modalities, such as from image to text. As the main focus of this study, we aim to improve the performance of image classification by transferring knowledge from text data. Previously, a few CMTL algorithms were proposed to deal with image classification problems. However, most existing algorithms are very task specific, and they are unstable in convergence. There are four main contributions in this study. First, we propose a novel heterogeneous CMTL algorithm, which requires only a tiny set of unlabeled target data and labeled source data with associated text tags. Second, we introduce a latent semantic information extraction method to connect the information learned from the image data and the text data. Third, the proposed method can effectively handle the information transfer across different modalities (text-image). Fourth, we evaluated our algorithm on a public dataset, Office-31. It achieved up to 5% higher classification accuracy than “non-transfer” algorithms and up to 9% higher than existing CMTL algorithms.


2022 ◽  
Vol 40 (1) ◽  
pp. 1-29
Author(s):  
Hanrui Wu ◽  
Qingyao Wu ◽  
Michael K. Ng

Domain adaptation aims at improving the performance of learning tasks in a target domain by leveraging the knowledge extracted from a source domain. To this end, one can perform knowledge transfer between these two domains. However, this problem becomes extremely challenging when the data of these two domains are characterized by different types of features, i.e., the feature spaces of the source and target domains are different, which is referred to as heterogeneous domain adaptation (HDA). To solve this problem, we propose a novel model called Knowledge Preserving and Distribution Alignment (KPDA), which learns an augmented target space by jointly minimizing information loss and maximizing domain distribution alignment. Specifically, we seek to discover a latent space, where the knowledge is preserved by exploiting the Laplacian graph terms and reconstruction regularizations. Moreover, we adopt the Maximum Mean Discrepancy to align the distributions of the source and target domains in the latent space. Mathematically, KPDA is formulated as a minimization problem with orthogonal constraints, which involves two projection variables. Then, we develop an algorithm based on the Gauss–Seidel iteration scheme and split the problem into two subproblems, which are solved by searching algorithms based on the Barzilai–Borwein (BB) stepsize. Promising results demonstrate the effectiveness of the proposed method.
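
For readers unfamiliar with the Maximum Mean Discrepancy mentioned in this abstract, the following is a minimal sketch of the standard biased empirical estimate of squared MMD between two samples. It is not the KPDA objective: the projections into the latent space are omitted, and the RBF kernel, the `gamma` value, and the function names are assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd2(source, target, gamma=1.0):
    """Biased empirical estimate of squared MMD between two samples.

    MMD^2 = mean k(s, s') + mean k(t, t') - 2 * mean k(s, t),
    where the means run over all pairs, including identical ones.
    """
    def mean_kernel(a, b):
        return sum(rbf_kernel(p, q, gamma) for p in a for q in b) / (len(a) * len(b))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Toy latent representations (made-up numbers): the target sample is the
# source sample shifted by 3 in every coordinate.
latent_source = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
latent_target = [(x + 3.0, y + 3.0) for x, y in latent_source]

same = mmd2(latent_source, latent_source)     # identical samples
shifted = mmd2(latent_source, latent_target)  # clearly separated samples
```

A value near zero indicates that the two samples are hard to distinguish under the chosen kernel, which is the quantity a distribution-alignment term drives down.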


Author(s):  
Huiping Guo ◽  
Hongru Li

Decomposition hybrid algorithms with a recursive framework, which recursively decompose the structure-learning task into subtasks to reduce computational complexity, are employed to learn Bayesian network (BN) structure. Merging rules are commonly adopted as the combination method in the combination step. The direction determination rule of the merging rules relies on keeping v-structures unchanged before and after combination to determine the directions of edges in the whole structure, which is problematic. It breaks down in one case due to the appearance of wrong v-structures, and it is hard to operate in practice. Therefore, we adopt a novel approach for direction determination and propose a two-stage combination method. In the first-stage combination method, we determine the nodes and links of edges by merging rules and adopt the idea of permutation and combination to determine the directions of contradictory edges. In the second-stage combination method, we restrict edges between nodes that do not satisfy the decomposition property and their parent nodes by determining the target domain according to the decomposition property. Simulation experiments on four networks show that the proposed algorithm can obtain BN structure with higher accuracy compared with other algorithms. Finally, the proposed algorithm is applied to the thickening process of gold hydrometallurgy to solve a practical problem.


2022 ◽  
Vol 4 (1) ◽  
pp. 22-41
Author(s):  
Nermeen Abou Baker ◽  
Nico Zengeler ◽  
Uwe Handmann

Transfer learning is a machine learning technique that uses previously acquired knowledge from a source domain to enhance learning in a target domain by reusing learned weights. This technique is ubiquitous because of its great advantages in achieving high performance while saving training time, memory, and effort in network design. In this paper, we investigate how to select the best pre-trained model that meets the target domain requirements for image classification tasks. In our study, we refined the output layers and general network parameters to apply the knowledge of eleven image processing models, pre-trained on ImageNet, to five different target domain datasets. We measured the accuracy, accuracy density, training time, and model size to evaluate the pre-trained models both in training sessions in one episode and with ten episodes.


Semantic Web ◽  
2022 ◽  
pp. 1-8
Author(s):  
Robert Forkel ◽  
Harald Hammarström

Glottocodes constitute the backbone identification system for the language, dialect and family inventory Glottolog (https://glottolog.org). In this paper, we summarize the motivation and history behind the system of glottocodes and describe the principles and practices of data curation, technical infrastructure and update/version-tracking systematics. Since our understanding of the target domain – the dialects, languages and language families of the entire world – is continually evolving, changes and updates are relatively common. The resulting data is assessed in terms of the FAIR (Findable, Accessible, Interoperable, Reusable) Guiding Principles for scientific data management and stewardship. As such the glottocode-system responds to an important challenge in the realm of Linguistic Linked Data with numerous NLP applications.


Author(s):  
Ziliang Cai ◽  
Lingyue Wang ◽  
Miaomiao Guo ◽  
Guizhi Xu ◽  
Lei Guo ◽  
...  

Emotion plays a significant role in human daily activities, and it can be effectively recognized from EEG signals. However, individual variability limits the generalization of emotion classifiers across subjects. Domain adaptation (DA) is a reliable method to solve the issue. Due to the nonstationarity of EEG, the inferior-quality source domain data bring negative transfer in DA procedures. To solve this problem, an auto-augmentation joint distribution adaptation (AA-JDA) method and a burden-lightened and source-preferred JDA (BLSP-JDA) approach are proposed in this paper. The methods are based on a novel transfer idea, learning the specific knowledge of the target domain from the samples that are appropriate for transfer, which reduces the difficulty of transfer between two domains. On multiple emotion databases, our model shows state-of-the-art performance.


2022 ◽  
Author(s):  
Erqiang Deng ◽  
Zhiguang Qin ◽  
Dajiang Chen ◽  
Zhen Qin ◽  
Yi Ding ◽  
...  

Deep learning has been widely used in medical image segmentation, although the accuracy is affected by the problems of small sample space, data imbalance, and cross-device differences. To address these issues, an enhancement GAN network is proposed that uses the domain transferring of the adversarial generation network to enhance the original medical images. Specifically, while retaining the transferability of the original GAN network, a new optimizer is added to generate a sample space with a continuous distribution, which can be used as the target domain of the original image transferring. The optimizer back-propagates the labels of the supervised data set through the segmentation network and maps the discrete distribution of the labels to the continuous image distribution, which has a high similarity to the original image but improves the segmentation efficiency. On this basis, the optimized distribution is taken as the target domain, and the generator and discriminator of the GAN network are trained so that the generator can transfer the original image distribution to the target distribution. Extensive experiments are conducted based on MRI, CT, and ultrasound data sets. The experimental results show that the proposed method has a good generalization effect in medical image segmentation, even when the data set has limited sample space and data imbalance to a certain extent.


2022 ◽  
Vol 35 (1) ◽  
Author(s):  
Yunhong Che ◽  
Zhongwei Deng ◽  
Xiaolin Tang ◽  
Xianke Lin ◽  
Xianghong Nie ◽  
...  

Aging diagnosis of batteries is essential to ensure that energy storage systems operate within a safe region. This paper proposes a novel cell-to-pack health and lifetime prognostics method based on the combination of transferred deep learning and Gaussian process regression. General health indicators are extracted from the partial discharge process. The sequential degradation model of the health indicator is developed based on a deep learning framework and is migrated for the battery pack degradation prediction. The future degraded capacities of both the battery pack and each battery cell are probabilistically predicted to provide a comprehensive lifetime prognostic. Moreover, only a few separate battery cells in the source domain and early data of battery packs in the target domain are needed for model construction. Experimental results show that the lifetime prediction errors are less than 25 cycles for the battery pack, even with only 50 cycles for model fine-tuning, which can save about 90% of the time for the aging experiment. Thus, it largely reduces the time and labor for battery pack investigation. The predicted capacity trends of the battery cells connected in the battery pack accurately reflect the actual degradation of each battery cell, which can reveal the weakest cell for maintenance in advance.

