domain transfer
Recently Published Documents





Shu Jiang, Zuchao Li, Hai Zhao, Bao-Liang Lu, Rui Wang

In recent years, research on dependency parsing has focused on improving accuracy on domain-specific (in-domain) test datasets and has made remarkable progress. However, innumerable real-world scenarios are not covered by these datasets; such data is out-of-domain. As a result, parsers that perform well on in-domain data usually suffer significant performance degradation on out-of-domain data. Therefore, to adapt existing high-performance in-domain parsers to a new domain, cross-domain transfer learning methods are essential. This paper examines two scenarios for cross-domain transfer learning: semi-supervised and unsupervised. Specifically, we adopt the pre-trained language model BERT, trained on source-domain (in-domain) data at the subword level, and introduce self-training methods adapted from tri-training for these two scenarios. The evaluation results on the NLPCC-2019 shared task and the universal dependency parsing task indicate the effectiveness of the adopted approaches for cross-domain transfer learning and show the potential of self-learning for cross-lingual transfer learning.
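For readers unfamiliar with tri-training, the following is a minimal sketch of its classic selection step: an unlabeled example becomes a pseudo-labeled training example for one model when the other two models agree on its label. This illustrates the general technique only, not the variants used in the paper above; the function name `tri_training_pseudo_labels` and the toy dependency labels are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def tri_training_pseudo_labels(
    predictions: Sequence[Sequence[str]],
) -> Dict[int, List[Tuple[int, str]]]:
    """Classic tri-training selection step (illustrative sketch).

    predictions[m][i] is the label that model m assigns to unlabeled
    example i. For each target model, return the (example_index, label)
    pairs on which the OTHER two models agree; those pairs would be added
    to the target model's training data in the next round.
    """
    if len(predictions) != 3:
        raise ValueError("tri-training expects exactly three models")
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("all models must label the same examples")

    selected: Dict[int, List[Tuple[int, str]]] = {0: [], 1: [], 2: []}
    for target in range(3):
        # The two "teacher" models are the ones that are not the target.
        first, second = [m for m in range(3) if m != target]
        for i in range(n_examples):
            if predictions[first][i] == predictions[second][i]:
                selected[target].append((i, predictions[first][i]))
    return selected


# Toy usage: three models label three unlabeled tokens (hypothetical data).
toy_predictions = [
    ["nsubj", "obj", "det"],
    ["nsubj", "amod", "det"],
    ["nsubj", "obj", "nmod"],
]
pseudo = tri_training_pseudo_labels(toy_predictions)
```

In a full training loop, each model would be retrained on its selected pairs and the process repeated until the pseudo-labeled sets stop changing.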

Arkadipta De, Dibyanayan Bandyopadhyay, Baban Gain, Asif Ekbal

Fake news classification is a problem that has attracted considerable attention from researchers in artificial intelligence, natural language processing, and machine learning (ML). Most current work on fake news detection is in English, which limits its usability, especially outside the English-literate population. Although multilingual web content has grown, fake news classification in low-resource languages remains a challenge because annotated corpora and tools are unavailable. This article proposes an effective neural model based on multilingual Bidirectional Encoder Representations from Transformers (BERT) for domain-agnostic multilingual fake news classification. A wide variety of experiments, including language-specific and domain-specific settings, are conducted. The proposed model achieves high accuracy in domain-specific and domain-agnostic experiments, and it also outperforms the current state-of-the-art models. We perform experiments in zero-shot settings to assess the effectiveness of language-agnostic feature transfer across different languages, showing encouraging results. Cross-domain transfer experiments are also performed to assess language-independent feature transfer of the model. We also offer a multilingual multidomain fake news detection dataset covering five languages and seven different domains that could be useful for research and development in resource-scarce scenarios.

2022, Vol 213, pp. 105270
Hermann Bulf, Chiara Capparini, Elena Nava, Maria Dolores de Hevia, Viola Macchi Cassia

2021
Tomochika Fujisawa, Victor Noguerales, Emmanouil Meramveliotakis, Anna Papadopoulou, Alfried P Vogler

Complex bulk samples of invertebrates from biodiversity surveys present a great challenge for taxonomic identification, especially if obtained from unexplored ecosystems. High-throughput imaging combined with machine learning for rapid classification could overcome this bottleneck. Developing such procedures requires that taxonomic labels from an existing source data set are used for model training and prediction of an unknown target sample. Yet the feasibility of transfer learning for the classification of unknown samples remains to be tested. Here, we assess the efficiency of deep learning and domain transfer algorithms for family-level classification of below-ground bulk samples of Coleoptera from understudied forests of Cyprus. We trained neural network models with images from local surveys versus global databases of above-ground samples from tropical forests and evaluated how prediction accuracy was affected by: (a) the quality and resolution of images, (b) the size and complexity of the training set and (c) the transferability of identifications across very disparate source-target pairs that do not share any species or genera. Within-dataset classification accuracy reached 98% and depended on the number and quality of training images and on dataset complexity. The accuracy of between-dataset predictions was reduced to a maximum of 82% and depended greatly on the standardisation of the imaging procedure. When the source and target images were of similar quality and resolution, albeit from different faunas, the reduction of accuracy was minimal. Application of algorithms for domain adaptation significantly improved the prediction performance of models trained on non-standardised, low-quality images. Our findings demonstrate that existing databases can be used to train models and successfully classify images from unexplored biota when the imaging conditions and classification algorithms are carefully considered. Our results also provide guidelines for data acquisition and algorithmic development for high-throughput image-based biodiversity surveys.

2021, Vol 2021, pp. 1-19
Chunfeng Guo, Bin Wei, Kun Yu

Automatic biology image classification is essential for biodiversity conservation and ecological study. Recently, owing to their record-shattering performance, deep convolutional neural networks (DCNNs) have been used more often in biology image classification. However, training DCNNs requires a large amount of labeled data, which may be difficult to collect for some organisms. This study was carried out to exploit cross-domain transfer learning for DCNNs with limited data. According to the literature, previous studies mainly focus on transferring from ImageNet to a specific domain or transferring between two closely related domains. In contrast, this study explores deep transfer learning between species from different domains and analyzes the situation in which there is a huge difference between the source domain and the target domain. Building on the analysis of previous studies, this work investigates the effect of cross-domain transfer learning on biology image classification. A multiple transfer learning scheme is designed to exploit deep transfer learning on several biology image datasets from different domains. A huge difference between the source domain and the target domain can cause poor transfer learning performance. To address this problem, multistage transfer learning is proposed, which introduces an intermediate domain. The experimental results show the effectiveness of cross-domain transfer learning and the importance of data amount, and they validate the potential of multistage transfer learning.

2021, Vol 3 (4)
Yang Tian, Yaoyuan Wang, Ziyang Zhang, Pei Sun

2021, pp. 107976
Erik Otović, Marko Njirjak, Dario Jozinović, Goran Mauša, Alberto Michelini, …

2021
Yu Xia, Changqing Shen, Zaigang Chen, Lin Kong, Weiguo Huang, …
