Three-Layer Dynamic Transfer Learning Language Model for E. coli Promoter Classification

Author(s):  
Ying He ◽  
Zhen Shen ◽  
Qinhu Zhang ◽  
Siguo Wang ◽  
Changan Yuan ◽  
...  

Author(s):  
Shu Jiang ◽  
Zuchao Li ◽  
Hai Zhao ◽  
Bao-Liang Lu ◽  
Rui Wang

In recent years, research on dependency parsing has focused on improving accuracy on in-domain test sets and has made remarkable progress. However, the real world contains countless scenarios that such datasets do not cover, i.e., out-of-domain data, and parsers that perform well in-domain usually suffer significant performance degradation out-of-domain. Cross-domain transfer learning methods are therefore essential for adapting existing high-performance in-domain parsers to new domain scenarios. This paper examines two scenarios for cross-domain transfer learning: semi-supervised and unsupervised. Specifically, we adopt the pre-trained language model BERT, trained on the source-domain (in-domain) data at the subword level, and introduce self-training methods derived from tri-training for these two scenarios. Evaluation results on the NLPCC-2019 shared task and the universal dependency parsing task indicate the effectiveness of the adopted approaches for cross-domain transfer learning and show the potential of self-training for cross-lingual transfer learning.
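A minimal sketch of a tri-training-style self-training loop in the spirit of the approach described above; the parser objects and their fit/predict interface are hypothetical placeholders, not the authors' implementation:

    # Tri-training-style self-training: each parser is retrained on target-
    # domain sentences that its two peers label identically (pseudo-labels).
    def tri_train(parsers, labeled_src, unlabeled_tgt, rounds=3):
        """parsers: three differently initialized parser objects, each
        exposing fit(pairs) and predict(sentence) -> comparable parse."""
        for p in parsers:
            p.fit(labeled_src)                      # train on in-domain data
        for _ in range(rounds):
            for i, p in enumerate(parsers):
                peers = [q for j, q in enumerate(parsers) if j != i]
                pseudo = []
                for sent in unlabeled_tgt:
                    t1 = peers[0].predict(sent)
                    t2 = peers[1].predict(sent)
                    if t1 == t2:                    # peers agree: pseudo-label
                        pseudo.append((sent, t1))
                if pseudo:                          # retrain on augmented set
                    p.fit(list(labeled_src) + pseudo)
        return parsers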


Gene ◽  
1979 ◽  
Vol 7 (3-4) ◽  
pp. 271-288 ◽  
Author(s):  
Robert W. West ◽  
Rachael L. Neve ◽  
Raymond L. Rodriguez

2021 ◽  
Author(s):  
Lele Yu ◽  
Shaowu Zhang ◽  
Yijia Zhang ◽  
Hongfei Lin

BACKGROUND Happiness refers to the joyful and pleasant emotions that humans produce subjectively. It is the positive part of emotion, and it affects the quality of human life. Understanding human happiness is therefore a meaningful task in sentiment analysis. We mainly discuss two facets (Agency/Sociality) of happiness in this study. By analyzing and researching happiness, we can expand the concepts that define it and enrich our understanding of emotion. OBJECTIVE In this paper, we treated each happy moment as a sequence of short sentences and proposed a short-text happiness detection model based on transfer learning to analyze the Agency and Sociality aspects of happiness. METHODS Happiness analysis is a novel and challenging research task, but the current datasets in the field are small. To address this, we used the unlabeled training set and transfer learning to train a semantically enhanced language model on the target domain. The trained language model, which carries domain characteristics, was then combined with other deep learning models to obtain a set of models. Finally, we applied an improved voting strategy to further improve the experimental results. RESULTS The proposed approach was evaluated on the public dataset. Experimental results showed that our approach significantly outperforms the baselines: when predicting the Agency aspect of happiness, it achieved an accuracy of 0.8574 and an F1 score of 0.90, and when predicting Sociality, an accuracy of 0.928 and an F1 score of 0.9360. CONCLUSIONS The comparison results on this dataset demonstrate the effectiveness of our approach for happiness analysis, confirming that it achieves state-of-the-art performance and that transfer learning effectively improves happiness analysis.
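A minimal sketch of the two steps described above, assuming the Hugging Face transformers API: continuing masked-language-model training on unlabeled target-domain text, then majority voting over several fine-tuned classifiers. The toy sentences, checkpoint name, and the simple-majority stand-in for the improved voting strategy are illustrative assumptions:

    from collections import Counter
    import torch
    from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

    # Two toy sentences stand in for the unlabeled happy-moment training set.
    texts = ["I went hiking with my friends last weekend.",
             "My daughter took her first steps today."]
    encodings = [tok(t, truncation=True, max_length=64) for t in texts]

    class MomentsDataset(torch.utils.data.Dataset):
        def __init__(self, enc): self.enc = enc
        def __len__(self): return len(self.enc)
        def __getitem__(self, i): return self.enc[i]

    # Step 1: domain-adaptive masked-LM training on target-domain text.
    collator = DataCollatorForLanguageModeling(tokenizer=tok,
                                               mlm_probability=0.15)
    Trainer(model=mlm,
            args=TrainingArguments(output_dir="bert-happiness",
                                   num_train_epochs=3,
                                   per_device_train_batch_size=2),
            data_collator=collator,
            train_dataset=MomentsDataset(encodings)).train()

    # Step 2: combine several fine-tuned classifiers by majority vote.
    def majority_vote(classifiers, text):
        batch = tok(text, return_tensors="pt", truncation=True)
        votes = [int(m(**batch).logits.argmax(-1)) for m in classifiers]
        return Counter(votes).most_common(1)[0][0]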


2021 ◽  
Author(s):  
Federico Siano ◽  
Peter Wysocki

We introduce and apply machine transfer learning methods to analyze accounting disclosures. Using the new BERT language model and sentiment analysis of quarterly earnings disclosures as examples, we demonstrate the key transfer learning concepts of (i) pre-training on generic "Big Data", (ii) fine-tuning on small accounting datasets, and (iii) using a language model that captures context rather than stand-alone words. Overall, we show that this new approach is easy to implement, uses widely available and low-cost computing resources, and has superior performance relative to existing textual analysis tools in accounting. We conclude with suggestions for opportunities to apply transfer learning to important accounting research questions.
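A minimal sketch of concepts (i)-(iii), assuming the Hugging Face transformers and PyTorch APIs: a BERT model pre-trained on generic text is fine-tuned on a small labeled set of earnings-disclosure sentences. The example sentences and tone labels are invented placeholders, not the authors' dataset:

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # (i) Start from a checkpoint pre-trained on generic "Big Data".
    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)       # 0 = negative, 1 = positive

    # (ii) Fine-tune on a small labeled accounting dataset (toy examples).
    train = [("Revenue grew 12% on strong demand.", 1),
             ("We recorded an impairment charge this quarter.", 0)]

    optim = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    for epoch in range(3):                       # small data -> few epochs
        for text, label in train:
            # (iii) The tokenizer feeds the full sentence, so the model
            # scores words in context rather than as a bag of words.
            batch = tok(text, return_tensors="pt", truncation=True)
            loss = model(**batch, labels=torch.tensor([label])).loss
            loss.backward()
            optim.step()
            optim.zero_grad()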


Author(s):  
A. Evtushenko

Machine learning language models combine algorithms and neural networks for processing text written in natural language (Natural Language Processing, NLP). In 2020, the artificial intelligence research company OpenAI released its largest language model, GPT-3, with up to 175 billion parameters. This more-than-100-fold increase in parameterization raised the quality of generated text to a level that is hard to distinguish from human-written text. Notably, the model was trained on a dataset collected mainly from open sources on the Internet, the volume of which is estimated at 570 GB. This article discusses the problem of memorizing critical information, in particular the personal data of individuals, when training large language models (GPT-2/3 and derivatives). It also describes an algorithmic approach to this problem, which consists of additional preprocessing of the training dataset and refinement of model inference, so that pseudo-personal data are generated and embedded in the outputs of summarization, text generation, question answering, and other seq2seq tasks.
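A minimal sketch of the preprocessing side of that approach: scanning the training corpus for personal data and replacing each distinct value with a stable pseudo-token before training. The regular expressions cover only e-mail addresses and phone numbers and are simplified assumptions, not the article's algorithm:

    import re
    from itertools import count

    # Simplified PII patterns; a production scrubber would cover far more.
    PATTERNS = {
        "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
        "PHONE": re.compile(r"\+?\d[\d ()-]{7,}\d"),
    }

    def pseudonymize(text, table=None, counters=None):
        """Replace each distinct PII string with a stable pseudo-token, so
        the model learns the slot (e.g. <EMAIL_1>), not the real value."""
        table = {} if table is None else table
        counters = ({k: count(1) for k in PATTERNS}
                    if counters is None else counters)
        for kind, pat in PATTERNS.items():
            def repl(m, kind=kind):
                if m.group(0) not in table:
                    table[m.group(0)] = f"<{kind}_{next(counters[kind])}>"
                return table[m.group(0)]
            text = pat.sub(repl, text)
        return text, table

    # Prints: Write to <EMAIL_1> or call <PHONE_1>
    print(pseudonymize("Write to jane.doe@example.com or call +1 202 555 0143")[0])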

