Domain Adaptation Algorithm based on Manifold Regularization

Abstract

Introduction: Classifying whether concepts in an unstructured clinical text are negated is an important unsolved task. New domain adaptation and transfer learning methods can potentially address this issue.

Objective: We examine neural unsupervised domain adaptation methods, introducing a novel combination of domain adaptation with transformer-based transfer learning methods to improve negation detection. We also want to better understand the interaction between the widely used bidirectional encoder representations from transformers (BERT) system and domain adaptation methods.

Materials and Methods: We use 4 clinical text datasets that are annotated with negation status. We evaluate a neural unsupervised domain adaptation algorithm and BERT, a transformer-based model that is pretrained on massive general text datasets. We develop an extension to BERT that uses domain adversarial training, a neural domain adaptation method that adds an objective to the negation task: the classifier should not be able to distinguish between instances from 2 different domains.

Results: The domain adaptation methods we describe show positive results, but, on average, the best performance is obtained by plain BERT (without the extension). We provide evidence that the gains from BERT are likely not additive with the gains from domain adaptation.

Discussion: Our results suggest that, at least for the task of clinical negation detection, BERT subsumes domain adaptation, implying that BERT is already learning very general representations of negation phenomena such that fine-tuning even on a specific corpus does not lead to much overfitting.

Conclusion: Despite being trained on nonclinical text, the large training sets of models like BERT lead to large gains in performance for the clinical negation detection task.
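
As a rough illustration of the domain adversarial objective described in this abstract, the sketch below assumes the gradient reversal formulation, which is one common way to implement such training; the abstract does not say which mechanism the authors used. The function names and the weight `lam` are hypothetical, and the gradients are plain lists of numbers rather than tensors from a real model.

```python
# Sketch of gradient reversal (an assumed mechanism, not taken from the abstract).
# The shared encoder feeds two heads: a negation classifier and a domain classifier.

def grl_forward(features):
    """Forward pass: hand the encoder features to the domain head unchanged."""
    return list(features)

def grl_backward(domain_grad, lam=1.0):
    """Backward pass: flip the sign of the domain head's gradient and scale it by lam."""
    return [-lam * g for g in domain_grad]

def encoder_gradient(task_grad, domain_grad, lam=1.0):
    """Gradient reaching the shared encoder: task gradient plus reversed domain gradient."""
    reversed_grad = grl_backward(domain_grad, lam)
    return [t + r for t, r in zip(task_grad, reversed_grad)]

# Toy gradients with respect to two encoder features (made-up values).
task_grad = [0.5, -1.0]     # from the negation loss
domain_grad = [0.25, 0.5]   # from the domain classification loss
combined = encoder_gradient(task_grad, domain_grad, lam=0.5)
```

Under this assumption, the encoder is pushed to lower the negation loss while raising the domain loss, which is what discourages features that reveal the source domain.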

Causal Embeddings for Recommendation: An Extended Abstract

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019. DOI: 10.24963/ijcai.2019/870

Author(s): Flavian Vasile, Stephen Bonner

Keyword(s): Treatment Effect, Domain Adaptation, State Of The Art, Adaptation Algorithm, Natural Behavior, New Approaches, Product Sales

Recommendations are commonly used to modify users' natural behavior, for example, increasing product sales or the time spent on a website. This results in a gap between the ultimate business objective and the classical setup where recommendations are optimized to be coherent with past user behavior. To bridge this gap, we propose a new learning setup for recommendation that optimizes for the Incremental Treatment Effect (ITE) of the policy. We show this is equivalent to learning to predict recommendation outcomes under a fully random recommendation policy and propose a new domain adaptation algorithm that learns from logged data containing outcomes from a biased recommendation policy and predicts recommendation outcomes according to random exposure. We compare our method against state-of-the-art factorization methods, in addition to new approaches of causal recommendation, and show significant improvements.
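
To make the idea of predicting outcomes under random exposure from biased logs more concrete, here is a minimal sketch using inverse propensity scoring. This is a standard off-policy estimator swapped in for illustration, not the domain adaptation algorithm the abstract proposes. The function name, the log format, and the assumption that the logging propensities are known are all hypothetical.

```python
# Inverse propensity scoring (IPS): a standard estimator used here for illustration only.
# It is NOT the algorithm proposed in the abstract.

def ips_uniform_estimate(logs, n_items):
    """Estimate the mean outcome under a uniform random recommendation policy.

    logs: list of (item_id, outcome, logging_propensity) tuples, where
    logging_propensity is the probability that the biased logging policy
    showed that item (assumed known and greater than zero).
    """
    target_prob = 1.0 / n_items  # a uniform policy shows each item equally often
    total = 0.0
    for _item, outcome, propensity in logs:
        # Reweight each logged outcome by how much more (or less) often the
        # uniform policy would have shown this item than the logging policy did.
        total += outcome * target_prob / propensity
    return total / len(logs)

# Toy log: item 0 was over-exposed by the logging policy (made-up values).
logs = [(0, 1.0, 0.5), (1, 0.0, 0.25), (0, 1.0, 0.5), (2, 1.0, 0.25)]
estimate = ips_uniform_estimate(logs, n_items=4)
```

The naive average of the logged outcomes here is 0.75, while the reweighted estimate is lower because the over-exposed item contributes less once exposure is equalized.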

Bias invariant RNA-seq metadata annotation

2020. DOI: 10.1101/2020.11.26.399568

Author(s): Hannes Wartmann, Sven Heins, Karin Kloiber, Stefan Bonn

Keyword(s): Deep Learning, Domain Adaptation, Tissue Sample, Large Data, Biomedical Data, RNA-seq, Adaptation Algorithm, Data Repositories, Technological Advances, Metadata Annotation

Abstract: Recent technological advances have resulted in an unprecedented increase in publicly available biomedical data, yet the reuse of the data is often precluded by experimental bias and a lack of annotation depth and consistency. Here we investigate RNA-seq metadata prediction based on gene expression values. We present a deep-learning-based domain adaptation algorithm for the automatic annotation of RNA-seq metadata. We show how our algorithm outperforms existing approaches as well as traditional deep learning methods for the prediction of tissue, sample source, and patient sex information across several large data repositories. By using a model architecture similar to Siamese networks, the algorithm is able to learn biases from datasets with few samples. Our domain adaptation approach achieves metadata annotation accuracies up to 12.3% better than a previously published method. Lastly, we provide a list of more than 10,000 novel tissue and sex label annotations for 8,495 unique SRA samples.
