Semi-supervised Wafer Map Pattern Recognition using Domain-Specific Data Augmentation and Contrastive Learning

Author(s):  
Hanbin Hu ◽  
Chen He ◽  
Peng Li


2019 ◽
Vol 49 (6) ◽  
pp. 1676-1683 ◽  
Author(s):  
Michael Gadermayr ◽  
Kexin Li ◽  
Madlaine Müller ◽  
Daniel Truhn ◽  
Nils Krämer ◽  
...  

2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models over non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. Levenshtein augmentation yielded increased performance over both non-augmented data and conventional SMILES-randomization augmentation when used to train the baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as “attentional gain”: an enhancement of the underlying network’s ability to recognize molecular motifs.
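
The abstract leaves the pairing criterion at a high level; one plausible reading is to generate several randomized SMILES for a reactant and keep the variant with the smallest Levenshtein (edit) distance to the product SMILES, so that training pairs share local sub-sequences. Below is a minimal Python sketch of that reading, assuming RDKit for SMILES randomization; the selection rule is illustrative, not the authors' exact procedure.

```python
# Hypothetical sketch of Levenshtein-guided SMILES pair augmentation.
# Assumes RDKit; the "keep the closest randomized variant" criterion is an
# illustrative reading of the abstract, not the authors' exact procedure.
from rdkit import Chem

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def randomized_smiles(smiles: str, n: int = 20) -> list[str]:
    """Generate n random (non-canonical) SMILES spellings of a molecule."""
    mol = Chem.MolFromSmiles(smiles)
    return [Chem.MolToSmiles(mol, canonical=False, doRandom=True)
            for _ in range(n)]

def levenshtein_augment(reactant: str, product: str, n: int = 20) -> str:
    """Pick the randomized reactant SMILES closest to the product SMILES."""
    variants = randomized_smiles(reactant, n)
    return min(variants, key=lambda s: levenshtein(s, product))

# Example: toluene -> benzoic acid (illustrative reactant/product pair)
pair = levenshtein_augment("Cc1ccccc1", "OC(=O)c1ccccc1")
```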


2017 ◽  
Vol 93 (4) ◽  
pp. 177-202 ◽  
Author(s):  
Emily E. Griffith

ABSTRACT: Auditors are more likely to identify misstatements in complex estimates if they recognize problematic patterns among an estimate's underlying assumptions. Rich problem representations aid pattern recognition, but auditors likely have difficulty developing them given auditors' limited domain-specific expertise in this area. In two experiments, I predict and find that a relational cue in a specialist's work highlighting aggressive assumptions improves auditors' problem representations and subsequent judgments about estimates. However, this improvement only occurs when a situational factor (e.g., risk) increases auditors' epistemic motivation to incorporate the cue into their problem representations. These results suggest that auditors do not always respond to cues in specialists' work. More generally, this study highlights the role of situational factors in increasing auditors' epistemic motivation to develop rich problem representations, which contribute to high-quality audit judgments in this and other domains where pattern recognition is important.


2020 ◽  
Author(s):  
Geoffrey Schau ◽  
Erik Burlingame ◽  
Young Hwan Chang

ABSTRACT: Deep learning systems have emerged as powerful mechanisms for learning domain translation models. However, in many cases, complete information in one domain is assumed to be necessary for sufficient cross-domain prediction. In this work, we motivate a formal justification for domain-specific information separation in a simple linear case and illustrate that a self-supervised approach enables domain translation between data domains while filtering out domain-specific data features. We introduce a novel approach to identify domain-specific information from sets of unpaired measurements in complementary data domains by considering a deep learning cross-domain autoencoder architecture designed to learn shared latent representations of data while enabling domain translation. We introduce an orthogonal gate block designed to enforce orthogonality of input feature sets by explicitly removing non-sharable information specific to each domain, and illustrate separability of domain-specific information on a toy dataset.
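
The orthogonal gate block is described only at a high level. A common way to enforce such orthogonality between shared and domain-specific codes is a squared-Frobenius penalty on their cross-correlation, as in domain separation networks; the PyTorch sketch below illustrates that assumption. The module and loss names are ours, not the paper's.

```python
# Minimal PyTorch sketch of an orthogonality constraint between shared and
# domain-specific latent codes. The squared-Frobenius penalty is a common
# choice (cf. domain separation networks); the paper's exact gate may differ.
import torch
import torch.nn as nn

class TwoHeadEncoder(nn.Module):
    """Encodes a sample into a shared code and a domain-specific code."""
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU(),
                                    nn.Linear(latent_dim, latent_dim))
        self.private = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU(),
                                     nn.Linear(latent_dim, latent_dim))

    def forward(self, x):
        return self.shared(x), self.private(x)

def orthogonality_loss(shared: torch.Tensor, private: torch.Tensor) -> torch.Tensor:
    """Penalize correlation between the two codes: ||S^T P||_F^2 over a batch."""
    s = shared - shared.mean(dim=0)
    p = private - private.mean(dim=0)
    return (s.t() @ p).pow(2).sum() / shared.shape[0] ** 2

# Usage: weight this penalty and add it to the reconstruction / translation loss.
enc = TwoHeadEncoder(in_dim=128, latent_dim=32)
x = torch.randn(64, 128)
s, p = enc(x)
loss = orthogonality_loss(s, p)
```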


2020 ◽  
Vol 8 ◽  
pp. 141-155
Author(s):  
Kai Sun ◽  
Dian Yu ◽  
Dong Yu ◽  
Claire Cardie

Machine reading comprehension tasks require a machine reader to answer questions relevant to the given document. In this paper, we present the first free-form multiple-Choice Chinese machine reading Comprehension dataset (C3), containing 13,369 documents (dialogues or more formally written mixed-genre texts) and their associated 19,577 multiple-choice free-form questions collected from Chinese-as-a-second-language examinations. We present a comprehensive analysis of the prior knowledge (i.e., linguistic, domain-specific, and general world knowledge) needed for these real-world problems. We implement rule-based and popular neural methods and find that there is still a significant performance gap between the best-performing model (68.5%) and human readers (96.0%), especially on problems that require prior knowledge. We further study the effects of distractor plausibility and of data augmentation based on translated relevant English datasets on model performance. We expect C3 to present great challenges to existing systems, as answering 86.8% of questions requires both knowledge within and beyond the accompanying document, and we hope that C3 can serve as a platform to study how to leverage various kinds of prior knowledge to better understand a given written or orally oriented text. C3 is available at https://dataset.org/c3/.
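
To make the task format concrete, the sketch below scores a trivial always-pick-the-first-choice baseline on a C3-style JSON file. The per-instance layout assumed here (document sentences, questions with choice and answer fields, instance id) is our assumption and should be checked against the actual release at https://dataset.org/c3/.

```python
# Hypothetical scoring loop for a C3-style multiple-choice file. The assumed
# per-instance layout ([document sentences, questions, id], with "choice" and
# "answer" fields per question) should be verified against the real release.
import json

def first_choice_accuracy(path: str) -> float:
    """Score a trivial baseline that always picks the first answer choice."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    correct = total = 0
    for document, questions, _instance_id in data:  # assumed layout
        for q in questions:
            correct += q["choice"][0] == q["answer"]
            total += 1
    return correct / total

# print(first_choice_accuracy("c3-m-train.json"))  # filename assumed
```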


2019 ◽  
Vol 134 ◽  
pp. 62-71 ◽  
Author(s):  
Yongxin Liu ◽  
Jianqiang Li ◽  
Zhong Ming ◽  
Houbing Song ◽  
Xiaoxiong Weng ◽  
...  
