Variational Deep Logic Network for Joint Inference of Entities and Relations

2021 ◽  
pp. 1-38
Author(s):  
Wenya Wang ◽  
Sinno Jialin Pan

Abstract Nowadays, deep learning models have been widely adopted and have achieved promising results across various application domains. Despite their impressive performance, most deep learning models function as black boxes, lacking the explicit reasoning capabilities and explanations that are usually essential for complex problems. Take joint inference in information extraction as an example. This task requires identifying multiple pieces of inter-correlated structured knowledge from text, including entities, events, and the relationships between them. Various deep neural networks have been proposed to jointly perform entity extraction and relation prediction, but they only propagate information implicitly via representation learning and fail to encode the strong correlations between entity types and relations that enforce their co-existence. On the other hand, some approaches adopt rules to explicitly constrain certain relational facts. However, separating the rules from representation learning typically leads to error propagation; moreover, the pre-defined rules are inflexible and can have negative effects when the data are noisy. To address these limitations, we propose a variational deep logic network that combines representation learning and relational reasoning via the variational EM algorithm. The model consists of a deep neural network that learns high-level features with implicit interactions via a self-attention mechanism, and a relational logic network that explicitly exploits target interactions. The two components are trained interactively to bring out the best of both worlds. We conduct extensive experiments on fine-grained sentiment term extraction, end-to-end relation prediction, and end-to-end event extraction to demonstrate the effectiveness of the proposed method.
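To make the alternating scheme concrete, below is a minimal sketch (not the authors' code) of a variational-EM-style loop: a self-attention network scores entities and relations, a logic component re-scores relations with a soft entity-relation compatibility rule, and the network is then fit to those adjusted targets. All module and tensor names are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralScorer(nn.Module):
    """Self-attention encoder that scores entity labels per token and relation labels per token pair."""
    def __init__(self, vocab=5000, dim=128, n_ent=5, n_rel=7):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.ent_head = nn.Linear(dim, n_ent)
        self.rel_head = nn.Linear(2 * dim, n_rel)

    def forward(self, tokens):                               # tokens: (B, T)
        h = self.emb(tokens)
        h, _ = self.attn(h, h, h)                            # implicit interactions
        ent_logits = self.ent_head(h)                        # (B, T, n_ent)
        pair = torch.cat([h[:, :, None].expand(-1, -1, h.size(1), -1),
                          h[:, None].expand(-1, h.size(1), -1, -1)], dim=-1)
        rel_logits = self.rel_head(pair)                     # (B, T, T, n_rel)
        return ent_logits, rel_logits

def logic_adjusted_relations(ent_logits, rel_logits, compat, weight=1.0):
    """E-step-like adjustment: a soft rule 'relation r requires entity types (s, o)'
    encoded as a compatibility tensor compat of shape (n_ent, n_ent, n_rel)."""
    ent_p = ent_logits.softmax(-1)
    bonus = torch.einsum('bie,bjf,efr->bijr', ent_p, ent_p, compat)
    return (rel_logits + weight * bonus).softmax(-1)         # pseudo-targets

def m_step(model, opt, tokens, ent_gold, compat):
    """M-step-like update: fit the scorer to gold entities and to the logic-adjusted relation distribution."""
    ent_logits, rel_logits = model(tokens)
    q_rel = logic_adjusted_relations(ent_logits, rel_logits, compat).detach()
    loss = F.cross_entropy(ent_logits.flatten(0, 1), ent_gold.flatten()) \
         + F.kl_div(rel_logits.log_softmax(-1), q_rel, reduction='batchmean')
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

The key design point the abstract emphasizes is that neither component is trained in isolation: the logic adjustment feeds targets back into the neural scorer rather than being applied only as a post-processing filter.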

2021 ◽  
pp. 1-30
Author(s):  
Qingtian Zou ◽  
Anoop Singhal ◽  
Xiaoyan Sun ◽  
Peng Liu

Network attacks have become a major security concern for organizations worldwide. A category of network attacks that exploit the logic (security) flaws of a few widely deployed authentication protocols has been commonly observed in recent years. Such logic-flaw-exploiting network attacks often do not have distinguishing signatures and can thus easily evade typical signature-based network intrusion detection systems. Recently, researchers have applied neural networks to detect network attacks from network logs. However, public network data sets have major drawbacks, such as limited variation among data samples and an imbalance between malicious and benign samples. In this paper, we present a new end-to-end approach based on protocol fuzzing to automatically generate high-quality network data, on which deep learning models can be trained for network attack detection. Our findings show that protocol fuzzing can generate data samples that cover real-world data, and that deep learning models trained on the fuzzed data can successfully detect logic-flaw-exploiting network attacks.
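As a toy illustration of the workflow (not the paper's fuzzer or protocol), the sketch below randomly mutates the fields of a hypothetical authentication message, labels mutations that mimic a logic-flaw exploit, and trains a small classifier on the generated samples. Field names and the labelling rule are invented for illustration only.

```python
import random
import torch
import torch.nn as nn

FIELDS = ["version", "auth_mode", "nonce_len", "flag_bits"]   # hypothetical protocol fields

def fuzz_message():
    """Randomly mutate protocol fields; label as attack when the mutation
    resembles a logic-flaw exploit (here: no authentication and no nonce)."""
    msg = {
        "version":   random.choice([1, 2, 3]),
        "auth_mode": random.choice([0, 1, 2]),      # 0 = none (weak), 2 = strong
        "nonce_len": random.choice([0, 8, 16, 32]),
        "flag_bits": random.getrandbits(8),
    }
    label = 1 if (msg["auth_mode"] == 0 and msg["nonce_len"] == 0) else 0
    return msg, label

def to_tensor(msg):
    return torch.tensor([msg[f] for f in FIELDS], dtype=torch.float32)

# Generate a fuzzed dataset and train a tiny detector on it.
data = [fuzz_message() for _ in range(5000)]
X = torch.stack([to_tensor(m) for m, _ in data])
y = torch.tensor([lbl for _, lbl in data], dtype=torch.float32)

model = nn.Sequential(nn.Linear(len(FIELDS), 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for epoch in range(20):
    opt.zero_grad()
    loss = loss_fn(model(X).squeeze(-1), y)
    loss.backward()
    opt.step()
```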


Author(s):  
Gioele Ciaparrone ◽  
Leonardo Chiariglione ◽  
Roberto Tagliaferri

Abstract Face-based video retrieval (FBVR) is the task of retrieving videos that contain the same face shown in the query image. In this article, we present the first end-to-end FBVR pipeline that is able to operate on large datasets of unconstrained, multi-shot, multi-person videos. We adapt an existing audiovisual recognition dataset to the task of FBVR and use it to evaluate our proposed pipeline. We compare a number of deep learning models for shot detection, face detection, and face feature extraction as part of our pipeline on a validation dataset made of more than 4000 videos. We obtain 97.25% mean average precision on an independent test set composed of more than 1000 videos. The pipeline is able to extract features from videos at roughly 7 times real-time speed, and it is able to perform a query over thousands of videos in less than 0.5 s.
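The query stage of such a pipeline is conceptually simple once each video has been reduced to a set of face embeddings. The sketch below assumes that reduction has already happened (the embedding models are placeholders, random vectors stand in for real features) and ranks videos by the best cosine similarity to the query embedding.

```python
import numpy as np

def rank_videos(query_emb, video_embs):
    """Rank videos by the best cosine similarity between the query face embedding
    and any face embedding extracted from each video."""
    q = query_emb / np.linalg.norm(query_emb)
    scores = {}
    for vid, embs in video_embs.items():                    # embs: (n_faces, dim)
        e = embs / np.linalg.norm(embs, axis=1, keepdims=True)
        scores[vid] = float((e @ q).max())                  # best-matching face in the video
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Example with random vectors standing in for real face embeddings.
rng = np.random.default_rng(0)
videos = {f"video_{i}": rng.normal(size=(5, 512)).astype(np.float32) for i in range(1000)}
query = rng.normal(size=512).astype(np.float32)
top10 = rank_videos(query, videos)[:10]
```

Because the expensive feature extraction is done offline, query time reduces to normalized dot products over the stored embeddings, which is consistent with sub-second queries over thousands of videos.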


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Qingyu Zhao ◽  
Ehsan Adeli ◽  
Kilian M. Pohl

Abstract The presence of confounding effects (or biases) is one of the most critical challenges in using deep learning to advance discovery in medical imaging studies. Confounders affect the relationship between input data (e.g., brain MRIs) and output variables (e.g., diagnosis); improper modeling of those relationships often results in spurious and biased associations. Traditional machine learning and statistical models minimize the impact of confounders by, for example, matching data sets, stratifying data, or residualizing imaging measurements. Alternative strategies are needed for state-of-the-art deep learning models that use end-to-end training to automatically extract informative features from large sets of images. In this article, we introduce an end-to-end approach for deriving features invariant to confounding factors while accounting for intrinsic correlations between the confounder(s) and the prediction outcome. The method does so by exploiting concepts from traditional statistical methods and recent fair machine learning schemes. We evaluate the method on predicting the diagnosis of HIV solely from Magnetic Resonance Images (MRIs), identifying morphological sex differences in adolescence from data of the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA), and determining bone age from X-ray images of children. The results show that our method can predict accurately while reducing biases associated with confounders. The code is available at https://github.com/qingyuzhao/br-net.
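To illustrate the general idea of confounder-adversarial training, here is a minimal sketch of one common scheme: a confounder head is trained to predict the confounder from the learned features, and the encoder is penalized when that prediction correlates with the true confounder. This is an illustration only; the exact objective in the authors' released BR-Net code (linked above) may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pearson_sq(a, b, eps=1e-8):
    """Squared Pearson correlation between two 1-D tensors."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum().pow(2) / (a.pow(2).sum() * b.pow(2).sum() + eps)

encoder = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 64))
cls_head = nn.Linear(64, 2)     # e.g., diagnosis
conf_head = nn.Linear(64, 1)    # predicts the confounder (e.g., age)

opt_main = torch.optim.Adam(list(encoder.parameters()) + list(cls_head.parameters()), lr=1e-3)
opt_conf = torch.optim.Adam(conf_head.parameters(), lr=1e-3)

def train_step(x, y, c, lam=1.0):
    # 1) Train the confounder head to predict c from the current (frozen) features.
    with torch.no_grad():
        z = encoder(x)
    opt_conf.zero_grad()
    F.mse_loss(conf_head(z).squeeze(-1), c).backward()
    opt_conf.step()

    # 2) Train encoder + classifier: fit the label while making the confounder
    #    head's output uncorrelated with the true confounder values.
    opt_main.zero_grad()
    z = encoder(x)
    loss = F.cross_entropy(cls_head(z), y) + lam * pearson_sq(conf_head(z).squeeze(-1), c)
    loss.backward()
    opt_main.step()
    return loss.item()
```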


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hyunseob Kim ◽  
Jeongcheol Lee ◽  
Sunil Ahn ◽  
Jongsuk Ruth Lee

Abstract Deep learning has brought dramatic advances in molecular property prediction, which is crucial in the field of drug discovery, using various representations such as fingerprints, SMILES, and graphs. In particular, SMILES is used in various deep learning models via character-based approaches. However, SMILES has the limitation that it does not directly reflect chemical properties. In this paper, we propose a new self-supervised method to learn SMILES and the chemical context of molecules simultaneously when pre-training the Transformer. The key idea is to learn molecular structure through adjacency matrix embedding and to learn the logic needed to infer descriptors through Quantitative Estimation of Drug-likeness (QED) prediction during pre-training. As a result, our method improves generalization and achieves the best average performance on benchmark downstream tasks. Moreover, we develop a web-based fine-tuning service so that our model can be used on various tasks.
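Both pre-training signals mentioned in the abstract can be derived per molecule with a cheminformatics toolkit. The sketch below uses RDKit as an assumed tool (the paper may compute them differently) to obtain the graph adjacency matrix and the QED drug-likeness score from a SMILES string; how these feed the Transformer's embeddings and losses is the paper's contribution and is not reproduced here.

```python
from rdkit import Chem
from rdkit.Chem import QED
import numpy as np

def smiles_targets(smiles: str):
    """Return the adjacency matrix and QED score for a SMILES string, or None if it fails to parse."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    adj = Chem.GetAdjacencyMatrix(mol)        # (n_atoms, n_atoms) binary connectivity matrix
    qed = QED.qed(mol)                        # drug-likeness score in [0, 1]
    return np.asarray(adj), float(qed)

adj, qed = smiles_targets("CC(=O)Oc1ccccc1C(=O)O")   # aspirin
print(adj.shape, round(qed, 3))
```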


2020 ◽  
Author(s):  
Xi Yang ◽  
Hansi Zhang ◽  
Xing He ◽  
Jiang Bian ◽  
Yonghui Wu

BACKGROUND Patients’ family history (FH) is a critical risk factor associated with numerous diseases. However, FH information is not well captured in structured databases; it is often documented in clinical narratives instead. Natural language processing (NLP) is the key technology for extracting patients’ FH from clinical narratives. In 2019, the National NLP Clinical Challenge (n2c2) organized shared tasks to solicit NLP methods for FH information extraction. OBJECTIVE This study presents our end-to-end FH extraction system developed during the 2019 n2c2 open shared task, as well as the new transformer-based models that we developed after the challenge. We seek to develop a machine learning-based solution for FH information extraction without hand-crafted, task-specific rules. METHODS We developed deep learning-based systems for FH concept extraction and relation identification. We explored deep learning models including long short-term memory-conditional random fields and bidirectional encoder representations from transformers (BERT), and developed ensemble models using a majority voting strategy. To further optimize performance, we systematically compared 3 different strategies for using BERT output representations in relation identification. RESULTS Our system was among the top-ranked systems (3 out of 21) in the challenge. Our best system achieved micro-averaged F1 scores of 0.7944 for concept extraction and 0.6544 for relation identification. After the challenge, we further explored new transformer-based models and improved the performance on both subtasks to 0.8249 and 0.6775, respectively. For relation identification, our system achieved a performance comparable to the best system (0.6810) reported in the challenge. CONCLUSIONS This study demonstrates the feasibility of using deep learning methods to extract FH information from clinical narratives.
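The abstract does not spell out the three strategies for using BERT output representations, so as a hedged illustration, here is one widely used option: wrap the two candidate concepts in marker tokens and classify the relation from the [CLS] vector. The model checkpoint, marker tokens, and relation labels below are illustrative assumptions, not the system's exact configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
tok.add_special_tokens({"additional_special_tokens": ["[E1]", "[/E1]", "[E2]", "[/E2]"]})
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.resize_token_embeddings(len(tok))                 # account for the new marker tokens
rel_head = nn.Linear(encoder.config.hidden_size, 3)       # illustrative relation label set

text = "His [E1] mother [/E1] was diagnosed with [E2] breast cancer [/E2] ."
batch = tok(text, return_tensors="pt")
with torch.no_grad():
    out = encoder(**batch)
cls_vec = out.last_hidden_state[:, 0]                     # [CLS] representation of the marked sentence
logits = rel_head(cls_vec)                                # relation scores for this concept pair
```

Alternative strategies typically pool the hidden states at the entity marker positions instead of (or in addition to) the [CLS] vector, which is the kind of comparison the abstract refers to.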


Author(s):  
Vahid Noroozi ◽  
Lei Zheng ◽  
Sara Bahaadini ◽  
Sihong Xie ◽  
Philip S. Yu

Verification determines whether two samples belong to the same class or not, and has important applications such as face and fingerprint verification, where thousands or millions of categories are present but each category has scarce labeled examples. These conditions present two major challenges for existing deep learning models. We propose a deep semi-supervised model named SEmi-supervised VErification Network (SEVEN) to address these challenges. The model consists of two complementary components. The generative component addresses the lack of supervision within each category by learning general salient structures from a large amount of data across categories. The discriminative component exploits the learned general features to mitigate the lack of supervision within categories, and also directs the generative component to find more informative structures of the whole data manifold. The two components are tied together in SEVEN to allow end-to-end training. Extensive experiments on four verification tasks demonstrate that SEVEN significantly outperforms other state-of-the-art deep semi-supervised techniques when labeled data are in short supply. Furthermore, SEVEN is competitive with fully supervised baselines trained on larger amounts of labeled data, which indicates the importance of its generative component.
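The coupling of a generative and a discriminative objective can be made concrete with a small sketch (this is an illustration of the general pattern, not the SEVEN implementation): an autoencoder learns structure from every pair, while a contrastive loss on its latent codes is applied only to pairs whose same/different label is known.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PairVerifier(nn.Module):
    def __init__(self, in_dim=784, z_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

def joint_loss(model, xa, xb, same, labelled_mask, margin=1.0, alpha=0.5):
    """Generative reconstruction on every pair; contrastive verification loss only on labelled pairs."""
    za, ra = model(xa)
    zb, rb = model(xb)
    recon = F.mse_loss(ra, xa) + F.mse_loss(rb, xb)               # generative component
    d = F.pairwise_distance(za, zb)
    contrast = same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)
    contrast = (contrast * labelled_mask).sum() / labelled_mask.sum().clamp(min=1)
    return recon + alpha * contrast                               # discriminative component
```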


JAMIA Open ◽  
2021 ◽  
Vol 4 (4) ◽  
Author(s):  
Yefeng Wang ◽  
Yunpeng Zhao ◽  
Dalton Schutte ◽  
Jiang Bian ◽  
Rui Zhang

Abstract Objective The objective of this study is to develop a deep learning pipeline to detect signals of dietary supplement-related adverse events (DS AEs) from Twitter. Materials and Methods We obtained 247 807 tweets posted between 2012 and 2018 that mentioned both a DS and an AE. We designed a tailor-made annotation guideline for DS AEs and annotated biomedical entities and relations on 2000 tweets. For the concept extraction task, we fine-tuned and compared the performance of BioClinical-BERT, PubMedBERT, ELECTRA, RoBERTa, and DeBERTa models with a CRF classifier. For the relation extraction task, we fine-tuned and compared BERT with BioClinical-BERT, PubMedBERT, RoBERTa, and DeBERTa models. We chose the best-performing model in each task to assemble an end-to-end deep learning pipeline to detect DS AE signals and compared the results to the known DS AEs from a DS knowledge base (ie, iDISK). Results The DeBERTa-CRF model outperformed the other models in the concept extraction task, with a lenient microaveraged F1 score of 0.866. The RoBERTa model outperformed the other models in the relation extraction task, with a lenient microaveraged F1 score of 0.788. The end-to-end pipeline built on these 2 models extracted DS indications and DS AEs with a lenient microaveraged F1 score of 0.666. Conclusion We have developed a deep learning pipeline that can detect DS AE signals from Twitter. We have found DS AEs that were not recorded in an existing knowledge base (iDISK), and our proposed pipeline can assist DS AE pharmacovigilance.
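Structurally, the end-to-end pipeline is a composition of the two best models: tag supplement and event spans, then classify each candidate (supplement, event) pair. The sketch below shows only that flow; the trivial stand-in extractor and relation classifier are placeholders so the example runs, whereas the actual system uses the fine-tuned DeBERTa-CRF and RoBERTa models.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple

@dataclass
class Span:
    text: str
    label: str   # e.g. "SUPPLEMENT" or "EVENT"

def pipeline(tweet: str,
             extract: Callable[[str], List[Span]],
             relate: Callable[[str, Span, Span], bool]) -> List[Tuple[str, str]]:
    """Run concept extraction, then keep the (supplement, event) pairs the relation model accepts."""
    spans = extract(tweet)
    supplements = [s for s in spans if s.label == "SUPPLEMENT"]
    events = [s for s in spans if s.label == "EVENT"]
    return [(s.text, e.text) for s, e in product(supplements, events) if relate(tweet, s, e)]

# Trivial stand-ins for demonstration only.
def toy_extract(t): return [Span("melatonin", "SUPPLEMENT"), Span("headache", "EVENT")]
def toy_relate(t, s, e): return True

print(pipeline("melatonin gave me a headache", toy_extract, toy_relate))
```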


2021 ◽  
Vol 6 ◽  
pp. 248
Author(s):  
Paul Mwaniki ◽  
Timothy Kamanu ◽  
Samuel Akech ◽  
Dustin Dunsmuir ◽  
J. Mark Ansermino ◽  
...  

Background: The success of many machine learning applications depends on knowledge about the relationship between the input data and the task of interest (output), hindering the application of machine learning to novel tasks. End-to-end deep learning, which does not require intermediate feature engineering, has been recommended to overcome this challenge, but end-to-end deep learning models require large labelled training data sets that are often unavailable in medical applications. In this study, we trained machine learning models to predict paediatric hospitalization given raw photoplethysmography (PPG) signals obtained from a pulse oximeter. We trained self-supervised learning (SSL) models for automatic feature extraction from PPG signals and assessed the utility of SSL in initializing end-to-end deep learning models trained on a small labelled data set, with the aim of predicting paediatric hospitalization. Methods: We compared logistic regression models fitted using features extracted with SSL against end-to-end deep learning models initialized either randomly or using weights from the SSL model. We also compared the performance of SSL models trained on labelled data alone (n=1,031) with SSL models trained on both labelled and unlabelled signals (n=7,578). Results: The SSL model trained on both labelled and unlabelled PPG signals produced features that were more predictive of hospitalization than those of the SSL model trained on labelled PPG only (AUC of the logistic regression model: 0.78 vs 0.74). The end-to-end deep learning model had an AUC of 0.80 when initialized using the SSL model trained on all PPG signals, 0.77 when initialized using the SSL model trained on labelled data only, and 0.73 when initialized randomly. Conclusions: This study shows that SSL can improve the classification of PPG signals, either by extracting features for logistic regression models or by initializing end-to-end deep learning models. Furthermore, SSL can leverage larger unlabelled data sets to improve the performance of models fitted using small labelled data sets.
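The two ways the SSL encoder is used can be sketched as follows, under assumed shapes and with a placeholder 1-D CNN standing in for the study's architecture: (a) freeze the pretrained encoder and fit a logistic regression on its features, and (b) copy its weights into an end-to-end classifier before fine-tuning.

```python
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

class PPGEncoder(nn.Module):
    """Placeholder encoder for raw PPG windows (not the study's architecture)."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Linear(32, feat_dim)

    def forward(self, x):                      # x: (batch, 1, time)
        return self.proj(self.conv(x).squeeze(-1))

ssl_encoder = PPGEncoder()                     # assume already pretrained with an SSL objective

# (a) Frozen features + logistic regression on the small labelled set.
x_labelled = torch.randn(1031, 1, 1000)        # stand-in PPG windows
y_labelled = torch.randint(0, 2, (1031,))
with torch.no_grad():
    feats = ssl_encoder(x_labelled).numpy()
clf = LogisticRegression(max_iter=1000).fit(feats, y_labelled.numpy())

# (b) End-to-end model initialized from the SSL weights, then fine-tuned on the labelled set.
e2e = nn.Sequential(PPGEncoder(), nn.Linear(64, 1))
e2e[0].load_state_dict(ssl_encoder.state_dict())
```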

