DocR-BERT: Document-level R-BERT for Chemical-induced Disease Relation Extraction via Gaussian Probability Distribution

Author(s):  
Zhengguang Li ◽  
Heng Chen ◽  
Ruihua Qi ◽  
Hongfei Lin ◽  
Huayue Chen
2020 ◽  
Vol 36 (15) ◽  
pp. 4323-4330 ◽  
Author(s):  
Cong Sun ◽  
Zhihao Yang ◽  
Leilei Su ◽  
Lei Wang ◽  
Yin Zhang ◽  
...  

Abstract. Motivation: The biomedical literature contains a wealth of chemical–protein interactions (CPIs). Automatically extracting the CPIs described in this literature is essential for drug discovery, precision medicine and basic biomedical research. Most existing methods focus only on the sentence sequence to identify these CPIs. However, the local structure of sentences and external biomedical knowledge also contain valuable information, and effective use of such information may improve the performance of CPI extraction. Results: In this article, we propose a novel neural network-based approach to improve CPI extraction. Specifically, the approach first employs BERT to generate high-quality contextual representations of the title sequence, instance sequence and knowledge sequence. Then, a Gaussian probability distribution is introduced to capture the local structure of the instance. Meanwhile, attention mechanisms are applied to fuse the title information and the biomedical knowledge, respectively. Finally, the related representations are concatenated and fed into the softmax function to extract CPIs. We evaluate the proposed model on the CHEMPROT corpus, where it outperforms other state-of-the-art models. The experimental results show that the Gaussian probability distribution and external knowledge are complementary to each other, and that integrating them effectively improves CPI extraction performance. Furthermore, the Gaussian probability distribution effectively improves extraction performance on sentences with overlapping relations in biomedical relation extraction tasks. Availability and implementation: Data and code are available at https://github.com/CongSun-dlut/CPI_extraction. Supplementary information: Supplementary data are available at Bioinformatics online.
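
The abstract does not say how the Gaussian probability distribution is applied to the instance. One plausible reading is that tokens are weighted by their distance to the entity mentions, so that words near the chemical or the protein count more. The sketch below illustrates only that reading. The function names, the `sigma` parameter and the use of the maximum of the two weights are assumptions made for illustration, not the authors' implementation.

```python
import math

def gaussian_position_weights(seq_len, center, sigma=2.0):
    """One weight per token position, equal to 1.0 at `center` and
    decaying with squared distance (hypothetical helper)."""
    return [
        math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
        for i in range(seq_len)
    ]

def weight_tokens(token_vectors, chem_pos, prot_pos, sigma=2.0):
    """Scale each token vector by its closeness to either entity.

    token_vectors: list of equal-length lists of floats (for example,
    contextual embeddings). chem_pos and prot_pos are token indices of
    the two entity mentions. Taking the max of the two Gaussian weights
    is an assumption, chosen so tokens near either entity are kept.
    """
    n = len(token_vectors)
    w_chem = gaussian_position_weights(n, chem_pos, sigma)
    w_prot = gaussian_position_weights(n, prot_pos, sigma)
    weights = [max(a, b) for a, b in zip(w_chem, w_prot)]
    return [[w * x for x in vec] for w, vec in zip(weights, token_vectors)]
```

A smaller `sigma` narrows the window around each entity, while a larger one approaches uniform weighting over the sentence.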


2021 ◽  
pp. 580-595
Author(s):  
Zhenyu Zhang ◽  
Bowen Yu ◽  
Xiaobo Shu ◽  
Tingwen Liu

2019 ◽  
Vol 2019 ◽  
pp. 1-18 ◽  
Author(s):  
Xingwang Huang ◽  
Chaopeng Li ◽  
Yunming Pu ◽  
Bingyan He

The quantum-behaved bat algorithm with mean best position directed (QMBA) is a novel variant of the bat algorithm (BA) with good performance. However, QMBA generates all of its stochastic coefficients from a uniform probability distribution, which provides only a relatively small search range, so it still suffers from a certain degree of premature convergence. To help bats escape from local optima, this article proposes a novel Gaussian quantum bat algorithm with mean best position directed (GQMBA), which uses a Gaussian probability distribution to generate its random number sequences. Drawing the random coefficients from a Gaussian distribution instead of a uniform distribution is an effective way to reduce premature convergence. In this article, the combination of QMBA and the Gaussian probability distribution is applied to numerical function optimization. Nineteen benchmark functions are used to evaluate the accuracy and performance of GQMBA against other algorithms. The experimental results show that, in most cases, the proposed GQMBA algorithm provides better search performance.
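
The effect of swapping the coefficient distribution can be illustrated with a short sketch. It is not the GQMBA update rule, which the abstract does not give. The `perturb` step and both coefficient helpers are hypothetical, and taking the absolute value of a standard normal draw is an assumption. The sketch only shows that a uniform coefficient is confined to [0, 1), whereas a Gaussian one can occasionally be larger and so permit longer moves.

```python
import random

def uniform_coefficient(rng):
    """Coefficient drawn uniformly from [0, 1)."""
    return rng.random()

def gaussian_coefficient(rng, sigma=1.0):
    """Non-negative coefficient from the absolute value of a normal draw.
    Unlike the uniform case, it is not bounded above by 1."""
    return abs(rng.gauss(0.0, sigma))

def perturb(position, mean_best, coeff):
    """Generic illustrative move relative to a mean best position.
    With coeff > 1 the candidate overshoots mean_best, which widens
    the region a single step can reach."""
    return [x + coeff * (m - x) for x, m in zip(position, mean_best)]

rng = random.Random(0)  # fixed seed so the comparison is repeatable
uniform_draws = [uniform_coefficient(rng) for _ in range(10000)]
gaussian_draws = [gaussian_coefficient(rng) for _ in range(10000)]
print(max(uniform_draws) < 1.0, max(gaussian_draws) > 1.0)
```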


Author(s):  
Zhenyu Zhang ◽  
Bowen Yu ◽  
Xiaobo Shu ◽  
Tingwen Liu ◽  
Hengzhu Tang ◽  
...  

1984 ◽  
Vol 1 (19) ◽  
pp. 35 ◽  
Author(s):  
Michel K. Ochi ◽  
Wei-Chi Wang

This paper presents the results of a study on the non-Gaussian characteristics of coastal waves. From a statistical analysis of more than 500 records obtained in the growing stage of the storm, the parameters of the non-Gaussian probability distribution that are significant for predicting wave characteristics are clarified, and these parameters are expressed as functions of water depth and sea severity. The limiting sea severity below which wind-generated coastal waves can be considered Gaussian is obtained for a given water depth.


2020 ◽  
Vol 11 (06) ◽  
pp. 436-446
Author(s):  
A. T. Adeniran ◽  
O. Faweya ◽  
T. O. Ogunlade ◽  
K. O. Balogun

10.2196/17638 ◽  
2020 ◽  
Vol 8 (7) ◽  
pp. e17638
Author(s):  
Jian Wang ◽  
Xiaoyu Chen ◽  
Yu Zhang ◽  
Yijia Zhang ◽  
Jiabin Wen ◽  
...  

Background: Automatically extracting relations between chemicals and diseases plays an important role in biomedical text mining. Chemical-disease relation (CDR) extraction aims to extract complex semantic relationships between entities in documents, which include both intrasentence and intersentence relations. Most previous methods did not consider dependency syntactic information across sentences, which is very valuable for the relation extraction task, in particular for extracting intersentence relations accurately. Objective: In this paper, we propose a novel end-to-end neural network based on the graph convolutional network (GCN) and multihead attention, which makes use of the dependency syntactic information across sentences to improve the CDR extraction task. Methods: To improve the performance of intersentence relation extraction, we constructed a document-level dependency graph to capture the dependency syntactic information across sentences. A GCN is applied to capture the feature representation of the document-level dependency graph. The multihead attention mechanism is employed to learn the relatively important context features from different semantic subspaces. To enhance the input representation, a deep context representation is used in our model instead of traditional word embeddings. Results: We evaluate our method on the CDR corpus. The experimental results show that our method achieves an F-measure of 63.5%, which is superior to other state-of-the-art methods. At the intrasentence level, our method achieves a precision, recall, and F-measure of 59.1%, 81.5%, and 68.5%, respectively. At the intersentence level, our method achieves a precision, recall, and F-measure of 47.8%, 52.2%, and 49.9%, respectively. Conclusions: The GCN model can effectively exploit cross-sentence dependency information to improve the performance of intersentence CDR extraction. Both the deep context representation and multihead attention are helpful in the CDR extraction task.
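
The reported F-measures can be checked against the reported precision and recall, assuming the F-measure is the usual balanced F1 score (the harmonic mean of the two). The helper name `f_measure` is chosen here for illustration.

```python
def f_measure(precision, recall):
    """Balanced F1 score: harmonic mean of precision and recall.
    Inputs and output are percentages."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
intra = f_measure(59.1, 81.5)  # intrasentence level
inter = f_measure(47.8, 52.2)  # intersentence level
print(round(intra, 1), round(inter, 1))  # prints 68.5 49.9
```

Both values agree with the abstract to one decimal place. The overall figure of 63.5% cannot be checked in the same way, because the abstract does not give the overall precision and recall.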


2019 ◽  
Author(s):  
Yuan Yao ◽  
Deming Ye ◽  
Peng Li ◽  
Xu Han ◽  
Yankai Lin ◽  
...  

Author(s):  
Huiwei Zhou ◽  
Yibin Xu ◽  
Weihong Yao ◽  
Zhe Liu ◽  
Chengkun Lang ◽  
...  
