EHR Coding with Multi-scale Feature Attention and Structured Knowledge Graph Propagation

Author(s):  
Xiancheng Xie ◽  
Yun Xiong ◽  
Philip S. Yu ◽  
Yangyong Zhu
Author(s):  
Yujie Chen ◽  
Tengfei Ma ◽  
Xixi Yang ◽  
Jianmin Wang ◽  
Bosheng Song ◽  
...  

Abstract

Motivation: Adverse drug–drug interactions (DDIs) are a crucial concern in drug research and a major cause of morbidity and mortality. Identifying potential DDIs is therefore essential for doctors, patients and society. Traditional machine learning models rely heavily on handcrafted features and generalize poorly. Recently, deep learning approaches that automatically learn drug features from the molecular graph or a drug-related network have improved the ability of computational models to predict unknown DDIs. However, previous work relied on large labeled datasets and either considered only the structure or sequence information of drugs, ignoring the relations and topological information between drugs and other biomedical objects (e.g. gene, disease and pathway), or considered a knowledge graph (KG) while ignoring the drug molecular structure.

Results: To exploit the joint effect of drug molecular structure and the semantic information of drugs in a knowledge graph for DDI prediction, we propose a multi-scale feature fusion deep learning model named MUFFIN. MUFFIN jointly learns the drug representation from both the drug's own structure information and a KG with rich biomedical information. In MUFFIN, we designed a bi-level cross strategy that includes cross- and scalar-level components to fuse multi-modal features. By crossing the features learned from a large-scale KG and the drug molecular graph, MUFFIN can alleviate the restriction that limited labeled data places on deep learning models. We evaluated our approach on three datasets and three tasks: binary-class, multi-class and multi-label DDI prediction. The results showed that MUFFIN outperformed other state-of-the-art baselines.

Availability and implementation: The source code and data are available at https://github.com/xzenglab/MUFFIN.


Author(s):  
Shiyao Wang ◽  
Minlie Huang ◽  
Zhidong Deng

Text classification is a fundamental problem in natural language processing. As a popular deep learning model, the convolutional neural network (CNN) has demonstrated great success in this task. However, most existing CNN models apply convolution filters of a fixed window size and are therefore unable to learn variable n-gram features flexibly. In this paper, we present a densely connected CNN with multi-scale feature attention for text classification. The dense connections build short-cut paths between upstream and downstream convolutional blocks, which enable the model to compose features of larger scale from those of smaller scale and thus produce variable n-gram features. Furthermore, a multi-scale feature attention mechanism is developed to adaptively select multi-scale features for classification. Extensive experiments demonstrate that our model obtains competitive performance against state-of-the-art baselines on five benchmark datasets. Attention visualization further reveals the model's ability to select proper n-gram features for text classification.
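
The idea of adaptively selecting among scales can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the input layout, and the scoring rule (summing each scale's feature vector into a scalar) are assumptions made for illustration, and a trained model would normally pass such scores through learned layers first. For each position, the sketch scores every scale, applies a softmax across scales, and returns the weighted sum of the per-scale features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_scale_attention(scale_features):
    """Fuse per-scale features with attention weights.

    scale_features[l][t] is the feature vector at position t produced by
    scale (block) l. All scales are assumed to share the same sequence
    length and feature dimension.
    """
    num_scales = len(scale_features)
    seq_len = len(scale_features[0])
    dim = len(scale_features[0][0])

    fused, weights = [], []
    for t in range(seq_len):
        # Assumed scoring rule: one scalar per scale, the sum of its features.
        scores = [sum(scale_features[l][t]) for l in range(num_scales)]
        alpha = softmax(scores)
        # Weighted sum of the scale-specific vectors at this position.
        fused.append([
            sum(alpha[l] * scale_features[l][t][d] for l in range(num_scales))
            for d in range(dim)
        ])
        weights.append(alpha)
    return fused, weights
```

The returned `weights` hold one distribution over scales per position, which is the kind of quantity an attention visualization would display.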


Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 403
Author(s):  
Xun Zhang ◽  
Lanyan Yang ◽  
Bin Zhang ◽  
Ying Liu ◽  
Dong Jiang ◽  
...  

The problem of extracting meaningful data through graph analysis spans a range of fields, such as social networks, knowledge graphs, citation networks and the World Wide Web. As structured data become increasingly available, the importance of being able to effectively mine and learn from such data continues to grow. In this paper, we propose the multi-scale aggregation graph neural network based on feature similarity (MAGN), a novel graph neural network defined in the vertex domain. Our model provides a simple and general semi-supervised learning method for graph-structured data, in which only a very small part of the data is labeled as the training set. We first construct a similarity matrix by calculating the similarity of the original features between all adjacent node pairs, and then use the similarity matrix to generate a set of feature extractors that perform multi-scale feature propagation on graphs. The outputs of multi-scale feature propagation are finally aggregated with a mean-pooling operation. Our method aims to improve the model's representation ability via multi-scale neighborhood aggregation based on feature similarity. Extensive experimental evaluation on various open benchmarks shows the competitive performance of our method compared to a variety of popular architectures.
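
The pipeline described above (similarity matrix over adjacent pairs, multi-scale propagation, mean-pooling) can be sketched as follows. This is a hedged illustration, not the MAGN implementation: the choice of cosine similarity clipped at zero, the added self-loops, the row normalization, and the use of repeated multiplication by the similarity matrix as the "feature extractors" are all assumptions, as are the function names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(features, edges):
    """Row-normalized feature similarity, non-zero only for adjacent pairs."""
    n = len(features)
    sim = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        # Assumption: negative similarities are clipped to zero.
        s = max(0.0, cosine(features[i], features[j]))
        sim[i][j] = sim[j][i] = s
    for i in range(n):
        sim[i][i] = 1.0  # assumed self-loop so every row has positive mass
        total = sum(sim[i])
        sim[i] = [x / total for x in sim[i]]
    return sim


def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def multi_scale_propagate(features, edges, num_scales=3):
    """Propagate features over 1..num_scales hops and mean-pool the results."""
    sim = similarity_matrix(features, edges)
    outputs, hidden = [], features
    for _ in range(num_scales):
        hidden = matmul(sim, hidden)  # one more hop of similarity-weighted mixing
        outputs.append(hidden)
    n, d = len(features), len(features[0])
    # Mean-pooling across scales, element by element.
    return [
        [sum(out[i][j] for out in outputs) / num_scales for j in range(d)]
        for i in range(n)
    ]
```

In this sketch each additional multiplication widens the neighborhood by one hop, so averaging the intermediate results mixes information from several neighborhood sizes; a trainable model would add learned weights on top of these propagated features.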


2021 ◽  
Vol 32 (2) ◽  
Author(s):  
Mehrdad Sheoiby ◽  
Sadegh Aliakbarian ◽  
Saeed Anwar ◽  
Lars Petersson
