A Novel Sequence-to-Subgraph Framework for Diagnosis Classification

Author(s):  
Jun Chen ◽  
Quan Yuan ◽  
Chao Lu ◽  
Haifeng Huang

Text-based diagnosis classification is a critical problem in AI-enabled healthcare studies: it assists clinicians in making correct decisions and lowers the rate of diagnostic errors. Previous studies follow the routine of sequence-based deep learning models from the NLP literature to process clinical notes. However, recent studies find that structural information in clinical content is important and greatly affects predictions. In this paper, a novel sequence-to-subgraph framework is introduced to process clinical texts for classification, which changes the paradigm of managing texts. Moreover, a new classification model under the framework is proposed that incorporates a subgraph convolutional network and a hierarchical diagnostic attentive network to extract the layered structural features of clinical texts. Evaluation on real-world English and Chinese datasets shows that the proposed method outperforms state-of-the-art deep learning-based diagnosis classification models.

2020 ◽  
Author(s):  
Aman Gupta ◽  
Yadul Raghav

The problem of predicting links has gained much attention in recent years due to its wide application in domains such as sociology, network analysis, and information science. Many methods have been proposed for link prediction, such as RA, AA, and CCLP. These methods require hand-crafted structural features to calculate the similarity score between a pair of nodes in a network. Some methods use local structural information, while others use global information about the graph. These methods do not indicate which properties are better than others. After an in-depth analysis of these methods, we find that one way to overcome this problem is to consider both network structure and node attribute information to capture discriminative features for link prediction. We propose a deep learning Autoencoder-based Link Prediction (ALP) architecture for the latent representation of a graph, unified with non-negative matrix factorization to automatically determine the underlying roles in a network and then assign a mixed membership of these roles to each node. The idea is to use these roles as a feature vector for the link prediction task. Cosine similarity is then applied to these features to compute the pairwise similarity score between nodes. We present the performance of the algorithm on real-world datasets, where it gives competitive results compared with other algorithms.
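The final scoring step described above (cosine similarity between per-node role-membership vectors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the node names, the three-role vectors, and the helper `rank_candidate_links` are hypothetical, and the vectors stand in for whatever the autoencoder and matrix factorization stages would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def rank_candidate_links(features, candidates):
    """Score each candidate node pair and sort from most to least similar."""
    scored = [
        (cosine_similarity(features[a], features[b]), a, b)
        for a, b in candidates
    ]
    return sorted(scored, reverse=True)


# Hypothetical mixed-membership role vectors for three nodes (made-up values).
role_features = {
    "a": [0.9, 0.1, 0.0],
    "b": [0.8, 0.2, 0.0],
    "c": [0.0, 0.1, 0.9],
}
ranked = rank_candidate_links(
    role_features, [("a", "b"), ("a", "c"), ("b", "c")]
)
for score, left, right in ranked:
    print(f"{left}-{right}: {score:.3f}")
```

Pairs whose role mixtures point in the same direction score near 1 and would be predicted as likely links; the threshold or top-k cut-off is left to the evaluation protocol.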


2020 ◽  
Vol 21 (S16) ◽  
Author(s):  
Guangjie Zhou ◽  
Jun Wang ◽  
Xiangliang Zhang ◽  
Maozu Guo ◽  
Guoxian Yu

Background: Maize (Zea mays ssp. mays L.) is the most widely grown and highest-yielding crop in the world, as well as an important model organism for fundamental research on gene function. The functions of maize proteins are annotated using the Gene Ontology (GO), which has more than 40,000 terms organized in a directed acyclic graph (DAG). Accurately annotating a maize protein with the relevant GO terms from such a large number of candidates is a major challenge. Some deep learning models have been proposed to predict protein function, but their effectiveness is unsatisfactory. One major reason is that they make inadequate use of the GO hierarchy. Results: To use the knowledge encoded in the GO hierarchy, we propose a deep Graph Convolutional Network (GCN) based model (DeepGOA) to predict GO annotations of proteins. DeepGOA first quantifies the correlations (or edges) between GO terms and updates the edge weights of the DAG by leveraging GO annotations and the hierarchy, then learns the semantic representations and latent inter-relations of GO terms by applying a GCN to the updated DAG. Meanwhile, a Convolutional Neural Network (CNN) is used to learn the feature representation of amino acid sequences with respect to the semantic representations. DeepGOA then computes the dot product of the two representations, which enables the whole network to be trained end-to-end. Extensive experiments show that DeepGOA effectively integrates GO structural information and amino acid information, and annotates proteins accurately. Conclusions: Experiments on the Maize PH207 inbred line and a human protein sequence dataset show that DeepGOA outperforms state-of-the-art deep learning-based methods. The ablation study shows that the GCN can exploit the knowledge in GO and boost performance. Code and datasets are available at http://mlda.swu.edu.cn/codes.php?name=DeepGOA.
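The two ingredients named in this abstract, graph convolution over a weighted term graph and a dot product between protein features and term embeddings, can be sketched in a few lines. This is an illustrative simplification under stated assumptions, not the DeepGOA code: the layer uses row-normalized propagation with self-loops (one common GCN variant), edge weights are assumed non-negative, the sigmoid on the scores is an assumption, and all function names and toy matrices are hypothetical.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weight):
    """One propagation step: ReLU(normalize(A + I) @ H @ W)."""
    n = len(adj)
    # Add self-loops so each term keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Row-normalize so each term averages over itself and its neighbors.
    a_hat = [[v / sum(row) for v in row] for row in a_hat]
    hidden = matmul(matmul(a_hat, feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def score_terms(protein_vec, term_embeddings):
    """Sigmoid of the dot product between one protein vector and each term embedding."""
    scores = []
    for emb in term_embeddings:
        dot = sum(p * e for p, e in zip(protein_vec, emb))
        scores.append(1.0 / (1.0 + math.exp(-dot)))
    return scores


# Toy graph with two terms and one weighted edge from term 0 to term 1.
adjacency = [[0.0, 1.0], [0.0, 0.0]]
term_features = [[1.0, 0.0], [0.0, 1.0]]
layer_weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, so propagation is easy to inspect

term_embeddings = gcn_layer(adjacency, term_features, layer_weight)
print(term_embeddings)
print(score_terms([2.0, -1.0], term_embeddings))
```

Because the score is a differentiable function of both the sequence features and the term embeddings, a loss on the scores can drive gradients into both branches, which is what end-to-end training relies on.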


2020 ◽  
Vol 34 (04) ◽  
pp. 5387-5394
Author(s):  
Hao Peng ◽  
Jianxin Li ◽  
Qiran Gong ◽  
Yuanxin Ning ◽  
Senzhang Wang ◽  
...  

Graph classification is critically important to many real-world applications that involve graph data, such as chemical drug analysis and social network mining. Traditional methods usually require feature engineering to extract graph features that help discriminate between graphs of different classes. Although deep learning-based graph embedding approaches have recently been proposed to learn graph features automatically, they mostly use a few vertex arrangements extracted from the graph for feature learning, which may lose some structural information. In this work, we present a novel motif-based attentional graph convolution neural network for graph classification, which can learn more discriminative and richer graph features. Specifically, a motif-matching guided subgraph normalization method is developed to better preserve spatial information. A novel subgraph-level self-attention network is also proposed to capture the different impacts or weights of different subgraphs. Experimental results on both bioinformatics and social network datasets show that the proposed models significantly improve graph classification performance over both traditional graph kernel methods and recent deep learning approaches.
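The idea of weighting subgraphs differently before forming a graph-level representation can be illustrated with attention-weighted pooling. The sketch below is a deliberately reduced stand-in, assuming a single query vector in place of the paper's self-attention network; `attention_pool`, the query, and the toy embeddings are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(subgraph_embeddings, query):
    """Weight each subgraph embedding by softmax(embedding . query) and sum them."""
    scores = [sum(e * q for e, q in zip(emb, query)) for emb in subgraph_embeddings]
    weights = softmax(scores)
    dim = len(subgraph_embeddings[0])
    pooled = [
        sum(w * emb[d] for w, emb in zip(weights, subgraph_embeddings))
        for d in range(dim)
    ]
    return pooled, weights


# Two toy subgraph embeddings; the query (assumed learned elsewhere) favors the first.
pooled, weights = attention_pool([[2.0, 0.0], [0.0, 0.0]], [1.0, 0.0])
print(weights, pooled)
```

The pooled vector would then feed a classifier; subgraphs with higher weights contribute more to the prediction.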


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Aparna R. Rajpurkar ◽  
Leslie J. Mateo ◽  
Sedona E. Murphy ◽  
Alistair N. Boettiger

Chromatin architecture plays an important role in gene regulation. Recent advances in super-resolution microscopy have made it possible to measure chromatin 3D structure and transcription in thousands of single cells. However, leveraging these complex data sets with a computationally unbiased method has been challenging. Here, we present a deep learning-based approach to better understand to what degree chromatin structure relates to transcriptional state of individual cells. Furthermore, we explore methods to “unpack the black box” to determine in an unbiased manner which structural features of chromatin regulation are most important for gene expression state. We apply this approach to an Optical Reconstruction of Chromatin Architecture dataset of the Bithorax gene cluster in Drosophila and show it outperforms previous contact-focused methods in predicting expression state from 3D structure. We find the structural information is distributed across the domain, overlapping and extending beyond domains identified by prior genetic analyses. Individual enhancer-promoter interactions are a minor contributor to predictions of activity.


2018 ◽  
Author(s):  
Shreshth Gandhi ◽  
Leo J. Lee ◽  
Andrew Delong ◽  
David Duvenaud ◽  
Brendan J. Frey

Motivation: Determining RNA-binding protein (RBP) binding specificity is crucial for understanding many cellular processes and genetic disorders. RBP binding is known to be affected by both the sequence and structure of RNAs. Deep learning can be used to learn generalizable representations of raw data and has improved the state of the art in several fields such as image classification, speech recognition and even genomics. Previous work on RBP binding has either used shallow models that combine sequence and structure or deep models that use only the sequence. Here we combine both abilities by augmenting and refining the original Deepbind architecture to capture structural information and obtain significantly better performance. Results: We propose two deep architectures, one a lightweight convolutional network for transcriptome-wide inference and another a Long Short-Term Memory (LSTM) network that is suitable for small batches of data. We incorporate computationally predicted secondary structure features as input to our models and show their effectiveness in boosting prediction performance. Our models achieved significantly higher correlations on held-out in-vitro test data compared to previous approaches, and generalise well to in-vivo CLIP-SEQ data, achieving higher median AUCs than other approaches. We analysed the output from our model for VTS1 and CPO and provided intuition into its working. Our models confirmed known secondary structure preferences for some proteins and found new ones where secondary structure might play a role. We also demonstrated the strengths of our model compared to other approaches, such as the ability to combine information from long distances along the input. Availability: Software and models are available at https://github.com/shreshthgandhi/[email protected], [email protected]


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257013
Author(s):  
Xiaolong Hu ◽  
Liejun Wang ◽  
Shuli Cheng ◽  
Yongming Li

The cardinal symptoms of some ophthalmic diseases, such as retinal vein occlusion and diabetic retinopathy, can be observed through abnormal retinal blood vessels. Advanced deep learning models that automatically obtain morphological and structural information about blood vessels are conducive to the early treatment and active prevention of ophthalmic diseases. In our work, we propose a hierarchical dilation convolutional network (HDC-Net) to extract retinal vessels in a pixel-to-pixel manner. It uses the hierarchical dilation convolution (HDC) module to capture the fragile retinal blood vessels usually neglected by other methods. An improved residual dual efficient channel attention (RDECA) module infers more delicate channel information to reinforce the discriminative capability of the model. The structured DropBlock helps HDC-Net mitigate network overfitting effectively. From a holistic perspective, the segmentation results obtained by HDC-Net are superior to those of other deep learning methods on three acknowledged datasets (DRIVE, CHASE-DB1, STARE). The sensitivity, specificity, accuracy, F1-score and AUC score are {0.8252, 0.9829, 0.9692, 0.8239, 0.9871}, {0.8227, 0.9853, 0.9745, 0.8113, 0.9884}, and {0.8369, 0.9866, 0.9751, 0.8385, 0.9913}, respectively. It surpasses most other advanced retinal vessel segmentation models. Qualitative and quantitative analysis demonstrates that HDC-Net can perform retinal vessel segmentation efficiently and accurately.
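For readers unfamiliar with the reported metrics, the sketch below shows how sensitivity, specificity, accuracy, and F1-score follow from pixel-wise confusion counts on a binary vessel mask. It uses the standard definitions, not the authors' evaluation code; the masks are made-up six-pixel examples, the function names are hypothetical, and AUC is omitted because it needs per-pixel probabilities rather than hard labels.

```python
def confusion_counts(predicted, truth):
    """Count true/false positives/negatives over flattened binary masks."""
    tp = fp = tn = fn = 0
    for p, t in zip(predicted, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def segmentation_metrics(predicted, truth):
    """Standard binary-segmentation metrics; a zero denominator yields 0.0."""
    tp, fp, tn, fn = confusion_counts(predicted, truth)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),       # vessel pixels correctly found
        "specificity": ratio(tn, tn + fp),       # background pixels correctly rejected
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),   # harmonic mean of precision and recall
    }


# Toy flattened masks: 1 marks a vessel pixel, 0 marks background.
predicted_mask = [1, 1, 0, 0, 1, 0]
truth_mask = [1, 0, 0, 0, 1, 1]
metrics = segmentation_metrics(predicted_mask, truth_mask)
print(metrics)
```

Because vessel pixels are a small minority of a retinal image, accuracy and specificity tend to be high for any reasonable model, which is why sensitivity and F1-score are usually the more discriminating columns in such comparisons.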


Author(s):  
R.M. Glaeser ◽  
S.B. Hayward

Highly ordered or crystalline biological macromolecules become severely damaged and structurally disordered after a brief electron exposure. Evidence that damage and structural disorder are occurring is clearly given by the fading and eventual disappearance of the specimen's electron diffraction pattern. The fading and disappearance of sharp diffraction spots implies a corresponding disappearance of periodic structural features in the specimen. By the same token, there is a one-to-one correspondence between the disappearance of the crystalline diffraction pattern and the disappearance of reproducible structural information that can be observed in the images of identical unit cells of the object structure. The electron exposures that result in a significant decrease in the diffraction intensity will depend somewhat upon the resolution (Bragg spacing) involved, and can vary considerably with the chemical makeup and composition of the specimen material.


1987 ◽  
Vol 26 (01) ◽  
pp. 13-23 ◽  
Author(s):  
H. W. Gottinger

The purpose of this paper is to report on an expert system in design that screens for potential hazards from environmental chemicals on the basis of structure-activity relationships in the study of chemical carcinogenesis, particularly with respect to analyzing the current state of known structural information about chemical carcinogens and predicting the possible carcinogenicity of untested chemicals. The structure-activity tree serves as an index of known chemical structure features associated with carcinogenic activity. The basic units of the tree are the principal recognized classes of chemical carcinogens that are subdivided into subclasses known as nodes according to specific structural features that may reflect differences in carcinogenic potential among chemicals in the class. An analysis of a computerized data base of known carcinogens (knowledge base) is proposed using the structure-activity tree in order to test the validity of the tree as a classification scheme (inference engine).


2019 ◽  
Author(s):  
Zachary VanAernum ◽  
Florian Busch ◽  
Benjamin J. Jones ◽  
Mengxuan Jia ◽  
Zibo Chen ◽  
...  

It is important to assess the identity and purity of proteins and protein complexes during and after protein purification to ensure that samples are of sufficient quality for further biochemical and structural characterization, as well as for use in consumer products, chemical processes, and therapeutics. Native mass spectrometry (nMS) has become an important tool in protein analysis due to its ability to retain non-covalent interactions during measurements, making it possible to obtain protein structural information with high sensitivity and at high speed. Interferences from the presence of non-volatiles are typically alleviated by offline buffer exchange, which is time-consuming and difficult to automate. We provide a protocol for rapid online buffer exchange (OBE) nMS to directly screen structural features of pre-purified proteins, protein complexes, or clarified cell lysates. Information obtained by OBE nMS can be used for fast (<5 min) quality control and can further guide protein expression and purification optimization.

