Integration of semantic-based bipartite graph representation and mutual refinement strategy for biomedical literature clustering

The automatic extraction of biomedical events from the scientific literature has drawn keen interest in recent years because it recognizes complex and semantically rich graphical interactions otherwise buried in text. However, very few works address learning embeddings or similarity metrics for event graphs. This gap leaves biological relations unlinked and prevents the application of machine learning techniques to promote discoveries. Taking advantage of recent deep graph kernel solutions and pre-trained language models, we propose Deep Divergence Event Graph Kernels (DDEGK), an unsupervised inductive method that maps events into low-dimensional vectors while preserving their structural and semantic similarities. Unlike most other systems, DDEGK operates at the graph level and does not require task-specific labels, feature engineering, or known correspondences between nodes. To this end, our solution compares events against a small set of anchor events, trains cross-graph attention networks to draw pairwise alignments (bolstering interpretability), and employs transformer-based models to encode continuous attributes. We conduct extensive experiments on nine biomedical datasets. We show that the learned event representations can be effectively employed in tasks such as graph classification, clustering, and visualization, and that they also facilitate downstream semantic textual similarity. Empirical results demonstrate that DDEGK significantly outperforms other state-of-the-art methods.
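
The idea of comparing each event against a small set of anchor events can be illustrated with a minimal sketch. It assumes a toy graph format and swaps the learned cross-graph attention divergence for a plain Jaccard distance over node and edge labels, so it only shows the shape of an anchor-based embedding (one score per anchor), not DDEGK itself. All names (`make_graph`, `toy_divergence`, `embed`) and the example events are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Toy event graph: a set of node labels and a set of labelled edges.
# This structure is an assumption for illustration, not DDEGK's data format.
Graph = Dict[str, Set]


def make_graph(nodes: List[str], edges: List[Tuple[str, str, str]]) -> Graph:
    return {"nodes": set(nodes), "edges": set(edges)}


def jaccard_distance(a: Set, b: Set) -> float:
    """Return 1 - |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def toy_divergence(target: Graph, anchor: Graph) -> float:
    """Stand-in for a learned divergence: mean of node and edge Jaccard distances."""
    node_term = jaccard_distance(target["nodes"], anchor["nodes"])
    edge_term = jaccard_distance(target["edges"], anchor["edges"])
    return 0.5 * (node_term + edge_term)


def embed(target: Graph, anchors: List[Graph]) -> List[float]:
    """Embed a graph as one divergence score per anchor graph."""
    return [toy_divergence(target, anchor) for anchor in anchors]


# Hypothetical events: a phosphorylation event and a regulation of it.
phosphorylation = make_graph(
    ["Phosphorylation", "Protein"],
    [("Phosphorylation", "Theme", "Protein")],
)
regulation = make_graph(
    ["Regulation", "Phosphorylation", "Protein"],
    [("Regulation", "Theme", "Phosphorylation"),
     ("Phosphorylation", "Theme", "Protein")],
)

anchors = [phosphorylation, regulation]
vector = embed(phosphorylation, anchors)
print(vector)
```

The embedding dimension equals the number of anchors, which is why a small anchor set yields low-dimensional vectors; no node correspondence between target and anchor is needed.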

Predicting Hospital Readmission Using Graph Representation Learning Based on Patient and Disease Bipartite Graph

Database Systems for Advanced Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-030-59416-9_23 ◽

2020 ◽

pp. 385-397

Author(s):

Zhiqi Liu ◽

Lizhen Cui ◽

Wei Guo ◽

Wei He ◽

Hui Li ◽

...

Keyword(s):

Bipartite Graph ◽

Hospital Readmission ◽

Representation Learning ◽

Graph Representation

TSI-GNN: Extending Graph Neural Networks to Handle Missing Data in Temporal Settings

Frontiers in Big Data ◽

10.3389/fdata.2021.693869 ◽

2021 ◽

Vol 4 ◽

Author(s):

David Gordon ◽

Panayiotis Petousis ◽

Henry Zheng ◽

Davina Zamanzadeh ◽

Alex A.T. Bui

Keyword(s):

Neural Networks ◽

Missing Data ◽

Bipartite Graph ◽

Message Passing ◽

Representation Learning ◽

Temporal Information ◽

Graph Representation ◽

Sequence Information ◽

Novel Approach ◽

Graph Neural Networks

We present a novel approach for imputing missing data that incorporates temporal information into bipartite graphs through an extension of graph representation learning. Missing data is abundant in several domains, particularly when observations are made over time. Most imputation methods make strong assumptions about the distribution of the data. While novel methods may relax some assumptions, they may not consider temporality. Moreover, when such methods are extended to handle time, they may not generalize without retraining. We propose using a joint bipartite graph approach to incorporate temporal sequence information. Specifically, the observation nodes and edges with temporal information are used in message passing to learn node and edge embeddings and to inform the imputation task. Our proposed method, temporal setting imputation using graph neural networks (TSI-GNN), captures sequence information that can then be used within an aggregation function of a graph neural network. To the best of our knowledge, this is the first effort to use a joint bipartite graph approach that captures sequence information to handle missing data. We use several benchmark datasets to test the performance of our method under a variety of conditions, comparing it to both classic and contemporary methods. We further provide insight into managing the size of the generated TSI-GNN model. Through our analysis, we show that incorporating temporal information into a bipartite graph improves the representation at the 30% and 60% missing rates, specifically when a nonlinear model is used for downstream prediction tasks on regularly sampled datasets, and that the method is competitive with existing temporal methods under different scenarios.
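
The bipartite view of a data table with missing entries can be sketched as follows. This is a minimal illustration under stated assumptions, not the TSI-GNN implementation: each row is treated as an observation node, each column as a feature node, every observed value becomes an edge, and the row's time step is attached to the edge. How TSI-GNN actually encodes time is not described here, so the `times` argument and the edge layout are assumptions, and the function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

# (observation index, feature index, observed value, time step)
Edge = Tuple[int, int, float, int]


def is_missing(value: Optional[float]) -> bool:
    """Treat None and NaN as missing entries."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def matrix_to_bipartite_edges(
    rows: Sequence[Sequence[Optional[float]]],
    times: Sequence[int],
) -> Tuple[List[Edge], List[Tuple[int, int]]]:
    """Split a data matrix into observed edges and missing (row, column) pairs.

    Observed entries become edges between an observation node and a feature
    node; missing entries are the pairs an imputation model would predict.
    """
    if len(rows) != len(times):
        raise ValueError("rows and times must have the same length")
    edges: List[Edge] = []
    missing: List[Tuple[int, int]] = []
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if is_missing(value):
                missing.append((i, j))
            else:
                edges.append((i, j, float(value), times[i]))
    return edges, missing


# Two observations of two features, taken at time steps 0 and 1.
data = [[1.0, None], [float("nan"), 2.0]]
edges, missing = matrix_to_bipartite_edges(data, times=[0, 1])
print(edges)
print(missing)
```

In this framing, imputation becomes edge-value prediction for the pairs listed in `missing`, using embeddings learned by message passing over the observed edges.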

Multiclass Visual Classifier Based on Bipartite Graph Representation of Decision Tables

Lecture Notes in Computer Science - Learning and Intelligent Optimization ◽

10.1007/978-3-642-13800-3_13 ◽

2010 ◽

pp. 169-183

Author(s):

Kazuya Haraguchi ◽

Seok-Hee Hong ◽

Hiroshi Nagamochi

Keyword(s):

Bipartite Graph ◽

Graph Representation ◽

Decision Tables

Socio-Ecological Network Structures from Process Graphs

10.1101/2020.04.15.042697 ◽

2020 ◽

Author(s):

Angelyn Lao ◽

Heriberto Cabezas ◽

Ákos Orosz ◽

Ferenc Friedler ◽

Raymond Tan

Keyword(s):

Bipartite Graph ◽

Solution Structure ◽

Graph Representation ◽

Small Scale ◽

Ecological Network ◽

Structure Generation ◽

Potential Efficacy ◽

Ecosystem Reconstruction ◽

Process Graphs ◽

Criticality Index

We propose a process graph (P-graph) approach to develop ecosystem networks from knowledge of the properties of the component species. Originally developed as a process engineering tool for designing industrial plants, the P-graph framework has key advantages over conventional ecological network analysis (ENA) techniques. A P-graph is a bipartite graph consisting of two types of nodes, which we propose to represent components of an ecosystem. Compartments within ecosystems (e.g., organism species) are represented by one class of nodes, while the roles or functions that they play relative to other compartments are represented by a second class of nodes. This bipartite graph representation enables a powerful, unambiguous representation of relationships among ecosystem compartments, which can come in tangible (e.g., mass flow in predation) or intangible form (e.g., symbiosis). For example, within a P-graph, the distinct roles of bees as pollinators for some plants and as prey for some animals can be explicitly represented, which would not otherwise be possible using conventional ENA. After discussing the mapping of ecosystems into P-graphs, we also discuss how this framework can be used to guide understanding of complex networks that exist in nature. Two component algorithms of P-graph, namely maximal structure generation (MSG) and solution structure generation (SSG), are shown to be particularly useful for ENA. This method can be used to determine (a) the effects of the loss of specific ecosystem compartments due to extinction, (b) the potential efficacy of ecosystem reconstruction efforts, and (c) the maximum sustainable exploitation of ecosystem services by humans. We illustrate the use of P-graph for the analysis of ecosystem compartment loss using a small-scale stylized case study, and further propose a new criticality index that can be easily derived from SSG results.
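
The two-node-class representation described above can be sketched in a few lines. This is only an illustration of the bipartite structure using the bee example; it does not implement MSG or SSG, and the loss check is a simplified one-step lookup rather than the paper's analysis. The class name, method names, and the compartments "flowering plants" and "bee-eating birds" are hypothetical.

```python
from typing import Dict, Iterable, List, Set


class RoleGraph:
    """Bipartite structure: compartment nodes on one side, role nodes on the other."""

    def __init__(self) -> None:
        # role -> compartments that perform it
        self.performers: Dict[str, Set[str]] = {}
        # role -> compartments that depend on it
        self.beneficiaries: Dict[str, Set[str]] = {}

    def add_role(self, role: str, performers: Iterable[str],
                 beneficiaries: Iterable[str]) -> None:
        self.performers[role] = set(performers)
        self.beneficiaries[role] = set(beneficiaries)

    def roles_of(self, compartment: str) -> List[str]:
        """List every role node connected to the given compartment as performer."""
        return sorted(role for role, who in self.performers.items()
                      if compartment in who)

    def directly_affected_by_loss(self, compartment: str) -> List[str]:
        """Compartments relying on a role that only the lost compartment performs."""
        affected: Set[str] = set()
        for role, who in self.performers.items():
            if who == {compartment}:  # no other compartment fills this role
                affected |= self.beneficiaries[role]
        return sorted(affected)


# Bees appear once as a compartment but connect to two distinct role nodes.
graph = RoleGraph()
graph.add_role("pollination", performers=["bees"],
               beneficiaries=["flowering plants"])
graph.add_role("prey", performers=["bees"],
               beneficiaries=["bee-eating birds"])

print(graph.roles_of("bees"))
print(graph.directly_affected_by_loss("bees"))
```

Keeping roles as their own nodes is what lets one compartment carry several distinct relationships without merging them into a single edge type.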

The Center for International Biomedical Communications Research

Methods of Information in Medicine ◽

10.1055/s-0038-1636285 ◽

1966 ◽

Vol 05 (03) ◽

pp. 142-146

Author(s):

A. Kent ◽

P. J. Vinken

Keyword(s):

Information Needs ◽

English Language ◽

Medical Information ◽

Biomedical Literature ◽

Information Storage ◽

Control Mechanisms ◽

University Of Pittsburgh ◽

The World ◽

The University ◽

Excerpta Medica Foundation

A joint center has been established by the University of Pittsburgh and the Excerpta Medica Foundation. The basic objective of the Center is to seek ways in which the health sciences community may achieve increasingly convenient and economical access to scientific findings. The research center will make use of facilities and resources of both participating institutions. Cooperating from the University of Pittsburgh will be the School of Medicine, the Computation and Data Processing Center, and the Knowledge Availability Systems (KAS) Center. The KAS Center is an interdisciplinary organization engaging in research, operations, and teaching in the information sciences. Excerpta Medica Foundation, which is the largest international medical abstracting service in the world, with offices in Amsterdam, New York, London, Milan, Tokyo and Buenos Aires, will draw on its permanent medical staff of 54 specialists in charge of the 35 abstracting journals and other reference works prepared and published by the Foundation, the 700 eminent clinicians and researchers represented on its International Editorial Boards, and the 6,000 physicians who participate in its abstracting programs throughout the world. Excerpta Medica will also make available to the Center its long experience in the field, as well as its extensive resources of medical information accumulated during the Foundation’s twenty years of existence. These consist of over 1,300,000 English-language abstracts of the world’s biomedical literature, indexes to its abstracting journals, and the microfilm library in which the complete original texts of all 3,000 primary biomedical journals monitored by Excerpta Medica in Amsterdam have been stored since 1960. The objectives of the program of the combined Center include: (1) establishing a firm base of user relevance data; (2) developing improved vocabulary control mechanisms; (3) developing means of determining confidence limits of vocabulary control mechanisms in terms of user relevance data; (4) developing and field testing new or improved media for providing medical literature to users; (5) developing methods for determining the relationship between learning and relevance in medical information storage and retrieval systems; and (6) exploring automatic methods for retrospective searching of the specialized indexes of Excerpta Medica. The priority projects to be undertaken by the Center are (1) the investigation of the information needs of medical scientists, and (2) the development of a highly detailed Master List of Biomedical Indexing Terms. Excerpta Medica has already been at work on the latter project for several years.
