Integration of semantic-based bipartite graph representation and mutual refinement strategy for biomedical literature clustering

Author(s):  
Illhoi Yoo ◽  
Xiaohua Hu ◽  
Il-Yeol Song
Author(s):  
Jie Cheng ◽  
Lu Lian ◽  
Zichen Xu ◽  
Dan Wu ◽  
Haoyang Zhu ◽  
...  

Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 3
Author(s):  
Giacomo Frisoni ◽  
Gianluca Moro ◽  
Giulio Carlassare ◽  
Antonella Carbonaro

The automatic extraction of biomedical events from the scientific literature has drawn keen interest in the last several years, recognizing complex and semantically rich graphical interactions otherwise buried in texts. However, very few works revolve around learning embeddings or similarity metrics for event graphs. This gap leaves biological relations unlinked and prevents the application of machine learning techniques to promote discoveries. Taking advantage of recent deep graph kernel solutions and pre-trained language models, we propose Deep Divergence Event Graph Kernels (DDEGK), an unsupervised inductive method to map events into low-dimensional vectors, preserving their structural and semantic similarities. Unlike most other systems, DDEGK operates at a graph level and does not require task-specific labels, feature engineering, or known correspondences between nodes. To this end, our solution compares events against a small set of anchor ones, trains cross-graph attention networks for drawing pairwise alignments (bolstering interpretability), and employs transformer-based models to encode continuous attributes. Extensive experiments have been done on nine biomedical datasets. We show that our learned event representations can be effectively employed in tasks such as graph classification, clustering, and visualization, also facilitating downstream semantic textual similarity. Empirical results demonstrate that DDEGK significantly outperforms other state-of-the-art methods.
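The anchor-based idea in the abstract above (describing each event graph by how it compares to a small set of anchor graphs) can be illustrated with a deliberately simplified sketch. This is not DDEGK: the learned cross-graph attention divergence is swapped for a plain Jaccard distance over labeled edges, and the function names and toy triples are hypothetical.

```python
# Simplified anchor-based graph embedding. Assumption: a graph is a list of
# (source_label, relation, target_label) triples, and the divergence is a
# Jaccard distance standing in for the learned divergence in the abstract.

def jaccard_distance(graph_a, graph_b):
    """Return 1 - |A ∩ B| / |A ∪ B| over the two graphs' edge sets."""
    a, b = frozenset(graph_a), frozenset(graph_b)
    union = a | b
    if not union:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - len(a & b) / len(union)

def embed(graph, anchors):
    """Map a graph to one coordinate per anchor graph."""
    return [jaccard_distance(graph, anchor) for anchor in anchors]

# Hypothetical toy event graphs.
anchors = [
    [("A", "Theme", "B")],
    [("C", "Cause", "D")],
]
event = [("A", "Theme", "B"), ("C", "Cause", "D")]
vector = embed(event, anchors)
```

The resulting vector has one dimension per anchor, so the embedding size is fixed by the number of anchors rather than by the size of any individual graph.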


2021 ◽  
Vol 4 ◽  
Author(s):  
David Gordon ◽  
Panayiotis Petousis ◽  
Henry Zheng ◽  
Davina Zamanzadeh ◽  
Alex A.T. Bui

We present a novel approach for imputing missing data that incorporates temporal information into bipartite graphs through an extension of graph representation learning. Missing data is abundant in several domains, particularly when observations are made over time. Most imputation methods make strong assumptions about the distribution of the data. While novel methods may relax some assumptions, they may not consider temporality. Moreover, when such methods are extended to handle time, they may not generalize without retraining. We propose using a joint bipartite graph approach to incorporate temporal sequence information. Specifically, the observation nodes and edges with temporal information are used in message passing to learn node and edge embeddings and to inform the imputation task. Our proposed method, temporal setting imputation using graph neural networks (TSI-GNN), captures sequence information that can then be used within an aggregation function of a graph neural network. To the best of our knowledge, this is the first effort to use a joint bipartite graph approach that captures sequence information to handle missing data. We use several benchmark datasets to test the performance of our method under a variety of conditions, comparing it to both classic and contemporary methods. We further provide insight into managing the size of the generated TSI-GNN model. Through our analysis, we show that incorporating temporal information into a bipartite graph improves the representation at the 30% and 60% missing rates, specifically when using a nonlinear model for downstream prediction tasks in regularly sampled datasets, and is competitive with existing temporal methods under different scenarios.
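As a rough illustration of the bipartite view described above, the sketch below turns a small table with missing values into observation-to-feature edges. It covers only graph construction, not message passing or imputation, and the edge layout (value plus time step) and the name `build_bipartite_edges` are assumptions for illustration, not the TSI-GNN design.

```python
import math

def build_bipartite_edges(rows):
    """Split a table into observed and missing observation-feature pairs.

    rows: list of (time_step, values), where values is a list of floats
    and NaN marks a missing entry. Observation i and feature j are the
    two node types; each observed cell becomes an edge (i, j, value, t).
    """
    observed, missing = [], []
    for obs_idx, (time_step, values) in enumerate(rows):
        for feat_idx, value in enumerate(values):
            if isinstance(value, float) and math.isnan(value):
                # Cells to impute: the edges a model would have to predict.
                missing.append((obs_idx, feat_idx))
            else:
                observed.append((obs_idx, feat_idx, value, time_step))
    return observed, missing

# Hypothetical two-step series with two features.
rows = [(0, [1.0, float("nan")]), (1, [float("nan"), 2.5])]
observed, missing = build_bipartite_edges(rows)
```

Under this layout, imputation becomes edge-value prediction on the `missing` pairs, with the time step carried on each observed edge.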


2020 ◽  
Author(s):  
Angelyn Lao ◽  
Heriberto Cabezas ◽  
Ákos Orosz ◽  
Ferenc Friedler ◽  
Raymond Tan

We propose a process graph (P-graph) approach to develop ecosystem networks from knowledge of the properties of the component species. Originally developed as a process engineering tool for designing industrial plants, the P-graph framework has key advantages over conventional ecological network analysis (ENA) techniques. A P-graph is a bipartite graph consisting of two types of nodes, which we propose to represent components of an ecosystem. Compartments within ecosystems (e.g., organism species) are represented by one class of nodes, while the roles or functions that they play relative to other compartments are represented by a second class of nodes. This bipartite graph representation enables a powerful, unambiguous representation of relationships among ecosystem compartments, which can come in tangible (e.g., mass flow in predation) or intangible form (e.g., symbiosis). For example, within a P-graph, the distinct roles of bees as pollinators for some plants and as prey for some animals can be explicitly represented, which would not otherwise be possible using conventional ENA. After a discussion of the mapping of ecosystems into P-graphs, we also discuss how this framework can be used to guide understanding of complex networks that exist in nature. Two component algorithms of P-graph, namely maximal structure generation (MSG) and solution structure generation (SSG), are shown to be particularly useful for ENA. This method can be used to determine (a) the effects of loss of specific ecosystem compartments due to extinction, (b) the potential efficacy of ecosystem reconstruction efforts, and (c) the maximum sustainable exploitation of ecosystem services by humans. We illustrate the use of P-graph for the analysis of ecosystem compartment loss using a small-scale stylized case study, and further propose a new criticality index that can be easily derived from SSG results.
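The two node classes described above can be sketched with a toy structure in which each role node lists the compartments it needs and the compartments it supports. The ecosystem, the names, and the simple reachability check are hypothetical; this is neither MSG nor SSG, and it does not compute the criticality index mentioned in the abstract.

```python
# Toy P-graph-style bipartite structure. Assumption: a role can operate
# only if all of its input compartments are present.
ROLES = {
    # role name: (input compartments, output compartments)
    "pollination": ({"bee", "flower"}, {"fruit"}),
    "bee_predation": ({"bee"}, {"bird"}),
    "frugivory": ({"fruit"}, {"bat"}),
}

def bipartite_edges(roles):
    """List directed edges compartment -> role and role -> compartment."""
    edges = []
    for role, (inputs, outputs) in roles.items():
        edges += [(c, role) for c in sorted(inputs)]
        edges += [(role, c) for c in sorted(outputs)]
    return edges

def roles_of(compartment, roles):
    """Roles in which a compartment takes part as an input."""
    return {r for r, (inputs, _) in roles.items() if compartment in inputs}

def reachable(base, roles):
    """Compartments sustained when only the `base` compartments are given."""
    available = set(base)
    changed = True
    while changed:
        changed = False
        for inputs, outputs in roles.values():
            if inputs <= available and not outputs <= available:
                available |= outputs
                changed = True
    return available

full = reachable({"bee", "flower"}, ROLES)
without_bees = reachable({"flower"}, ROLES)
lost = full - without_bees - {"bee"}
```

Here the bee appears in two distinct role nodes (pollinator and prey), and removing it from the base set shows which other compartments lose their support in this toy network.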


1966 ◽  
Vol 05 (03) ◽  
pp. 142-146
Author(s):  
A. Kent ◽  
P. J. Vinken

A joint center has been established by the University of Pittsburgh and the Excerpta Medica Foundation. The basic objective of the Center is to seek ways in which the health sciences community may achieve increasingly convenient and economical access to scientific findings. The research center will make use of facilities and resources of both participating institutions. Cooperating from the University of Pittsburgh will be the School of Medicine, the Computation and Data Processing Center, and the Knowledge Availability Systems (KAS) Center. The KAS Center is an interdisciplinary organization engaging in research, operations, and teaching in the information sciences. Excerpta Medica Foundation, which is the largest international medical abstracting service in the world, with offices in Amsterdam, New York, London, Milan, Tokyo and Buenos Aires, will draw on its permanent medical staff of 54 specialists in charge of the 35 abstracting journals and other reference works prepared and published by the Foundation, the 700 eminent clinicians and researchers represented on its International Editorial Boards, and the 6,000 physicians who participate in its abstracting programs throughout the world. Excerpta Medica will also make available to the Center its long experience in the field, as well as its extensive resources of medical information accumulated during the Foundation’s twenty years of existence.
These consist of over 1,300,000 English-language abstracts of the world’s biomedical literature, indexes to its abstracting journals, and the microfilm library in which the complete original texts of all 3,000 primary biomedical journals monitored by Excerpta Medica in Amsterdam have been stored since 1960. The objectives of the program of the combined Center include: (1) establishing a firm base of user relevance data; (2) developing improved vocabulary control mechanisms; (3) developing means of determining confidence limits of vocabulary control mechanisms in terms of user relevance data; (4) developing and field-testing new or improved media for providing medical literature to users; (5) developing methods for determining the relationship between learning and relevance in medical information storage and retrieval systems; and (6) exploring automatic methods for retrospective searching of the specialized indexes of Excerpta Medica. The priority projects to be undertaken by the Center are (1) the investigation of the information needs of medical scientists, and (2) the development of a highly detailed Master List of Biomedical Indexing Terms. Excerpta Medica has already been at work on the latter project for several years.

