scholarly journals Disease ontologies for knowledge graphs

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Natalja Kurbatova ◽  
Rowan Swiers

Abstract Background Data integration to build a biomedical knowledge graph is a challenging task. There are multiple disease ontologies used in data sources and publications, each having its hierarchy. A common task is to map between ontologies, find disease clusters and finally build a representation of the chosen disease area. There is a shortage of published resources and tools to facilitate interactive, efficient and flexible cross-referencing and analysis of multiple disease ontologies commonly found in data sources and research. Results Our results are represented as a knowledge graph solution that uses disease ontology cross-references and facilitates switching between ontology hierarchies for data integration and other tasks. Conclusions Grakn core with pre-installed “Disease ontologies for knowledge graphs” facilitates the biomedical knowledge graph build and provides an elegant solution for the multiple disease ontologies problem.

2019 ◽  
Vol 1 (3) ◽  
pp. 201-223 ◽  
Author(s):  
Guohui Xiao ◽  
Linfang Ding ◽  
Benjamin Cogrel ◽  
Diego Calvanese

In this paper, we present the virtual knowledge graph (VKG) paradigm for data integration and access, also known in the literature as Ontology-based Data Access. Instead of structuring the integration layer as a collection of relational tables, the VKG paradigm replaces the rigid structure of tables with the flexibility of graphs that are kept virtual and embed domain knowledge. We explain the main notions of this paradigm, its tooling ecosystem and significant use cases in a wide range of applications. Finally, we discuss future research directions.


2021 ◽  
Author(s):  
Florin Ratajczak ◽  
Mitchell Joblin ◽  
Martin Ringsquandl ◽  
Marcel Hildebrandt

Abstract Background Drug repurposing aims at finding new targets for already developed drugs. It becomes more relevant as the cost of discovering new drugs steadily increases. To find new potential targets for a drug, an abundance of methods and existing biomedical knowledge from different domains can be leveraged. Recently, knowledge graphs have emerged in the biomedical domain that integrate information about genes, drugs, diseases and other biological domains. Knowledge graphs can be used to predict new connections between compounds and diseases, leveraging the interconnected biomedical data around them. While real world use cases such as drug repurposing are only interested in one specific relation type, widely used knowledge graph embedding models simultaneously optimize over all relation types in the graph. This can lead the models to underfit the data that is most relevant for the desired relation type. We propose a method that leverages domain knowledge in the form of metapaths and use them to filter two biomedical knowledge graphs (Hetionet and DRKG) for the purpose of improving performance on the prediction task of drug repurposing while simultaneously increasing computational efficiency. Results We find that our method reduces the number of entities by 60% on Hetionet and 26% on DRKG, while leading to an improvement in prediction performance of up to 40.8% on Hetionet and 12.4% on DRKG, with an average improvement of 17.5% on Hetionet and 5.1% on DRKG. Additionally, prioritization of antiviral compounds for SARS CoV-2 improves after task-driven filtering is applied. Conclusion Knowledge graphs contain facts that are counter productive for specific tasks, in our case drug repurposing. We also demonstrate that these facts can be removed, resulting in an improved performance in that task and a more efficient learning process.


2019 ◽  
Author(s):  
David N. Nicholson ◽  
Daniel S. Himmelstein ◽  
Casey S. Greene

AbstractKnowledge graphs support multiple research efforts by providing contextual information for biomedical entities, constructing networks, and supporting the interpretation of high-throughput analyses. These databases are populated via some form of manual curation, which is difficult to scale in the context of an increasing publication rate. Data programming is a paradigm that circumvents this arduous manual process by combining databases with simple rules and heuristics written as label functions, which are programs designed to automatically annotate textual data. Unfortunately, writing a useful label function requires substantial error analysis and is a nontrivial task that takes multiple days per function. This makes populating a knowledge graph with multiple nodes and edge types practically infeasible. We sought to accelerate the label function creation process by evaluating the extent to which label functions could be re-used across multiple edge types. We used a subset of an existing knowledge graph centered on disease, compound, and gene entities to evaluate label function re-use. We determined the best label function combination by comparing a baseline database-only model with the same model but added edge-specific or edge-mismatch label functions. We confirmed that adding additional edge-specific rather than edge-mismatch label functions often improves text annotation and shows that this approach can incorporate novel edges into our source knowledge graph. We expect that continued development of this strategy has the potential to swiftly populate knowledge graphs with new discoveries, ensuring that these resources include cutting-edge results.


Algorithms ◽  
2021 ◽  
Vol 14 (8) ◽  
pp. 241
Author(s):  
Lan Huang ◽  
Yuanwei Zhao ◽  
Bo Wang ◽  
Dongxu Zhang ◽  
Rui Zhang ◽  
...  

Knowledge graph-based data integration is a practical methodology for heterogeneous legacy database-integrated service construction. However, it is neither efficient nor economical to build a new cross-domain knowledge graph on top of the schemas of each legacy database for the specific integration application rather than reusing the existing high-quality knowledge graphs. Consequently, a question arises as to whether the existing knowledge graph is compatible with cross-domain queries and with heterogenous schemas of the legacy systems. An effective criterion is urgently needed in order to evaluate such compatibility as it limits the quality upbound of the integration. This research studies the semantic similarity of the schemas from the aspect of properties. It provides a set of in-depth criteria, namely coverage and flexibility, to evaluate the pairwise compatibility between the schemas. It takes advantage of the properties of knowledge graphs to evaluate the overlaps between schemas and defines the weights of entity types in order to perform precise compatibility computation. The effectiveness of the criteria obtained to evaluate the compatibility between knowledge graphs and cross-domain queries is demonstrated using a case study.


2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Suzanna Schmeelk ◽  
Lixin Tao

Many organizations, to save costs, are movinheg to t Bring Your Own Mobile Device (BYOD) model and adopting applications built by third-parties at an unprecedented rate.  Our research examines software assurance methodologies specifically focusing on security analysis coverage of the program analysis for mobile malware detection, mitigation, and prevention.  This research focuses on secure software development of Android applications by developing knowledge graphs for threats reported by the Open Web Application Security Project (OWASP).  OWASP maintains lists of the top ten security threats to web and mobile applications.  We develop knowledge graphs based on the two most recent top ten threat years and show how the knowledge graph relationships can be discovered in mobile application source code.  We analyze 200+ healthcare applications from GitHub to gain an understanding of their software assurance of their developed software for one of the OWASP top ten moble threats, the threat of “Insecure Data Storage.”  We find that many of the applications are storing personally identifying information (PII) in potentially vulnerable places leaving users exposed to higher risks for the loss of their sensitive data.


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1407
Author(s):  
Peng Wang ◽  
Jing Zhou ◽  
Yuzhang Liu ◽  
Xingchen Zhou

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods only focus on triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circle convolution based on the embeddings of entity and entity types is utilized to map head entity and tail entity to type-specific representations, then translation-based score function is used to learn the presentation triples. We evaluated our model on real-world datasets with two benchmark tasks of link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.


2021 ◽  
Vol 13 (5) ◽  
pp. 124
Author(s):  
Jiseong Son ◽  
Chul-Su Lim ◽  
Hyoung-Seop Shim ◽  
Ji-Sun Kang

Despite the development of various technologies and systems using artificial intelligence (AI) to solve problems related to disasters, difficult challenges are still being encountered. Data are the foundation to solving diverse disaster problems using AI, big data analysis, and so on. Therefore, we must focus on these various data. Disaster data depend on the domain by disaster type and include heterogeneous data and lack interoperability. In particular, in the case of open data related to disasters, there are several issues, where the source and format of data are different because various data are collected by different organizations. Moreover, the vocabularies used for each domain are inconsistent. This study proposes a knowledge graph to resolve the heterogeneity among various disaster data and provide interoperability among domains. Among disaster domains, we describe the knowledge graph for flooding disasters using Korean open datasets and cross-domain knowledge graphs. Furthermore, the proposed knowledge graph is used to assist, solve, and manage disaster problems.


Sign in / Sign up

Export Citation Format

Share Document