Approximate Knowledge Graph Query Answering: From Ranking to Binary Classification

Author(s):  
Ruud van Bakel ◽  
Teodor Aleksiev ◽  
Daniel Daza ◽  
Dimitrios Alivanistos ◽  
Michael Cochez

Abstract: Large, heterogeneous datasets are characterized by missing or even erroneous information. This is more evident when they are the product of community effort or of automatic fact extraction from external sources, such as text. A special case of this phenomenon can be seen in knowledge graphs, where it mostly appears in the form of missing or incorrect edges and nodes. Structured querying on such incomplete graphs will result in incomplete sets of answers, even if the correct entities exist in the graph, since one or more edges needed to match the pattern are missing. To overcome this problem, several algorithms for approximate structured query answering have been proposed. Inspired by modern Information Retrieval metrics, these algorithms produce a ranking of all entities in the graph, and their performance is evaluated based on how high in this ranking the correct answers appear. In this work we take a critical look at this evaluation approach. We argue that a ranking-based evaluation is not sufficient to assess methods for complex query answering. To address this, we introduce Message Passing Query Boxes (MPQB), which brings binary classification metrics back into use, and we show the effect this has on the recently proposed query embedding method MPQE.
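To make the contrast between the two evaluation styles concrete, the following is a minimal sketch, not the authors' evaluation code. It assumes a model that assigns each entity a score for a query; the function names, the score dictionary, and the 0.5 decision threshold are all hypothetical choices made for illustration.

```python
from typing import Dict, Set, Tuple


def ranking_metrics(scores: Dict[str, float], answers: Set[str], k: int = 3) -> Tuple[float, float]:
    """Hits@k and mean reciprocal rank of the correct answers in the full entity ranking."""
    ranked = sorted(scores, key=scores.get, reverse=True)  # best-scored entity first
    ranks = [ranked.index(a) + 1 for a in answers]         # 1-based rank of each correct answer
    hits_at_k = sum(r <= k for r in ranks) / len(ranks)
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    return hits_at_k, mrr


def classification_metrics(scores: Dict[str, float], answers: Set[str],
                           threshold: float = 0.5) -> Tuple[float, float, float]:
    """Precision, recall and F1 of the answer SET obtained by thresholding the scores."""
    predicted = {e for e, s in scores.items() if s >= threshold}  # hypothetical decision rule
    tp = len(predicted & answers)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(answers) if answers else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy query with two correct answers among five entities (made-up scores).
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.1}
answers = {"a", "b"}
```

On this toy input the ranking view looks strong (Hits@3 of 1.0, MRR of 0.75), while the thresholded answer set also returns `c` and `d`, giving a precision of 0.5. A ranking metric alone does not reveal that half of the returned answers are wrong.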

2020 ◽  
Vol 2020 (12) ◽  
Author(s):  
A.I. Semenikhin ◽  
D.V. Semenikhin

The problem of arbitrary excitation of waves by a system of external sources near an anisotropic metasurface in the form of an elliptical cylinder with a homogenized surface impedance tensor of general form is solved. The solution is written as a superposition of E- and H-waves in elliptical coordinates. The partial reflection coefficients of the waves are found from the boundary conditions using the orthogonality of the angular Mathieu functions. For these coefficients, four coupled infinite systems of linear algebraic equations of the second kind are obtained. The conditions under which the eigenfunction method yields an explicit solution of the excitation problem are found and analyzed. It is shown that, for this, the surface impedance tensor of a uniform metasurface must belong to the class of deviators (have zero diagonal elements). In the particular case of a reciprocal (most easily realized) metasurface, its impedance tensor must be purely reactive. In another special case, the deviator impedance tensor describes a class of anisotropic nonreciprocal metasurfaces with so-called perfect electromagnetic conductivity (PEMC).


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Fanlin Shen ◽  
Siyi Cheng ◽  
Zhu Li ◽  
Keqiang Yue ◽  
Wenjun Li ◽  
...  

Obstructive sleep apnea-hypopnea syndrome (OSAHS) is extremely harmful to the human body and may cause neurological and endocrine dysfunction, resulting in damage to multiple organs and systems throughout the body and negatively affecting the cardiovascular, kidney, and mental systems. Clinically, doctors usually use standard polysomnography (PSG) to assist diagnosis. PSG determines whether a person has apnea syndrome from multidimensional data such as brain waves, heart rate, and blood oxygen saturation. In this paper, we present a method for recognizing OSAHS that makes it convenient for patients to monitor themselves in daily life and so avoid delayed treatment. First, we theoretically analyze the difference between the snoring sounds of normal people and of OSAHS patients in the time and frequency domains. Second, snoring sounds related to apnea events and snoring sounds unrelated to apnea are classified by deep learning, and the severity of OSAHS symptoms is then recognized. In the proposed algorithm, snoring features are extracted with three feature extraction methods: MFCC, LPCC, and LPMFCC. We adopt CNN and LSTM models for classification. The experimental results show that the combination of MFCC feature extraction and the LSTM model achieves the highest accuracy, 87%, in binary classification of snoring data. Moreover, the system can obtain the patient's AHI value, which determines the severity of OSAHS.
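The step from detected apnea events to a severity grade can be illustrated with a short sketch. This is an assumption-laden illustration rather than the paper's pipeline: it takes the AHI to be the number of apnea and hypopnea events per hour of sleep, and it uses the commonly cited adult cut-offs of 5, 15, and 30 events per hour, which the abstract itself does not state. The function names are hypothetical.

```python
def apnea_hypopnea_index(num_events: int, sleep_hours: float) -> float:
    """AHI = apnea/hypopnea events per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return num_events / sleep_hours


def osahs_severity(ahi: float) -> str:
    """Map an AHI value to a severity grade.

    Cut-offs (5 / 15 / 30 events per hour) are the commonly used adult
    thresholds and are an assumption here, not taken from the abstract.
    """
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"
```

For example, 84 classified apnea-related events over 7 hours of sleep give an AHI of 12.0, which this mapping grades as "mild".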


Author(s):  
Hernán Vargas ◽  
Carlos Buil-Aranda ◽  
Aidan Hogan ◽  
Claudia López

As the adoption of knowledge graphs grows, more and more non-expert users need to be able to explore and query such graphs. These users are typically not familiar with graph query languages such as SPARQL, and may not be familiar with the knowledge graph's structure. In this extended abstract, we provide a summary of our work on a language and visual interface -- called RDF Explorer -- that helps non-expert users navigate and query knowledge graphs. A usability study over Wikidata shows that users successfully complete more tasks with RDF Explorer than with the existing Wikidata Query Helper interface.


2021 ◽  
Author(s):  
Segyu Lee ◽  
Junil Bang ◽  
Sungeun Hong ◽  
Woojung Jang

Drug-target interaction (DTI) prediction is a methodology for predicting the binding affinity between a compound and a target protein, and a key technology for deriving candidate substances in drug discovery. As DTI experiments have been conducted for a long time, a substantial volume of chemical, biomedical, and pharmaceutical data has accumulated. This accumulation has coincided with the advent of big data, and data-based machine learning methods could significantly reduce the time and cost of drug development. In particular, deep learning has shown its potential in vision and speech recognition, and studies applying it to various other fields have emerged. Research applying deep learning is underway in drug development, and among the various deep learning models, graph-based models that can effectively learn molecular structures have received particular attention because they have achieved state-of-the-art (SOTA) experimental results. Our study focuses on molecular structure information in message passing neural networks, a family of graph-based models. In this paper, we propose a self-attention-based bond and atom message passing neural network that predicts DTI by extracting molecular features through a graph model with an attention mechanism. Model validation experiments were performed after framing binding affinity as both a classification and a regression problem: binary classification to predict the presence or absence of binding to the drug target, and regression to predict the binding affinity to the drug target. Classification was performed with BindingDB, and regression was performed with the DAVIS dataset. In the classification problem, ABCnet showed higher performance than MPNN, as in the existing study, and in regression, the potential of ABCnet was assessed against SOTA. Experiments indicated that in binary classification, ABCnet improves average performance by 1% over other MPNNs on the DTI task, and in regression, ABCnet shows a degradation in CI and performance of between 0.01 and 0.02 compared to SOTA.
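The regression result above is reported in terms of CI. Assuming this refers to the concordance index commonly used for binding-affinity regression (the abstract does not expand the abbreviation), it can be sketched as the fraction of comparable pairs whose predicted order matches the true order. The function name and the tie handling below are illustrative choices, not the paper's code.

```python
from itertools import combinations
from typing import Sequence


def concordance_index(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of pairs with different true affinities that are ordered correctly.

    Pairs with equal true values are skipped; tied predictions count as half.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # no defined order for this pair
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (p_i > p_j) == (t_i > t_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

With true affinities `[1, 2, 3, 4]` and predictions `[0.1, 0.4, 0.3, 0.9]`, five of the six pairs are ordered correctly, so the index is 5/6. A gap of 0.01 to 0.02 in this metric therefore corresponds to a small share of pairs being ordered differently.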


2019 ◽  
Author(s):  
Adam Struck ◽  
Brian Walsh ◽  
Alexander Buchanan ◽  
Jordan A. Lee ◽  
Ryan Spangler ◽  
...  

Abstract: The analysis of cancer biology data involves extremely heterogeneous datasets including information from RNA sequencing, genome-wide copy number, DNA methylation data reporting on epigenomic regulation, somatic mutations from whole-exome or whole-genome analyses, pathology estimates from imaging sections or subtyping, drug response or other treatment outcomes, and various other clinical and phenotypic measurements. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrative analysis. We introduce a graph database and query engine for discovery and analysis of cancer biology, called the BioMedical Evidence Graph (BMEG). The BMEG differs from other biological data graphs in that sample-level molecular information is connected to reference knowledge bases. It combines gene expression and mutation data with drug response experiments, pathway information databases, and literature-derived associations. The construction of the BMEG has resulted in a graph containing over 36M vertices and 29M edges. The BMEG system provides a graph-query-based API to enable analysis, with client code available for Python, JavaScript, and R, and a server online at bmeg.io. Using this system we have developed several forms of integrated analysis to demonstrate the utility of the system. The BMEG is an evolving resource dedicated to enabling integrative analysis. We have demonstrated queries on the system that illustrate mutation significance analysis, drug response machine learning, patient-level knowledge base queries, and pathway-level analysis. We have compared the resulting graph to other available integrated graph systems, and demonstrated that it is unique in the scale of the graph and the type of data it makes available.

Highlights: Data resource connecting an extremely diverse set of cancer datasets. Graph query engine that can be easily deployed and used on new datasets. Easily installed Python client. Server online at bmeg.io.

Summary: The analysis of cancer biology data involves extremely heterogeneous datasets including information. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrative analysis. We introduce a graph database and query engine for discovery and analysis of cancer biology, called the BioMedical Evidence Graph (BMEG). The construction of the BMEG has resulted in a graph containing over 36M vertices and 29M edges. The BMEG system provides a graph-query-based API to enable analysis, with client code available for Python, JavaScript, and R, and a server online at bmeg.io. Using this system we have developed several forms of integrated analysis to demonstrate the utility of the system.


2020 ◽  
Author(s):  
Liu Ning ◽  
Kexue Luo

Abstract. Background: Spoken responses can provide diagnostic markers, as language impairment may be an important early manifestation in dementia patients. In this study, an automatic assessment system is proposed to discriminate MCI and AD patients from their speech, with the aim of speeding up treatment and slowing down disease progression. Methods: We integrated a group of acoustic, demographic, and linguistic features and used machine learning algorithms to predict MCI and AD patients. Additionally, to obtain the best result, we ran comparison experiments covering three feature extraction methods (acoustic, text, and their combination) and four of the most popular algorithms, namely Logistic Regression, SVM, Random Forest, and LightGBM. Results: On Iflytek’s 2019 “Alzheimer’s disease prediction challenge competition” dataset, LightGBM performed notably better than the other algorithms, with a state-of-the-art AUC between 0.75 and 0.89 in binary classification and across 0.57 in ternary classification. The results also revealed that age had a significant impact on all the proposed cognitive factors. Conclusions: The results indicate that our method is increasingly useful for assessing suspected AD and MCI by using multiple, complementary acoustic and linguistic measures.
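Since the results above are reported as AUC values, a small sketch of how a binary-classification AUC can be computed may help in reading them. It uses the pairwise interpretation of AUC: the probability that a randomly chosen positive case is scored above a randomly chosen negative one. The function name and the 0/1 label encoding are illustrative assumptions, and this is not the study's evaluation code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For labels `[1, 1, 0, 0]` and scores `[0.9, 0.4, 0.5, 0.1]`, three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75. A value of 0.5 corresponds to random ordering.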


2021 ◽  
Vol 14 (6) ◽  
pp. 943-956
Author(s):  
Efthymia Tsamoura ◽  
David Carral ◽  
Enrico Malizia ◽  
Jacopo Urbani

The chase is a well-established family of algorithms used to materialize Knowledge Bases (KBs) for tasks like query answering under dependencies or data cleaning. A general problem of chase algorithms is that they might perform redundant computations. To counter this problem, we introduce the notion of Trigger Graphs (TGs), which guide the execution of the rules and so avoid redundant computations. We present the results of an extensive theoretical and empirical study that seeks to answer when and how TGs can be computed and what benefits TGs bring when applied over real-world KBs. Our results include algorithms that compute (minimal) TGs. We implemented our approach in a new engine, called GLog, and our experiments show that it can be significantly more efficient than the chase, enabling us to materialize Knowledge Graphs with 17B facts in less than 40 min using a single machine with commodity hardware.
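The redundancy problem mentioned above can be seen in a deliberately naive sketch of rule-based materialization. This is a plain fixpoint loop written for illustration only: it is neither a specific chase variant from the paper nor the Trigger Graph approach or the GLog engine. The fact encoding, the two rules, and all names are hypothetical.

```python
from typing import Callable, Iterable, List, Set, Tuple

Fact = Tuple[str, str, str]  # (predicate, subject, object)
Rule = Callable[[Set[Fact]], Set[Fact]]


def base_rule(facts: Set[Fact]) -> Set[Fact]:
    """path(x, y) :- edge(x, y)"""
    return {("path", s, o) for (p, s, o) in facts if p == "edge"}


def step_rule(facts: Set[Fact]) -> Set[Fact]:
    """path(x, z) :- path(x, y), edge(y, z)"""
    paths = [(s, o) for (p, s, o) in facts if p == "path"]
    edges = [(s, o) for (p, s, o) in facts if p == "edge"]
    return {("path", x, z) for (x, y) in paths for (y2, z) in edges if y == y2}


def naive_materialize(facts: Iterable[Fact], rules: List[Rule]) -> Tuple[Set[Fact], int]:
    """Apply every rule to all known facts until nothing new is derived.

    Returns the materialized facts and the number of derivations that only
    reproduced an already known fact (the redundant work).
    """
    known = set(facts)
    redundant = 0
    while True:
        new: Set[Fact] = set()
        for rule in rules:
            for fact in rule(known):
                if fact in known:
                    redundant += 1  # derived again in a later round
                else:
                    new.add(fact)
        if not new:
            return known, redundant
        known |= new
```

On a three-edge chain (`a` to `b`, `b` to `c`, `c` to `d`), this loop derives the six `path` facts but also re-derives already known facts in every round. Avoiding that kind of repeated work is what the abstract says Trigger Graphs are designed for.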


2020 ◽  
Author(s):  
Mikhail Galkin ◽  
Priyansh Trivedi ◽  
Gaurav Maheshwari ◽  
Ricardo Usbeck ◽  
Jens Lehmann
