Probabilistic query rewriting for efficient and effective keyword search on graph data

2013 ◽  
Vol 6 (14) ◽  
pp. 1642-1653 ◽  
Author(s):  
Lei Zhang ◽  
Thanh Tran ◽  
Achim Rettinger
2018 ◽  
Vol 14 (3) ◽  
pp. 299-316 ◽  
Author(s):  
Chang-Sup Park

Purpose: This paper proposes a new keyword search method on graph data that improves the relevance of search results and reduces the duplication of content nodes in the answer trees produced by previous approaches based on distinct root semantics. Those approaches are restricted to finding answer trees with different root nodes, and thus often return results containing answer trees with low relevance to the query or with duplicate content nodes. The proposed method allows limited redundancy in the root nodes of the top-k answer trees to produce more effective query results.

Design/methodology/approach: A measure of redundancy in a set of answer trees with respect to their root nodes is defined and, based on this measure, a set of answer trees with limited root redundancy is proposed as the result of a keyword query on graph data. For efficient query processing, an index on the useful paths in the graph, built from inverted lists and a hash map, is suggested. Based on this path index, a top-k query processing algorithm is presented that finds the most relevant and diverse answer trees given a maximum amount of root redundancy allowed for a set of answer trees.

Findings: Experiments on real graph datasets show that the proposed approach produces effective query answers that are more diverse in their content nodes and more relevant to the query than those of the previous approach based on distinct root semantics.

Originality/value: This paper is the first to take redundancy in the root nodes of answer trees into account in order to improve the relevance and reduce the content-node redundancy of query results compared with the previous distinct root semantics. It can satisfy users' various information needs on large and complex graph data using keyword-based queries.
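The idea of returning top-k answer trees under a cap on root redundancy can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not give the redundancy measure or the path-index structure, so the measure used here (number of selected trees minus number of distinct roots), the `AnswerTree` type, and the greedy selection are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnswerTree:
    """Hypothetical answer tree: a root node, its content nodes, and a relevance score."""
    root: str
    content_nodes: frozenset
    score: float  # higher means more relevant to the query


def root_redundancy(trees):
    """Assumed measure: how many trees reuse a root already used by another tree."""
    return len(trees) - len({t.root for t in trees})


def top_k_limited_redundancy(candidates, k, max_redundancy):
    """Greedily pick up to k trees by score while keeping root redundancy within the cap."""
    selected = []
    for tree in sorted(candidates, key=lambda t: t.score, reverse=True):
        if len(selected) == k:
            break
        # Accept the tree only if the set stays within the allowed redundancy.
        if root_redundancy(selected + [tree]) <= max_redundancy:
            selected.append(tree)
    return selected
```

With `max_redundancy=0` this sketch reduces to distinct root semantics, where every returned tree has a different root. Raising the cap lets a few highly relevant trees share a root instead of being displaced by weaker trees rooted elsewhere.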


2013 ◽  
Vol 25 (12) ◽  
pp. 2767-2779 ◽  
Author(s):  
Ye Yuan ◽  
Guoren Wang ◽  
Lei Chen ◽  
Haixun Wang

1996 ◽  
Vol 35 (04/05) ◽  
pp. 309-316 ◽  
Author(s):  
M. R. Lehto ◽  
G. S. Sorock

Abstract: Bayesian inferencing as a machine learning technique was evaluated for identifying pre-crash activity and crash type from accident narratives describing 3,686 motor vehicle crashes. It was hypothesized that a Bayesian model could learn from a computer search for 63 keywords related to accident categories. Learning was described in terms of the ability to accurately classify previously unclassifiable narratives not containing the original keywords. When narratives contained keywords, the results obtained using both the Bayesian model and keyword search corresponded closely to expert ratings (P(detection)≥0.9, and P(false positive)≤0.05). For narratives not containing keywords, when the threshold used by the Bayesian model was varied between p>0.5 and p>0.9, the overall probability of detecting a category assigned by the expert varied between 67% and 12%. False positives correspondingly varied between 32% and 3%. These latter results demonstrated that the Bayesian system learned from the results of the keyword searches.
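The thresholded classification described above can be sketched as follows. The abstract does not specify the model, so this is an assumed simplification: a naive Bayes classifier over word presence with add-one smoothing, which assigns a category only when its posterior exceeds a threshold. The function names and the training format are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(labeled_narratives):
    """Count, per category, how many narratives contain each word.

    labeled_narratives: iterable of (text, category) pairs, e.g. narratives
    labeled by a prior keyword search (an assumption for this sketch).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, category in labeled_narratives:
        class_counts[category] += 1
        for word in set(text.lower().split()):
            word_counts[category][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def posteriors(model, text):
    """Return P(category | words present in text), using known words only."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    words = set(text.lower().split()) & vocab
    log_scores = {}
    for category, n in class_counts.items():
        log_p = math.log(n / total)  # class prior
        for word in words:
            # Add-one smoothed estimate of P(word present | category).
            log_p += math.log((word_counts[category][word] + 1) / (n + 2))
        log_scores[category] = log_p
    top = max(log_scores.values())
    unnormalized = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnormalized.values())
    return {c: v / z for c, v in unnormalized.items()}


def classify(model, text, threshold=0.5):
    """Return the most probable category, or None if it does not exceed the threshold."""
    post = posteriors(model, text)
    best = max(post, key=post.get)
    return best if post[best] > threshold else None
```

Raising `threshold` makes the classifier abstain more often, which is the trade-off the abstract reports: fewer false positives at the cost of fewer detections.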


2020 ◽  
Vol 16 (1) ◽  
Author(s):  
Mona Lundin

This study explores the use of a new protocol in hypertension care, in which continuous patient-generated data reported through digital technology are presented in graphical form and discussed in follow-up consultations with nurses. This protocol is part of an infrastructure design project in which patients and medical professionals are co-designers. The approach used for the study was interaction analysis, which rendered possible detailed in situ examination of local variations in how nurses relate to the protocol. The findings show three distinct engagements: (1) teasing out an average blood pressure, (2) working around the protocol and graph data and (3) delivering an analysis. It was discovered that the graphical representations structured the consultations to a great extent, and that nurses mostly referred to graphs that showed blood pressure values, which is a measurement central to the medical discourse of hypertension. However, it was also found that analysis of the data alone was not sufficient to engage patients: nurses' invisible and inclusion work through eliciting patients' narratives played an important role here. A conclusion of the study is that nurses and patients both need to be more thoroughly introduced to using protocols based on graphs for more productive consultations to be established. 


2019 ◽  
Vol 118 (1) ◽  
pp. 36-41
Author(s):  
Jung-Woo Lee ◽  
Seung-Cheon Kim ◽  
Sung-Hoon Kim ◽  
Jin-Ho Lim

Background/Objectives: To improve the efficiency of the online advertising market, this study proposes a new performance index called "Leakage Ratio" that can increase the efficiency of advertising. Methods/Statistical analysis: Naver, an Internet portal site in Korea, is the most influential medium for online keyword search advertising. In this study, Leakage Ratio management is applied to online keyword search ads on Naver for five medium and large online shopping malls. Based on the performance trend of each search keyword, we tried to improve the efficiency of the overall advertising by changing the bids on low-efficiency keywords.

