XKFilter

Weidong Yang; Fei Fang; Nan Li; Zhongyu (Joan) Lu

doi:10.4018/ijirr.2011010101

XKFilter

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2011010101 ◽

2011 ◽

Vol 1 (1) ◽

pp. 1-18 ◽

Cited By ~ 1

Author(s):

Weidong Yang ◽

Fei Fang ◽

Nan Li ◽

Zhongyu (Joan) Lu

Keyword(s):

Common Ancestor ◽

Keyword Search ◽

Stream Processing ◽

Query Languages ◽

Information Discovery ◽

Text Documents ◽

Filter System ◽

Lowest Common Ancestor ◽

Xml Stream ◽

User Friendly

Most existing XML stream processing systems adopt fully structured query languages such as XPath or XQuery, which are difficult for ordinary users to learn and use. Keyword search is a user-friendly information discovery technique that has been extensively studied for text documents. This paper presents an XML stream filter system called XKFilter, which is the first system to support keyword search over XML streams. In XKFilter, the concepts of XLCA (eXclusive Lowest Common Ancestor) and XLCA Connecting Tree (XLCACT) are used to define the search semantics and the results of keyword queries, and an approach to filtering XML streams by keyword is presented. A prototype of XKFilter is implemented and used in the experiments.

XKFilter

Information Retrieval Methods for Multidisciplinary Applications ◽

10.4018/978-1-4666-3898-3.ch001 ◽

2013 ◽

pp. 1-18

Author(s):

Weidong Yang ◽

Fei Fang ◽

Nan Li ◽

Zhongyu (Joan) Lu

Keyword(s):

Common Ancestor ◽

Keyword Search ◽

Stream Processing ◽

Query Languages ◽

Information Discovery ◽

Text Documents ◽

Filter System ◽

Lowest Common Ancestor ◽

Xml Stream ◽

User Friendly

Most existing XML stream processing systems adopt fully structured query languages such as XPath or XQuery, which are difficult for ordinary users to learn and use. Keyword search is a user-friendly information discovery technique that has been extensively studied for text documents. This paper presents an XML stream filter system called XKFilter, which is the first system to support keyword search over XML streams. In XKFilter, the concepts of XLCA (eXclusive Lowest Common Ancestor) and XLCA Connecting Tree (XLCACT) are used to define the search semantics and the results of keyword queries, and an approach to filtering XML streams by keyword is presented. A prototype of XKFilter is implemented and used in the experiments.

Keyword Search in XML Streams

Advances in Data Mining and Database Management - Design, Performance, and Analysis of Innovative Information Retrieval ◽

10.4018/978-1-4666-1975-3.ch006 ◽

2013 ◽

pp. 73-89

Author(s):

Weidong Yang ◽

Hao Zhu

Keyword(s):

System Architecture ◽

Keyword Search ◽

Query Languages ◽

Filter System ◽

Section 8 ◽

Lowest Common Ancestor ◽

Xml Stream ◽

Keyword Searching ◽

Processing Techniques ◽

Xml Streams

Most existing XML stream processing techniques adopt fully structured query languages such as XPath or XQuery, which are difficult for ordinary users to learn and use. This chapter presents an XML stream filter system called XKFilter, which uses keywords to filter XML streams. In XKFilter, we use the concepts of XLCA (eXclusive Lowest Common Ancestor) and XLCA Connecting Tree (XLCACT) to define the search semantics and the results of keyword queries, and present an approach to filtering XML streams by keyword. Section 1 introduces the background of keyword search in XML streams. Section 2 explains the search results. Section 3 presents in depth a stack-based keyword search algorithm for filtering XML streams without schemas. Section 4 presents keyword search over XML streams using schema information. Section 5 describes the system architecture of XKFilter. Section 6 reports experiments that show its performance. Section 7 discusses related work. Section 8 summarizes the chapter.
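
The general idea of stack-based keyword filtering over a stream can be illustrated with a small sketch. It is not the XKFilter algorithm and it does not implement XLCA semantics, which the abstract does not define. It assumes a simpler SLCA-style rule instead: report each element whose subtree contains every keyword while no descendant already does. The function name `smallest_covering_elements` is hypothetical, keywords are matched as whitespace-separated tokens, and only element text (not tail text in mixed content) is inspected.

```python
import io
import xml.etree.ElementTree as ET


def smallest_covering_elements(xml_text, keywords):
    """Return tags of the lowest elements whose subtree contains all keywords.

    Illustrative SLCA-style rule (an assumption, not XLCA): an element is
    reported if its subtree covers every keyword and no descendant does.
    """
    wanted = {k.lower() for k in keywords}
    stack = []    # one [seen_keywords, descendant_matched] frame per open element
    results = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append([set(), False])
            continue
        # "end" event: the element is complete, so its frame can be closed.
        seen, descendant_matched = stack.pop()
        words = (elem.text or "").lower().split()
        seen |= wanted.intersection(words)
        matched = seen == wanted
        if matched and not descendant_matched:
            results.append(elem.tag)
        if stack:
            parent = stack[-1]
            parent[0] |= seen                       # keywords bubble up
            parent[1] = parent[1] or matched        # remember lower matches
        elem.clear()  # release content that is no longer needed
    return results


doc = (
    "<lib>"
    "<book><title>xml stream</title><author>yang</author></book>"
    "<book><title>xml</title></book>"
    "</lib>"
)
print(smallest_covering_elements(doc, ["xml", "yang"]))
```

The stack holds one frame per open element, so memory grows with document depth rather than document size, which is the usual reason for preferring a stack in stream processing.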

Foundation of Keyword Search in XML

Advances in Data Mining and Database Management - Design, Performance, and Analysis of Innovative Information Retrieval ◽

10.4018/978-1-4666-1975-3.ch001 ◽

2013 ◽

pp. 1-16

Author(s):

Weidong Yang ◽

Hao Zhu

Keyword(s):

Common Ancestor ◽

Keyword Search ◽

Query Languages ◽

Search Algorithms ◽

The Other ◽

Inverted Index ◽

Xml Database ◽

Xml Keyword Search ◽

Lowest Common Ancestor ◽

Structured Information

It has become desirable to let users query structured information in an XML database by keyword (data-centric retrieval), combining database and information retrieval techniques. The key challenges of keyword search in an XML database are therefore how to define result models that match users' search intents, how to compute the results with efficient algorithms, and how to rank the results. In this chapter, the authors first present the foundational knowledge of XML keyword search, such as XML data models, XML query languages, inverted indexes, and Dewey encoding. They then present typical existing research on keyword search in XML, including result models such as Smallest Lowest Common Ancestor (SLCA), Exclusive Lowest Common Ancestor (ELCA), and Meaningful Lowest Common Ancestor (MLCA), the related search algorithms, and the ranking approaches.

Research and Implementation of XML Keyword Search Algorithm Based on Semantic Relatives

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.267.811 ◽

2011 ◽

Vol 267 ◽

pp. 811-815

Author(s):

Ming Yan Shen ◽

Xin Li ◽

Xiang Fu Meng

Keyword(s):

Common Ancestor ◽

Keyword Search ◽

Search Algorithm ◽

Search Method ◽

Document Structure ◽

Xml Documents ◽

Xml Keyword Search ◽

Lowest Common Ancestor ◽

Xml Document

XML keyword search is widely used in applications of XML documents. Most XML keyword search approaches are based on the LCA (lowest common ancestor) or its variants, which usually leads to less-than-ideal recall and precision. This paper presents a novel XML keyword search method based on semantic relatives. The method fully considers the semantic characteristics of the XML document structure. A stack-based algorithm is also presented that merges the semantic relative nodes containing the keywords to form the results of the XML keyword search. The experimental results demonstrate the effectiveness and efficiency of the method.

Ontology-Driven Keyword Search for Heterogeneous XML Data Sources

Advances in Data Mining and Database Management - Design, Performance, and Analysis of Innovative Information Retrieval ◽

10.4018/978-1-4666-1975-3.ch003 ◽

2013 ◽

pp. 31-47

Author(s):

Weidong Yang ◽

Hao Zhu

Keyword(s):

System Architecture ◽

Keyword Search ◽

Query Language ◽

Data Sources ◽

Information Discovery ◽

Xml Data ◽

Section 8 ◽

Data Source ◽

Index Building ◽

User Friendly

Massive heterogeneous XML data sources are emerging on the Internet. These data sources are generally autonomous and provide search interfaces based on XML query languages such as XPath or XQuery. Accordingly, users need to learn complex syntax and know the schemas. Keyword search is a user-friendly information discovery technique that helps users obtain useful information conveniently without knowing the schemas, and it is very helpful for searching heterogeneous XML data. In this chapter, the authors present a system called SKeyword, which provides a common keyword search interface for heterogeneous XML data sources and employs an OWL ontology to represent the global model of the various data sources. Section 1 introduces the context of keyword search for heterogeneous XML data sources. Section 2 gives the preliminary knowledge and defines the semantics of keyword search results in the ontology. Section 3 describes the system architecture. Section 4 presents the approaches to ontology integration and index building used by SKeyword. Section 5 presents the algorithm for generating search results and discusses how to rewrite a keyword search over the global conceptual model into XQuery statements for the local XML sources. Section 6 discusses how to organize and rank the results. Section 7 shows the experiments. Section 8 covers the related work. Section 9 concludes the chapter.

XML Keyword Search Algorithm Based on Level-Traverse Encoding

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.263-266.1553 ◽

2012 ◽

Vol 263-266 ◽

pp. 1553-1558

Author(s):

Quan Zhu Yao ◽

Bing Tian ◽

Wang Yun He

Keyword(s):

Common Ancestor ◽

Keyword Search ◽

Search Algorithm ◽

Bottom Up ◽

Xml Documents ◽

Xml Keyword Search ◽

Lowest Common Ancestor ◽

Definition Of ◽

Labeling Method

For XML documents, existing keyword retrieval methods encode each node with Dewey encoding, so LCA computation requires comparing Dewey encodings part by part. When the XML document is deep, the large number of LCA computations degrades the performance of keyword search. In this paper we propose a novel labeling method called Level-TRaverse (LTR) encoding and, combining it with a definition of the result set based on the Exclusive Lowest Common Ancestor (ELCA), design a query algorithm called the Bottom-Up Level Algorithm (BULA). The experiments demonstrate that this method improves the efficiency and the accuracy of XML keyword retrieval.
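
The part-by-part comparison of Dewey encodings mentioned in this abstract can be sketched as follows. This shows only the standard Dewey-label LCA (the longest common prefix of two labels), not the LTR encoding or BULA proposed in the paper. The function name `dewey_lca` and the dotted-string label format are assumptions made for illustration, and both labels are assumed to come from the same document.

```python
def dewey_lca(label_a, label_b):
    """Return the Dewey label of the lowest common ancestor of two nodes.

    A Dewey label such as "1.2.5" lists the child positions on the path
    from the root, so the LCA is the longest common prefix of components.
    The cost is proportional to the depth of the shallower node.
    """
    common = []
    for part_a, part_b in zip(label_a.split("."), label_b.split(".")):
        if part_a != part_b:
            break
        common.append(part_a)
    return ".".join(common)


print(dewey_lca("1.2.3", "1.2.5.1"))  # 1.2
print(dewey_lca("1.2", "1.2.4"))      # 1.2 (an ancestor is its own LCA)
```

Because every comparison walks the labels from the root, deep documents make each LCA computation more expensive, which is the cost the abstract refers to.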

Review on Keyword Search and Ranking Techniques for Semi-Structured Data

Knowledge-Intensive Economies and Opportunities for Social, Organizational, and Technological Growth - Advances in Knowledge Acquisition, Transfer, and Management ◽

10.4018/978-1-5225-7347-0.ch013 ◽

2019 ◽

pp. 248-270

Author(s):

Dayananda P. ◽

Sowmyarani C. N.

Keyword(s):

Data Storage ◽

Common Ancestor ◽

Keyword Search ◽

Structured Data ◽

Important Task ◽

Xml Data ◽

Lowest Common Ancestor ◽

Storage Hierarchy

The volume of semi-structured data is increasing continuously, and handling it efficiently is a challenging task. Keyword search is an important task because the required information can be retrieved without knowledge of the data storage hierarchy. There are several challenges in handling XML data. This chapter discusses these challenges in terms of lowest common ancestor (LCA) semantics, efficient query processing, and retrieval of the top-k results for the data a user needs. Existing approaches are grouped into classes according to how they frame and solve the problem. Keyword search and ranking techniques for retrieving the desired information are analyzed in detail.

XML Stream Processing

Encyclopedia of Database Systems ◽

10.1007/978-1-4614-8265-9_473 ◽

2018 ◽

pp. 4803-4807

Author(s):

Christoph Koch

Keyword(s):

Stream Processing ◽

Xml Stream

On the Distribution of Depths in Increasing Trees

The Electronic Journal of Combinatorics ◽

10.37236/409 ◽

2010 ◽

Vol 17 (1) ◽

Author(s):

Markus Kuba ◽

Stephan Wagner

Keyword(s):

Common Ancestor ◽

Short Note ◽

Bijective Proof ◽

Lowest Common Ancestor ◽

Recursive Trees

By a theorem of Dobrow and Smythe, the depth of the $k$th node in very simple families of increasing trees (which includes, among others, binary increasing trees, recursive trees and plane ordered recursive trees) follows the same distribution as the number of edges of the form $j-(j+1)$ with $j < k$. In this short note, we present a simple bijective proof of this fact, which also shows that the result actually holds within a wider class of increasing trees. We also discuss some related results that follow from the bijection as well as a possible generalization. Finally, we use another similar bijection to determine the distribution of the depth of the lowest common ancestor of two nodes.

Foundation of XML Stream Processing Techniques

Advances in Data Mining and Database Management - Design, Performance, and Analysis of Innovative Information Retrieval ◽

10.4018/978-1-4666-1975-3.ch004 ◽

2013 ◽

pp. 48-59

Author(s):

Weidong Yang ◽

Hao Zhu

Keyword(s):

Optimization Technique ◽

Stream Processing ◽

Research Community ◽

Processing Methods ◽

Xml Data ◽

Research Status ◽

Basic Concepts ◽

Xml Stream ◽

Processing Techniques ◽

Area Section

The problem of processing streaming XML data is gaining widespread attention from the research community, and various XML stream processing methods have been put forward, including automaton-based methods, index-based methods, and so forth. In this chapter, the basic concepts and several typical existing approaches to XML stream processing are discussed. Section 1 introduces the background and current research status of this area. Section 2 focuses on automaton-based methods, for example, X/YFilter and XPush. In section 3, the index-based methods are given. In section 4, other methods such as Fist and XTrie are discussed briefly. Section 4 discusses some optimization techniques for XML stream processing. Section 5 summarizes this chapter.