A Highly Adaptable Web Information Extractor Using Graph Data Model

Author(s):  
Qi Guo ◽  
Lizhu Zhou ◽  
Zhiqiang Zhang ◽  
Jianhua Feng
2004 ◽  
pp. 227-267
Author(s):  
Wee Keong Ng ◽  
Zehua Liu ◽  
Zhao Li ◽  
Ee Peng Lim

With the explosion of information on the Web, traditional ways of browsing and keyword searching over web pages no longer satisfy the demanding needs of web surfers. Web information extraction has emerged as an important research area that aims to automatically extract information from target web pages and convert it into a structured format for further processing. The main issues involved in the extraction process include: (1) the definition of a suitable extraction language; (2) the definition of a data model representing the web information source; (3) the generation of the data model, given a target source; and (4) the extraction and presentation of information according to a given data model. In this chapter, we discuss the challenges these issues pose and the approaches that current research has taken to resolve them. We propose several classification schemes to classify existing approaches to information extraction from different perspectives. Among the existing works, we focus on the Wiccap system, a software system that enables ordinary end-users to obtain information of interest in a simple and efficient manner by constructing personalized web views of information sources.


1990 ◽  
pp. 7-20
Author(s):  
Hideko S. Kunii

2018 ◽  
Vol 7 (3.34) ◽  
pp. 562 ◽  
Author(s):  
Zhanfang Zhao ◽  
Sung Kook Han ◽  
Ju Ri Kim

Background/Objectives: Representing reification effectively remains a challenge, since the reification representation in the RDF standard has revealed some drawbacks. Methods/Statistical analysis: Currently, there are two main graph data models: RDF and LPG. LPG is a popular graph data model that is usually applied to NoSQL graph databases. This paper derives three types of reification structure in terms of the structural and semantic relationships of the reification statements. The detailed representation of each type of reification is presented with the extended LPG model. Findings: This paper proposes a novel approach to representing the reification structure of RDF from the perspective of LPG. It explores the formal and conceptual properties of conventional LPG models and proposes extensions that capture more complex knowledge structures efficiently. These augmentations of LPG enable more efficient and flexible resource modeling. The three types of reification structure derived are assertion, quantification, and entailment. The proposed approach not only preserves the structure and semantics of the reification but also makes LPG modeling of complex structural statements easy and intuitive. This can help in converting RDF graphs into LPGs. Improvements/Applications: The implementation of the extended LPG and the query processing of the reification remain future work.
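For readers unfamiliar with the two models, the following is a minimal sketch of the general idea of carrying an RDF reification over to a labeled property graph (LPG): the four standard reification triples (`rdf:type rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`) collapse into one edge, and any further statements about the reified statement become edge properties. It is not the extended LPG model of the paper, whose implementation the abstract leaves to future work. The `Edge` class, the `reified_to_edge` helper, and the `ex:` names are hypothetical and serve illustration only.

```python
from dataclasses import dataclass, field

RDF = "rdf:"  # prefix of the standard RDF reification vocabulary


@dataclass
class Edge:
    """A property-graph edge: source, label, target, plus key/value properties."""
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


def reified_to_edge(triples, stmt):
    """Collapse the reification of `stmt` into a single LPG edge (hypothetical helper).

    Triples about `stmt` other than the four reification triples become
    edge properties. Assumes at most one value per predicate.
    """
    about = {p: o for s, p, o in triples if s == stmt}
    if about.get(RDF + "type") != RDF + "Statement":
        raise ValueError(f"{stmt} is not an rdf:Statement")
    core = {RDF + "type", RDF + "subject", RDF + "predicate", RDF + "object"}
    props = {p: o for p, o in about.items() if p not in core}
    return Edge(about[RDF + "subject"], about[RDF + "predicate"],
                about[RDF + "object"], props)


# "Alice knows Bob", reified as _:s1 and annotated with ex:since.
triples = [
    ("_:s1", "rdf:type", "rdf:Statement"),
    ("_:s1", "rdf:subject", "ex:Alice"),
    ("_:s1", "rdf:predicate", "ex:knows"),
    ("_:s1", "rdf:object", "ex:Bob"),
    ("_:s1", "ex:since", "2015"),
]
edge = reified_to_edge(triples, "_:s1")
```

In this simplified mapping the statement node `_:s1` disappears and its annotation lives on the edge, which is the usual reason LPG is considered a convenient target for reified RDF. Statements about statements about statements would need more than a flat property map, which is the kind of case the proposed LPG extensions are aimed at.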


2008 ◽  
pp. 2338-2363
Author(s):  
Susanta Mitra ◽  
Aditya Bagchi ◽  
A. K. Bandyopadhyay

A social network defines the structure of a social community, such as an organization or institution, covering its members and their inter-relationships. Social relationships among the members of a community can be of different types, such as friendship, kinship, professional, academic, and so forth. Traditionally, a social network is represented by a directed graph. Sociologists analyze the graph structure representing a social network to study a community. Hardly any effort has been made to design a data model to store and retrieve social-network-related data. In this paper, an object-relational graph data model is proposed for modeling a social network. The objective is to illustrate the power of this generic model to represent the common structural and node-based properties of different social network applications. A novel, multi-paradigm architecture is proposed to manage the system efficiently. New structural operators are defined, and their application is illustrated through query examples. The completeness and minimality of the operators are also shown.
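The directed-graph representation mentioned in the abstract can be sketched as follows. This is only an in-memory illustration of members connected by typed, directed relationships, with two simple structural queries. It does not reproduce the paper's object-relational model or its structural operators, and the `SocialGraph` class, its method names, and the sample members are all hypothetical.

```python
from collections import defaultdict


class SocialGraph:
    """Directed graph whose edges carry a relationship type (hypothetical sketch)."""

    def __init__(self):
        self.members = set()
        self._out = defaultdict(list)  # member -> [(relation, other member)]

    def relate(self, source, relation, target):
        """Add a directed edge source --relation--> target."""
        self.members.update((source, target))
        self._out[source].append((relation, target))

    def related(self, source, relation):
        """Members reachable from `source` over one edge of the given type."""
        return [t for r, t in self._out.get(source, []) if r == relation]

    def in_degree(self, member):
        """Number of incoming edges of any type, a simple node-based property."""
        return sum(1 for edges in self._out.values()
                   for _, t in edges if t == member)


g = SocialGraph()
g.relate("ann", "friendship", "bob")
g.relate("ann", "kinship", "cho")
g.relate("cho", "professional", "bob")

friends_of_ann = g.related("ann", "friendship")
bob_in_degree = g.in_degree("bob")
```

Keeping the relationship type on each edge lets one graph hold friendship, kinship, and professional ties together, so a query can either filter by type or ignore it.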

