Search Engine-Based Web Information Extraction

Semantic Web Engineering in the Knowledge Society ◽

10.4018/978-1-60566-112-4.ch009 ◽

2009 ◽

pp. 208-241

Author(s):

Gijs Geleijnse

Keyword(s):

Semantic Web ◽

Information Extraction ◽

Search Engine ◽

Community Based ◽

Web Information Extraction ◽

Structure Information ◽

Web Information ◽

Structured Information ◽

The Web ◽

Standard Semantic

In this chapter we discuss approaches to find, extract, and structure information from natural language texts on the Web. Such structured information can be expressed and shared using the standard Semantic Web languages and hence be machine interpreted. We focus on two tasks in Web information extraction: the first part covers mining facts from the Web, and the second presents an approach to collecting community-based metadata. A search engine is used to retrieve potentially relevant texts, from which instances and relations are extracted. The proposed approaches are illustrated with various case studies, showing that we can reliably extract information from the Web using simple techniques.

Design and Implementation of Web Extraction System of Ceramic Products’ Information in the Business Website

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.989-994.4322 ◽

2014 ◽

Vol 989-994 ◽

pp. 4322-4325

Author(s):

Mu Qing Zhan ◽

Rong Hua Lu

Keyword(s):

Information Extraction ◽

Search Engine ◽

Extraction System ◽

The Internet ◽

Extraction Technology ◽

Web Information Extraction ◽

Design And Implementation ◽

Web Extraction ◽

Web Information ◽

The Web

Among the means of obtaining information from the Internet, Web information extraction technology differs from search engines in that it can retrieve more precise and finer-grained information. Based on an analysis of the state of Web information extraction technology at home and abroad, this article presents a technical route for extracting ceramic product information from the Web, defines the extraction rules, develops an extraction system, and acquires the relevant ceramic product information.

Intelligent Web Information Extraction Model for Agricultural Product Quality and Safety System

10.54216/jisiot.040203 ◽

2021 ◽

pp. 99-110

Author(s):

Mohammad Ali Tofigh ◽

Zhendong Mu

Keyword(s):

Information Extraction ◽

Product Quality ◽

Hot Spot ◽

Safety System ◽

Agricultural Product ◽

Quality And Safety ◽

Web Information Extraction ◽

Web Information ◽

Product Quality And Safety ◽

The Web

With the development of society, people pay increasing attention to food safety, and relevant laws and policies are gradually being introduced and improved. The development of agricultural product quality and safety systems has become a research hot spot, and a central question is how to obtain the Web information of such a system effectively and quickly, which makes intelligent extraction of that information essential. This paper addresses the problem of efficiently extracting the Web information of the agricultural product quality and safety system. It studies the Web information extraction methods of various systems, analyzes in detail the template-based information extraction algorithms currently in use, and discusses a set of schemes that automatically extract the Web information of the agricultural product quality and safety system according to a template. The results show that the proposed scheme is a dynamically extensible information extraction system in which templates can be configured dynamically for different requirements without changing the code. Compared with the general approach, the Web information extraction speed for the agricultural product quality and safety system increases by 25%, the accuracy by 12%, and the recall rate by 30%.
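
The idea of configuring extraction templates without changing code can be illustrated with a minimal sketch. It is not the paper's system: the JSON template format, the field names, the sample page, and the `extract` helper are all assumptions made for illustration, and regular expressions stand in for whatever template language the authors use.

```python
import json
import re

# Hypothetical template: one regular expression per field, kept as data
# (for example in a JSON file) so it can change without touching the code.
TEMPLATE_JSON = """
{
  "product": "<td class='name'>(.*?)</td>",
  "grade": "<td class='grade'>(.*?)</td>"
}
"""

# Hypothetical page fragment, not taken from any real system.
PAGE = "<tr><td class='name'>Apple</td><td class='grade'>A</td></tr>"


def extract(page, template):
    """Apply each field pattern of the template to the page text."""
    result = {}
    for field, pattern in template.items():
        match = re.search(pattern, page, flags=re.DOTALL)
        # A field whose pattern does not match is recorded as None.
        result[field] = match.group(1).strip() if match else None
    return result


template = json.loads(TEMPLATE_JSON)
record = extract(PAGE, template)
print(record)
```

Supporting a new page layout in this sketch means editing only the template data; the `extract` function stays unchanged.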

Web Information Extraction via Web Views

Web Information Systems ◽

10.4018/978-1-59140-208-4.ch007 ◽

2004 ◽

pp. 227-267

Author(s):

Wee Keong Ng ◽

Zehua Liu ◽

Zhao Li ◽

Ee Peng Lim

Keyword(s):

Information Extraction ◽

Data Model ◽

Information Source ◽

Extraction Process ◽

Web Pages ◽

Efficient Manner ◽

Web Information Extraction ◽

Web Information ◽

Definition Of ◽

The Web

With the explosion of information on the Web, traditional ways of browsing and keyword searching of information over web pages no longer satisfy the demanding needs of web surfers. Web information extraction has emerged as an important research area that aims to automatically extract information from target web pages and convert it into a structured format for further processing. The main issues involved in the extraction process include: (1) the definition of a suitable extraction language; (2) the definition of a data model representing the web information source; (3) the generation of the data model, given a target source; and (4) the extraction and presentation of information according to a given data model. In this chapter, we discuss the challenges of these issues and the approaches that current research activities have taken to resolve them. We propose several classification schemes to classify existing approaches of information extraction from different perspectives. Among the existing works, we focus on the Wiccap system, a software system that enables ordinary end-users to obtain information of interest in a simple and efficient manner by constructing personalized web views of information sources.

The Web-OEM approach to Web information extraction

Journal of Network and Computer Applications ◽

10.1006/jnca.1999.0095 ◽

1999 ◽

Vol 22 (4) ◽

pp. 259-269 ◽

Cited By ~ 1

Author(s):

Luca Iocchi

Keyword(s):

Information Extraction ◽

Web Information Extraction ◽

Web Information ◽

The Web

The Web Information Extraction for Update Summarization Based on Shallow Parsing

2011 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing ◽

10.1109/3pgcic.2011.26 ◽

2011 ◽

Cited By ~ 1

Author(s):

Min Peng ◽

Xiaoxiao Ma ◽

Ye Tian ◽

Ming Yang ◽

Hua Long ◽

...

Keyword(s):

Information Extraction ◽

Web Information Extraction ◽

Web Information ◽

Shallow Parsing ◽

The Web

Research of the Web Information Extraction Technology on Tourism Theme

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.614.503 ◽

2014 ◽

Vol 614 ◽

pp. 503-506

Author(s):

Qi Shen ◽

Qing Ming Song ◽

Bo Chen

Keyword(s):

Information Extraction ◽

Extraction Method ◽

Structural Features ◽

Web Pages ◽

Cleaning Efficiency ◽

Extraction Technology ◽

Web Information Extraction ◽

Web Information ◽

Dynamic Web ◽

The Web

With the development of web technology, dynamic web pages and personalized page content have become increasingly popular. Page content now varies widely and the structures of different pages differ vastly, so traditional approaches to web information extraction have difficulty adapting. By analyzing the structural features of tourism-themed web pages, this paper proposes a web information extraction method based on an extended XPath policy. The algorithm avoids the defects of traditional web information extraction technology; it is simple and practical, offers high cleaning efficiency and accuracy, and reduces system overhead.
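
As a rough illustration of XPath-style extraction from structurally regular pages, the sketch below uses the limited XPath subset in Python's standard `xml.etree.ElementTree`. It does not reproduce the paper's extended XPath policy: the sample page, the `spot` and `price` class names, the rule dictionary, and the `extract_records` helper are all hypothetical, and the page is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment for a tourism listing.
PAGE = """
<html><body>
  <div class="nav">Home | Tours</div>
  <div class="spot">
    <h2>West Lake</h2>
    <span class="price">80</span>
  </div>
  <div class="spot">
    <h2>Mount Huang</h2>
    <span class="price">190</span>
  </div>
</body></html>
"""

# Hypothetical rules: field name -> path relative to each record node.
RULES = {"name": "h2", "price": "span[@class='price']"}


def extract_records(page, record_path, rules):
    """Locate record nodes by path, then read each field inside them."""
    root = ET.fromstring(page)
    records = []
    for node in root.findall(record_path):
        record = {}
        for field, path in rules.items():
            hit = node.find(path)
            # Missing or empty elements yield None rather than an error.
            record[field] = hit.text.strip() if hit is not None and hit.text else None
        records.append(record)
    return records


records = extract_records(PAGE, ".//div[@class='spot']", RULES)
for record in records:
    print(record)
```

The navigation block is skipped because only nodes matching the record path are visited, which is one simple way that path-based rules can filter out page noise.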

Discovery of Language Resources on the Web: Information Extraction from Heterogeneous Documents

Literary and Linguistic Computing ◽

10.1093/llc/fqm010 ◽

2007 ◽

Vol 22 (3) ◽

pp. 329-343

Author(s):

V. Pekar ◽

R. Evans

Keyword(s):

Information Extraction ◽

Language Resources ◽

Web Information Extraction ◽

Web Information ◽

The Web

Web Information Extraction via Web Views

End-User Computing ◽

10.4018/978-1-59904-945-8.ch019 ◽

2008 ◽

pp. 211-238

Author(s):

Wee Keong Ng ◽

Zehua Liu ◽

Zhao Li ◽

Ee Peng Lim

Keyword(s):

Information Extraction ◽

Data Model ◽

Information Source ◽

Extraction Process ◽

Web Pages ◽

Efficient Manner ◽

Web Information Extraction ◽

Web Information ◽

Definition Of ◽

The Web

With the explosion of information on the Web, traditional ways of browsing and keyword searching of information over web pages no longer satisfy the demanding needs of web surfers. Web information extraction has emerged as an important research area that aims to automatically extract information from target web pages and convert it into a structured format for further processing. The main issues involved in the extraction process include: (1) the definition of a suitable extraction language; (2) the definition of a data model representing the web information source; (3) the generation of the data model, given a target source; and (4) the extraction and presentation of information according to a given data model. In this chapter, we discuss the challenges of these issues and the approaches that current research activities have taken to resolve them. We propose several classification schemes to classify existing approaches of information extraction from different perspectives. Among the existing works, we focus on the Wiccap system, a software system that enables ordinary end-users to obtain information of interest in a simple and efficient manner by constructing personalized web views of information sources.

To Improve the Web Personalization using the Boosted Random Forest for Web Information Extraction

Recent Patents on Computer Science ◽

10.2174/2213275912666190307164623 ◽

2019 ◽

Vol 12 ◽

Author(s):

P. Srinivasa Rao ◽

D. Vasumathi

Keyword(s):

Random Forest ◽

Information Extraction ◽

Web Personalization ◽

Web Information Extraction ◽

Web Information ◽

The Web
