Research on Web Intelligent Information Extraction Method

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.539.464 ◽

2014 ◽

Vol 539 ◽

pp. 464-468

Author(s):

Zhi Min Wang

Keyword(s):

Information Extraction ◽

Extraction Method ◽

Experimental Results ◽

Accurate Information ◽

Web Page ◽

Page Segmentation ◽

Segmentation Technique ◽

Ontology Extraction ◽

Intelligent Information

The paper introduces segmentation into the preprocessing of web pages. Page segmentation is used to locate the region that contains the target information; that region is then processed according to ontology extraction rules to obtain the required information. In experiments on two real datasets, and in comparison with related work, the method achieves good extraction results.


Web Page Segmentation Towards Information Extraction for Web Semantics

International Conference on Innovative Computing and Communications - Lecture Notes in Networks and Systems ◽

10.1007/978-981-13-2354-6_45 ◽

2018 ◽

pp. 431-442

Author(s):

Pooja Malhotra ◽

Sanjay Kumar Malik

Keyword(s):

Information Extraction ◽

Web Page ◽

Page Segmentation


A novel web page text information extraction method

2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC) ◽

10.1109/itnec.2019.8729329 ◽

2019 ◽

Cited By ~ 1

Author(s):

Chongjun Wang ◽

Peng Wei

Keyword(s):

Information Extraction ◽

Extraction Method ◽

Web Page ◽

Text Information


An Approach of Web Page Information Extraction

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.347-350.2479 ◽

2013 ◽

Vol 347-350 ◽

pp. 2479-2482

Author(s):

Yao Hui Li ◽

Li Xia Wang ◽

Jian Xiong Wang ◽

Jie Yue ◽

Ming Zhan Zhao

Keyword(s):

Information Extraction ◽

Search Engine ◽

Information Source ◽

Web Pages ◽

Web Page ◽

Extraction Technology ◽

Page Segmentation ◽

The Web

The Web has become the largest information source, but noise content is an inevitable part of any web page. Noise content reduces the accuracy of search engines and increases server load. Information extraction technology has been developed in response, and it is mostly based on page segmentation. After analyzing existing page segmentation methods, the authors propose an approach to web page information extraction in which block nodes are identified by analyzing the attributes of HTML tags. The algorithm is easy to implement, and experiments demonstrate its good performance.
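To make the idea of identifying block nodes from HTML tag attributes concrete, here is a minimal sketch using only Python's standard `html.parser`. The abstract does not list the paper's actual rules, so the tag set `BLOCK_TAGS`, the keyword list `NOISE_HINTS`, and the names `BlockFinder` and `extract_content_blocks` are assumptions made for illustration, not the authors' algorithm.

```python
from html.parser import HTMLParser

# Hypothetical rule sets for illustration; the paper's real rules are not given.
BLOCK_TAGS = {"div", "table", "td", "section", "article", "ul"}
NOISE_HINTS = ("nav", "footer", "sidebar", "advert", "banner")
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class BlockFinder(HTMLParser):
    """Group text by enclosing block element and flag likely noise blocks."""

    def __init__(self):
        super().__init__()
        self.stack = []   # open elements: [tag, is_block, is_noise, text_parts]
        self.blocks = []  # finished blocks: (tag, is_noise, text)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never receive an end tag
        # Inspect only the id and class attributes for noise keywords.
        attr_text = " ".join(
            value.lower() for name, value in attrs
            if name in ("id", "class") and value
        )
        inherited = self.stack[-1][2] if self.stack else False
        is_noise = inherited or any(hint in attr_text for hint in NOISE_HINTS)
        self.stack.append([tag, tag in BLOCK_TAGS, is_noise, []])

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attach the text to the nearest enclosing block element.
        for entry in reversed(self.stack):
            if entry[1]:
                entry[3].append(text)
                break

    def handle_endtag(self, tag):
        if tag not in (entry[0] for entry in self.stack):
            return  # stray end tag, ignore it
        # Pop until the matching element, closing any unclosed children.
        while self.stack:
            name, is_block, is_noise, parts = self.stack.pop()
            if is_block and parts:
                self.blocks.append((name, is_noise, " ".join(parts)))
            if name == tag:
                break


def extract_content_blocks(html):
    """Return the text of every block that was not flagged as noise."""
    finder = BlockFinder()
    finder.feed(html)
    finder.close()
    return [text for _tag, is_noise, text in finder.blocks if not is_noise]
```

Matching keywords in `id` and `class` is only one possible attribute-based heuristic; a real system would likely combine it with other signals such as link density or block size.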


A Method for Ontology Module Extract

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.668-669.1198 ◽

2014 ◽

Vol 668-669 ◽

pp. 1198-1201

Author(s):

Hong Mei Zhu ◽

Liang Zhang ◽

Wei Sun

Keyword(s):

Semantic Web ◽

Data Structures ◽

Extraction Method ◽

Evaluation Methods ◽

Experimental Results ◽

Ontology Engineering ◽

Extraction Algorithm ◽

Ontology Extraction ◽

Ontology Module ◽

Run Time

In the Semantic Web, extensive reuse of existing large ontologies is one of the central ideas of ontology engineering. Ontology extraction should return a relevant sub-ontology that covers a given sub-vocabulary. Existing ontology extraction algorithms are relatively inefficient when they try to obtain a suitable ontology module from an ontology at run time. This paper proposes an ontology module extraction method. Related concepts and criteria of ontology module extraction are studied; data structures and identification and evaluation methods for ontology module extraction are discussed; preliminary experimental results and the corresponding analysis are also presented.


Automatic web page segmentation and information extraction using conditional random fields

Proceedings of the 2012 IEEE 16th International Conference on Computer Supported Cooperative Work in Design (CSCWD) ◽

10.1109/cscwd.2012.6221840 ◽

2012 ◽

Cited By ~ 1

Author(s):

Yunfei Gong ◽

Qiang Liu

Keyword(s):

Information Extraction ◽

Random Fields ◽

Conditional Random Fields ◽

Web Page ◽

Page Segmentation


A Unified Microblog Web Page Structured Information Extraction Method Based on Hierarchical Clustering

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.427-429.2489 ◽

2013 ◽

Vol 427-429 ◽

pp. 2489-2492 ◽

Cited By ~ 1

Author(s):

Tian Yu Zhao ◽

Jian Yi Liu ◽

Ru Zhang

Keyword(s):

Information Extraction ◽

Hierarchical Clustering ◽

Extraction Method ◽

High Performance ◽

Service Providers ◽

Web Pages ◽

Web Page ◽

The World ◽

Structured Information ◽

Rich Information

Millions of users around the world contribute rich information to microblogs. However, little work has been done so far on microblog web page extraction. We propose a unified structured information extraction method based on hierarchical clustering that is suitable for the web pages of any microblog website. Experimental results on microblog web pages from several popular microblog service providers indicate the high performance of our method.
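As a rough illustration of how hierarchical clustering could group structurally similar page fragments, the sketch below runs single-linkage agglomerative clustering over tuples of tag names. The abstract does not specify the features, distance measure, or linkage the authors use, so the Jaccard distance, the single-linkage rule, the stopping threshold, and the names `jaccard_distance` and `agglomerative` are all assumptions made for this example.

```python
def jaccard_distance(a, b):
    """Distance between two collections of tag names (0 = identical sets)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def agglomerative(items, threshold):
    """Single-linkage agglomerative clustering.

    Starts with one cluster per item and repeatedly merges the two closest
    clusters, stopping once the closest pair is farther apart than threshold.
    Returns clusters as sorted lists of item indices.
    """
    clusters = [[i] for i in range(len(items))]

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(jaccard_distance(items[i], items[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]
```

With fragments described by their child tags, post-like fragments and menu-like fragments fall into separate clusters; the threshold controls how dissimilar two fragments may be before they stop being merged.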
