Web Data Extraction Techniques and Applications Using the Extensible Markup Language (XML)

The Students’ Information System (SIS) at Universiti Sultan Zainal Abidin (UniSZA) handles thousands of student records, including student information and subject registration. Efficient storage and query retrieval of these records is a database management concern, especially when large volumes of data are involved. However, the execution time for storing and retrieving these data is still considerably inefficient due to several factors. In this contribution, two database approaches, Extensible Markup Language (XML) and JavaScript Object Notation (JSON), were investigated to evaluate their suitability for handling thousands of records in SIS. The results showed that JSON is the better choice for both storage and query speed, which are essential for coping with the characteristics of students’ data. XML and JSON technologies are relatively new in comparison to the relational database. Nevertheless, JSON demonstrates greater potential to become a key database technology for handling huge data, given the annual increase in data volume.
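To make the comparison concrete, the following is a minimal sketch of storing and querying the same records in both formats using only the Python standard library. It is not the authors' benchmark: the record fields (`id`, `name`, `subject`), the sample values, and the helper names are hypothetical, and no timing is measured.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical student records; the field names are illustrative only
# and are not taken from the actual SIS schema.
records = [
    {"id": "S001", "name": "Aisyah", "subject": "CSC101"},
    {"id": "S002", "name": "Hakim", "subject": "MAT202"},
]


def to_json(rows):
    """Serialize the records as a JSON array of objects."""
    return json.dumps(rows)


def to_xml(rows):
    """Serialize the records as <students><student id=...>...</student></students>."""
    root = ET.Element("students")
    for row in rows:
        student = ET.SubElement(root, "student", id=row["id"])
        ET.SubElement(student, "name").text = row["name"]
        ET.SubElement(student, "subject").text = row["subject"]
    return ET.tostring(root, encoding="unicode")


def query_json(text, student_id):
    """Return the subject registered by the given student, or None."""
    for row in json.loads(text):
        if row["id"] == student_id:
            return row["subject"]
    return None


def query_xml(text, student_id):
    """Same query against the XML form, using ElementTree's limited XPath."""
    root = ET.fromstring(text)
    node = root.find(f"./student[@id='{student_id}']/subject")
    return node.text if node is not None else None


json_text = to_json(records)
xml_text = to_xml(records)
```

Both query helpers answer the same question from the two serializations, so wrapping them in a timer (for example `timeit`) over a larger record set would be one way to reproduce a storage and query speed comparison of the kind the abstract describes.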

Web Harvesting

Advances in Data Mining and Database Management - Web Usage Mining Techniques and Applications Across Industries ◽

10.4018/978-1-5225-0613-3.ch014 ◽

2017 ◽

pp. 351-378 ◽

Cited By ~ 1

Author(s):

B. Umamageswari ◽

R. Kalpana

Keyword(s):

Web Mining ◽

Data Extraction ◽

Web Pages ◽

Extraction Techniques ◽

Web Data ◽

Web Data Extraction ◽

Region Extraction ◽

Universal Applicability ◽

Web Harvesting

Web mining is performed on huge amounts of data extracted from the World Wide Web (WWW). Researchers have developed several state-of-the-art approaches for web data extraction. So far, the literature has focused mainly on techniques for data region extraction. Applications that are fed with the extracted data require data spread across multiple web pages, which should be crawled automatically. For this to happen, not only the data regions but also the navigation links must be extracted. Data extraction techniques are designed for specific HTML tags, which calls into question their universal applicability for extracting information from differently formatted web pages. This chapter covers the web data extraction techniques available for different kinds of data-rich pages, a classification of these techniques, and a comparison of them across many useful dimensions.
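As an illustration of extracting both a data region and the navigation links from one page, here is a minimal sketch built on Python's standard `html.parser`. It is not a technique from the chapter: the `PageExtractor` class, the sample HTML, and the choice of `<tr>`/`<td>` as the data region are assumptions made for the example.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collects table rows (the data region) and hrefs (navigation links).

    Assumption: records are laid out as <tr>/<td> rows. A page that uses
    <div> or <li> for its records would need different tag rules.
    """

    def __init__(self):
        super().__init__()
        self.rows = []    # extracted data records, one list of cells per row
        self.links = []   # candidate navigation links for a crawler to follow
        self._row = None  # cells of the row currently being parsed
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        # Only text inside an open <td> belongs to the current record.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


# Hypothetical page: a two-row data region followed by a "next page" link.
sample_html = """
<table>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>19.50</td></tr>
</table>
<a href="/page/2">Next</a>
"""

extractor = PageExtractor()
extractor.feed(sample_html)
rows, links = extractor.rows, extractor.links
```

Because the rules are hard-coded to `<tr>`, `<td>`, and `<a>`, the sketch also shows the limitation the abstract raises: an extractor tied to specific HTML tags does not carry over to pages that format their records differently.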

ETL Using Web Data Extraction Techniques

10.1007/springerreference_64872 ◽

2011 ◽

Keyword(s):

Data Extraction ◽

Extraction Techniques ◽

Web Data ◽

Web Data Extraction

Web Harvesting

The Dark Web ◽

10.4018/978-1-5225-3163-0.ch010 ◽

2018 ◽

pp. 199-226 ◽

Cited By ~ 1

Author(s):

B. Umamageswari ◽

R. Kalpana

Keyword(s):

Web Mining ◽

Data Extraction ◽

Web Pages ◽

Extraction Techniques ◽

Web Data ◽

Web Data Extraction ◽

Region Extraction ◽

Universal Applicability ◽

Web Harvesting

Web mining is performed on huge amounts of data extracted from the World Wide Web (WWW). Researchers have developed several state-of-the-art approaches for web data extraction. So far, the literature has focused mainly on techniques for data region extraction. Applications that are fed with the extracted data require data spread across multiple web pages, which should be crawled automatically. For this to happen, not only the data regions but also the navigation links must be extracted. Data extraction techniques are designed for specific HTML tags, which calls into question their universal applicability for extracting information from differently formatted web pages. This chapter covers the web data extraction techniques available for different kinds of data-rich pages, a classification of these techniques, and a comparison of them across many useful dimensions.
