Mining Historical XML

Author(s):  
Qiankun Zhao ◽  
Sourav Saha Bhowmick

Nowadays the Web presents itself as the largest data repository ever available in the history of humankind (Reis et al., 2004). However, the availability of a huge amount of Web data does not imply that users can get whatever they want more easily. On the contrary, the massive amount of data on the Web has overwhelmed users' ability to find the desired information. It has been claimed that 99% of the data reachable on the Web is useless to 99% of the users (Han & Kamber, 2000, p. 436). That is, an individual may be interested in only a tiny fragment of the Web data. However, the huge size and diversity of Web data also mean that it provides a rich and unprecedented source for data mining.

Author(s):  
Ahmed El Azab ◽  
Mahmood A. Mahmood ◽  
Abd El-Aziz

Web usage mining techniques and applications across industries are still exploratory and, despite an increase in academic research, challenges remain in analyzing web data to quantitatively capture web users' common interests and characterize their underlying tasks. This chapter addresses the problem of how to support web usage mining techniques and applications across industries by combining the languages of web pages with the algorithms used in web data mining. Existing research on web usage mining techniques tends to focus on how each technique can be applied in different industries. However, there is little evidence that researchers have approached the issue of web usage mining across industries. Consequently, the aim of this chapter is to provide an overview of how web usage mining techniques and applications across industries can be supported.


2017 ◽  
Vol 7 (1.1) ◽  
pp. 286
Author(s):  
B. Sekhar Babu ◽  
P. Lakshmi Prasanna ◽  
P. Vidyullatha

The World Wide Web has become a familiar medium for finding new information, business trends, trading strategies, and so on. Many organizations and companies also use the web to present their products or services across the world. E-commerce is a kind of business or commercial transaction that involves the transfer of information across the web or internet. As a result, a huge amount of data is generated and stored in web services. This data overload makes it difficult to identify accurate and valuable information, so web data mining is used as a tool to discover and extract knowledge from the web. E-commerce organizations can apply web data mining technology to offer personalized e-commerce solutions and better meet the needs of customers. Data mining algorithms such as ontology-based association rule mining using the Apriori algorithm extract useful information from large data sets. We implement this data mining technique in Java; data sets are generated dynamically while transactions are processed, and various patterns are extracted.


Author(s):  
Lan-zhong Wang

The purpose of this study is to develop a personalized distance teaching platform. The system is built on web data mining (WDM), based on an analysis of the characteristics of WDM and the essence of personalized teaching and instruction. It contains four modules: a knowledge base, an individual database, WDM, and a web server. Simulation results show that the model offers important insight and impetus for promoting individual service and improving the teaching quality of modern distance education.


2014 ◽  
Vol 1079-1080 ◽  
pp. 601-603
Author(s):  
Dan Yang

The popularity of the network is based on the transmission of information. With the development of electronic information technology, the information society is becoming ever more data-driven, and obtaining wanted or useful information from this mass of information requires mining Web information. Previously, information was encoded in HTML, whose structure is poor, which made it difficult for Web data mining to meet search needs. In this context, XML emerged. It has a well-defined hierarchy and structure, organizes web page information better, and plays a useful role in data mining, largely overcoming the deficiencies of HTML. This paper first introduces XML and Web data mining, and on this basis analyzes XML-based Web data mining applications.


Author(s):  
Athena Vakali ◽  
George Pallis ◽  
Lefteris Angelis

The explosive growth of the Web has drastically increased information circulation and dissemination rates. As the number of both Web users and Web sources grows significantly every day, crucial data management issues, such as clustering on the Web, should be addressed and analyzed. Clustering has been proposed as a way of improving both information availability and the personalization of Web users' experience. Clusters on the Web consist of either users' sessions or Web information sources, which are managed in a variety of applications and implementation testbeds. This chapter focuses on the topic of clustering information over the Web, in an effort to survey the theoretical background and the adopted practices of the most popular, emerging, and challenging clustering research efforts. An up-to-date survey of the existing clustering schemes is given, to be of use for both researchers and practitioners interested in the area of Web data mining.


Web Mining ◽  
2011 ◽  
pp. 119-144
Author(s):  
Neil C. Rowe

We survey research on using captions in data mining from the Web. Captions are text that describes some other information (typically, multimedia). Since text is considerably easier to analyze than non-text, a good way to support access to non-text is to index the words of its captions. However, captions vary considerably in form and content on the Web. We discuss the range of syntactic clues (such as HTML tags) and semantic clues (such as particular words). We discuss how to quantify clue strength and combine clues for a consensus. We then discuss the problem of mapping information in captions to information in media objects. While it is hard, classes of mapping schemes are distinguishable, and a segmentation of the media can be matched to a parse of the caption.


Author(s):  
Rafael Berlanga ◽  
Victoria Nebot

This chapter describes the convergence of two influential technologies in the last decade, namely data mining (DM) and the Semantic Web (SW). The wide acceptance of new SW formats for describing semantics-aware and semistructured contents has spurred the massive generation of semantic annotations and of large-scale domain ontologies for conceptualizing them. As a result, a huge amount of both knowledge and semantically annotated data is available on the Web. DM methods have been very successful in discovering interesting patterns which are hidden in very large amounts of data. However, DM methods have been largely based on simple and flat data formats which are far from those available in the SW. This chapter reviews and discusses the main DM approaches proposed so far to mine SW data as well as those that have taken into account the SW resources and tools to define semantics-aware methods.


Author(s):  
Valerio Veglio

Companies have realized that the customer knowledge contained in web marketing databases represents one of the main keys to forecasting business performance in today's competitive landscape. Appropriate web data mining models are one of the best supporting approaches for making different marketing decisions. Analysing and understanding customers' behaviour in advance can be a corporation's main strength in marketing planning and forecasting. This research aims to demonstrate that predictive web data mining models are more accurate in predicting marketing performance than traditional statistical methods in global business. In addition, particular attention is paid to identifying the main marketing drivers for potential customers before they purchase a given service online. Finally, criteria based on loss functions confirm the high predictive power of the web data mining models in detecting the probability of customer conversion.


Data Mining ◽  
2013 ◽  
pp. 625-649
Author(s):  
Rafael Berlanga ◽  
Victoria Nebot

This chapter describes the convergence of two influential technologies in the last decade, namely data mining (DM) and the Semantic Web (SW). The wide acceptance of new SW formats for describing semantics-aware and semistructured contents has spurred the massive generation of semantic annotations and of large-scale domain ontologies for conceptualizing them. As a result, a huge amount of both knowledge and semantically annotated data is available on the Web. DM methods have been very successful in discovering interesting patterns which are hidden in very large amounts of data. However, DM methods have been largely based on simple and flat data formats which are far from those available in the SW. This chapter reviews and discusses the main DM approaches proposed so far to mine SW data as well as those that have taken into account the SW resources and tools to define semantics-aware methods.


2014 ◽  
Vol 687-691 ◽  
pp. 3003-3006
Author(s):  
Pu Wang

At present, the growth of the Internet has brought us a vast amount of information that we can hardly deal with. To cope with this flood of information, various data mining systems have been created to assist and augment this natural social process. Data mining recommender systems have been developed to automate the recommendation process, and they can be found in many electronic commerce applications. In this paper, a recommendation mechanism based on web data mining in electronic commerce applications is given. Then, the workflow of web data mining in electronic commerce is presented. Lastly, the usage of web data mining tools is described.

