Data Mining and the Web

CASPUR allows many Italian academic institutions located in the Centre-South of Italy to access more than 7 million articles through a digital library platform. The behaviour of its users was analyzed by considering their “traces”, which are stored in the web server log file. Using several web mining and data mining techniques, the author discovered a gradual and dynamic change in the way articles are accessed. In particular, there is evidence of an increase in journal browsing relative to searching. This phenomenon was interpreted using the idea that browsing better meets the needs of users who want to keep abreast of the latest advances in their scientific field than a more generic search inside the digital library does.
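The comparison between browsing and searching described above can be illustrated with a minimal Python sketch. It is not the author's analysis: it assumes that requests have already been extracted from the server log as (month, path) pairs, and the URL prefixes `/journal/` and `/search`, the function names, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical URL prefixes; a real digital library platform will use other paths.
BROWSE_PREFIX = "/journal/"
SEARCH_PREFIX = "/search"


def access_mode(path):
    """Classify a requested path as 'browse', 'search', or 'other'."""
    if path.startswith(BROWSE_PREFIX):
        return "browse"
    if path.startswith(SEARCH_PREFIX):
        return "search"
    return "other"


def browse_share_by_month(requests):
    """Return {month: browse / (browse + search)} for (month, path) pairs."""
    counts = defaultdict(Counter)
    for month, path in requests:
        counts[month][access_mode(path)] += 1

    shares = {}
    for month, modes in sorted(counts.items()):
        total = modes["browse"] + modes["search"]
        if total:  # skip months with neither browsing nor searching
            shares[month] = modes["browse"] / total
    return shares


# Synthetic requests, invented purely for illustration.
SAMPLE_REQUESTS = [
    ("month-1", "/search?q=graphene"),
    ("month-1", "/search?q=proteomics"),
    ("month-1", "/journal/123/toc"),
    ("month-2", "/journal/123/toc"),
    ("month-2", "/journal/456/toc"),
    ("month-2", "/search?q=catalysis"),
    ("month-2", "/help"),
]

if __name__ == "__main__":
    for month, share in browse_share_by_month(SAMPLE_REQUESTS).items():
        print(f"{month}: browse share = {share:.2f}")
```

A rising browse share across successive months would be consistent with the shift the abstract reports, although a real study would also need to handle robots, sessions, and repeated requests.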

Bayesian data mining on the Web with B-Course

Proceedings 2001 IEEE International Conference on Data Mining ◽ 10.1109/icdm.2001.989584 ◽ 2002 ◽ Cited By ~ 5

Author(s): P. Myllymaki ◽ T. Silander ◽ H. Tirri ◽ P. Uronen

Keyword(s): Data Mining ◽ The Web

Privacy Preserving Data Mining Services on the Web

Trust, Privacy, and Security in Digital Business - Lecture Notes in Computer Science ◽ 10.1007/11537878_25 ◽ 2005 ◽ pp. 246-255 ◽ Cited By ~ 1

Author(s): Ayça Azgın Hintoğlu ◽ Yücel Saygın ◽ Salima Benbernou ◽ Mohand Said Hacid

Keyword(s): Data Mining ◽ Privacy Preserving ◽ Privacy Preserving Data Mining ◽ The Web

Mining the Web to Add Semantics to Retail Data Mining

Web Mining: From Web to Semantic Web - Lecture Notes in Computer Science ◽ 10.1007/978-3-540-30123-3_3 ◽ 2004 ◽ pp. 43-56

Author(s): Rayid Ghani

Keyword(s): Data Mining ◽ The Web

Web Usage Mining and the Challenge of Big Data

Big Data ◽ 10.4018/978-1-4666-9840-6.ch042 ◽ 2016 ◽ pp. 899-928

Author(s): Abubakr Gafar Abdalla ◽ Tarig Mohamed Ahmed ◽ Mohamed Elhassan Seliaman

Keyword(s): Data Mining ◽ Pattern Discovery ◽ Web Usage Mining ◽ Data Mining Techniques ◽ Web Log ◽ Web Usage ◽ Web Logs ◽ Usage Patterns ◽ Rich Data ◽ The Web

The web is a rich, dynamic, and fast-growing source for data mining, providing great opportunities that are often not exploited. Web data pose a real challenge to traditional data mining techniques because of their huge volume and unstructured nature. Web logs contain information about the interactions between visitors and the website. Analyzing these logs provides insights into visitors' behavior, usage patterns, and trends. Web usage mining, also known as web log mining, is the process of applying data mining techniques to discover useful information hidden in web server logs. Web logs are primarily used by web administrators to measure how much traffic they get and to detect broken links and other types of errors. Web usage mining extracts useful information that can benefit a number of application areas, such as web personalization, website restructuring, system performance improvement, and business intelligence. The web usage mining process involves three main phases: pre-processing, pattern discovery, and pattern analysis. Various pre-processing techniques have been proposed to extract information from log files and group primitive data items into meaningful, lighter level abstractions that are suitable for mining, usually in the form of visitors' sessions. The major data mining techniques in web usage mining pattern discovery are clustering, association analysis, classification, and sequential pattern discovery. This chapter discusses the process of web usage mining, its procedure, methods, and pattern discovery techniques. The chapter also presents a practical example using real web log data.
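As a rough illustration of the pre-processing and pattern discovery phases described above, the following Python sketch parses log lines, groups requests into visitor sessions, and counts page pairs that occur together in a session (a very simple form of association analysis). It is not the chapter's method: it assumes the Common Log Format, treats each host address as one visitor, and uses a 30-minute inactivity timeout. The names `parse_line`, `sessionize`, `pair_counts`, and the sample log lines are hypothetical.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta
from itertools import combinations

# Common Log Format: host ident user [time] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity threshold


def parse_line(line):
    """Return (host, timestamp, path) for a successful request, else None."""
    match = LOG_PATTERN.match(line)
    if match is None or not match.group("status").startswith("2"):
        return None
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), timestamp, match.group("path")


def sessionize(lines):
    """Group each host's requests into sessions split by inactivity gaps."""
    by_host = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            host, timestamp, path = parsed
            by_host[host].append((timestamp, path))

    sessions = []
    for host, hits in by_host.items():
        hits.sort()  # chronological order within each host
        current = [hits[0][1]]
        for (prev_time, _), (time, path) in zip(hits, hits[1:]):
            if time - prev_time > SESSION_TIMEOUT:
                sessions.append((host, current))
                current = []
            current.append(path)
        sessions.append((host, current))
    return sessions


def pair_counts(sessions):
    """Count in how many sessions each unordered pair of pages co-occurs."""
    counts = Counter()
    for _, pages in sessions:
        for pair in combinations(sorted(set(pages)), 2):
            counts[pair] += 1
    return counts


# Synthetic log lines, invented purely for illustration.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2023:13:57:10 +0000] "GET /products.html HTTP/1.1" 200 1042',
    '10.0.0.2 - - [10/Oct/2023:14:00:00 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2023:14:01:00 +0000] "GET /missing.html HTTP/1.1" 404 512',
    '10.0.0.1 - - [10/Oct/2023:15:10:00 +0000] "GET /contact.html HTTP/1.1" 200 880',
]

if __name__ == "__main__":
    sessions = sessionize(SAMPLE_LOG)
    for host, pages in sessions:
        print(host, pages)
    print(pair_counts(sessions).most_common(3))
```

Identifying visitors by host address alone is a simplification; proxies and shared addresses blur the picture, which is one reason pre-processing is treated as a phase of its own.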

Big Data and Privacy State of the Art

Advances in Computational Intelligence and Robotics - Advanced Metaheuristic Methods in Big Data Retrieval and Analytics ◽ 10.4018/978-1-5225-7338-8.ch006 ◽ 2019 ◽ pp. 104-158

Author(s): Amine Rahmani

Keyword(s): Data Mining ◽ Big Data ◽ Exponential Growth ◽ Scientific Community ◽ Ethical Issues ◽ Massive Data ◽ Medium Term ◽ Access To Data ◽ Fast Access ◽ The Web

The phenomenon of big data (massive data mining) refers to the exponential growth of the volume of data available on the web. This new concept has become widely used in recent years, enabling scalable, efficient, and fast access to data anytime, anywhere, and helping the scientific community and companies identify the most subtle behaviors of users. However, big data has its share of limits, ethical issues, and risks that cannot be ignored. Indeed, new privacy risks are just beginning to be perceived. Sometimes merely annoying, these risks can also be genuinely harmful. In the medium term, the issue of privacy could become one of the biggest obstacles to the growth of big data solutions. It is in this context that a great deal of research is under way to enhance security and develop mechanisms for protecting users' privacy. Although this area is still in its infancy, the list of possibilities continues to grow.

Big Data and Privacy State of the Art

Research Anthology on Blockchain Technology in Business, Healthcare, Education, and Government ◽ 10.4018/978-1-7998-5351-0.ch054 ◽ 2021 ◽ pp. 947-991

Author(s): Amine Rahmani

Keyword(s): Data Mining ◽ Big Data ◽ Exponential Growth ◽ Scientific Community ◽ Ethical Issues ◽ Massive Data ◽ Medium Term ◽ Access To Data ◽ Fast Access ◽ The Web

The phenomenon of big data (massive data mining) refers to the exponential growth of the volume of data available on the web. This new concept has become widely used in recent years, enabling scalable, efficient, and fast access to data anytime, anywhere, and helping the scientific community and companies identify the most subtle behaviors of users. However, big data has its share of limits, ethical issues, and risks that cannot be ignored. Indeed, new privacy risks are just beginning to be perceived. Sometimes merely annoying, these risks can also be genuinely harmful. In the medium term, the issue of privacy could become one of the biggest obstacles to the growth of big data solutions. It is in this context that a great deal of research is under way to enhance security and develop mechanisms for protecting users' privacy. Although this area is still in its infancy, the list of possibilities continues to grow.

Web Usage Mining Issues in Big Data

Impacts and Challenges of Cloud Business Intelligence - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽ 10.4018/978-1-7998-5040-3.ch007 ◽ 2021 ◽ pp. 102-112

Author(s): Sunny Sharma ◽ Manisha Malhotra

Keyword(s): Data Mining ◽ Big Data ◽ User Behavior ◽ Web Usage Mining ◽ Web Personalization ◽ Data Mining Techniques ◽ Meaningful Information ◽ Web Usage ◽ Use Of Data ◽ The Web

Web usage mining is the use of data mining techniques to analyze user behavior in order to better serve the needs of the user. This process of personalization uses a set of techniques and methods for discovering the linking structure of information on the web. The goal of web personalization is to improve the user experience by mining meaningful information and presenting the retrieved information in the way the user intends. The arrival of big data has raised novel issues for the personalization community. This chapter provides an overview of personalization and big data, and identifies challenges related to web personalization with respect to big data. It also presents some approaches and models to fill the gap between big data and web personalization. Further, this research brings additional opportunities to web personalization from the perspective of big data.

Effectiveness of Web Usage Mining Techniques in Business Application

Advances in Data Mining and Database Management - Web Usage Mining Techniques and Applications Across Industries ◽ 10.4018/978-1-5225-0613-3.ch013 ◽ 2017 ◽ pp. 324-350 ◽ Cited By ~ 2

Author(s): Ahmed El Azab ◽ Mahmood A. Mahmood ◽ Abd El-Aziz

Keyword(s): Data Mining ◽ Academic Research ◽ Web Usage Mining ◽ Web Pages ◽ Web Data ◽ Web Data Mining ◽ Web Usage ◽ Business Application ◽ Common Interests ◽ The Web

The study of web usage mining techniques and applications across industries is still exploratory and, despite an increase in academic research, challenges remain in analyzing the web in ways that quantitatively capture web users' common interests and characterize their underlying tasks. This chapter addresses the problem of how to support web usage mining techniques and applications across industries by combining the language of web pages with the algorithms used in web data mining. Existing research on web usage mining techniques tends to focus on how each technique can be applied in different industries. However, there is little evidence that researchers have approached the issue of web usage mining across industries. Consequently, the aim of this chapter is to provide an overview of how web usage mining techniques and applications across industries can be supported.

Depressive Person Detection using Social Asian Elephants' (SAE) Algorithm over Twitter Posts

International Journal of Organizational and Collective Intelligence ◽ 10.4018/ijoci.2019100103 ◽ 2019 ◽ Vol 9 (4) ◽ pp. 37-51

Author(s): Hadj Ahmed Bouarara

Keyword(s): Data Mining ◽ Social Network ◽ Decision Tree ◽ Social Life ◽ Naive Bayes ◽ Data Sources ◽ Asian Elephants ◽ Person Detection ◽ The Social ◽ The Web

With the advent of the web and the explosion of data sources such as opinion sites, blogs, and microblogs came the need to analyze millions of posts, tweets, or opinions in order to find out what net surfers think. The idea was to produce a new algorithm inspired by the social life of Asian elephants to detect a person in a depressive situation through analysis of the Twitter social network. The proposed algorithm gives better performance than data mining and bioinspired techniques such as naive Bayes, decision tree, the heart lungs algorithm, and the social cockroach's algorithm.
