Critical and Future Trends in Data Mining

Data Mining ◽  
2011 ◽  
pp. 437-452 ◽  
Author(s):  
Jeffrey Hsu

Every day, enormous amounts of information are generated from all sectors, whether it be business, education, the scientific community, the World Wide Web (WWW), or one of many readily available off-line and online data sources. From all of this, which represents a sizable repository of data and information, it is possible to generate worthwhile and usable knowledge. As a result, the field of Data Mining (DM) and knowledge discovery in databases (KDD) has grown in leaps and bounds and has shown great potential for the future (Han & Kamber, 2001). The purpose of this chapter is to survey many of the critical and future trends in the field of DM, with a focus on those which are thought to have the most promise and applicability to future DM applications.

Web Mining ◽  
2011 ◽  
pp. 69-98 ◽  
Author(s):  
Roberto Navigli

Domain ontologies are widely recognized as a key element for the so-called semantic Web, an improved, “semantic aware” version of the World Wide Web. Ontologies define concepts and interrelationships in order to provide a shared vision of a given application domain. Despite the significant amount of work in the field, ontologies are still scarcely used in Web-based applications. One of the main problems is the difficulty in identifying and defining relevant concepts within the domain. In this chapter, we provide an approach to the problem, defining a method and a tool, OntoLearn, aimed at the extraction of knowledge from Websites, and more generally from documents shared among the members of virtual organizations, to support the construction of a domain ontology. Exploiting the idea that a corpus of documents produced by a community is the most representative (although implicit) repository of concepts, the method extracts a terminology, provides a semantic interpretation of relevant terms and populates the domain ontology in an automatic manner. Finally, further manual corrections are required from domain experts in order to achieve a rich and usable knowledge resource.


Author(s):  
Dan Zhu

With the advent of technology, information is available in abundance on the World Wide Web. To obtain appropriate and useful information, users must increasingly use techniques and automated tools to search, extract, filter, analyze and evaluate desired information and resources. Data mining can be defined as the extraction of implicit, previously unknown, and potentially useful information from large databases. Text mining, by contrast, is the process of extracting information from unstructured text. A standard text mining approach involves text categorization, text clustering, concept extraction, production of granular taxonomies, sentiment analysis, document summarization, and modeling (Fan et al., 2006). Furthermore, Web mining is the discovery and analysis of useful information using the World Wide Web (Berry, 2002; Mobasher, 2007). This broad definition encompasses “web content mining,” the automated search for resources and retrieval of information from millions of websites and online databases, as well as “web usage mining,” the discovery and analysis of users’ website navigation and online service access patterns. Companies are investing significant amounts of time and money on creating, developing, and enhancing individualized customer relationships, a process called customer relationship management or CRM. Based on a report by the Aberdeen Group, worldwide CRM spending reached close to $20 billion by 2006. Today, to improve the customer relationship, most companies collect and refine massive amounts of data available through their customers. To increase the value of current information resources, data mining techniques can be rapidly implemented on existing software and hardware platforms, and integrated with new products and systems (Wang et al., 2008). If implemented on high-performance client/server or parallel processing computers, data mining tools can analyze enormous databases to answer customer-centric questions such as, “Which clients have the highest likelihood of responding to my next promotional mailing, and why?” This paper provides a basic introduction to data mining and other related technologies and their applications in CRM.


Author(s):  
David R. Danielson

Credibility evaluation processes on the World Wide Web are subject to a number of unique selective pressures. The Web’s potential for supplying timely, accurate, and comprehensive information contrasts with its lack of centralized quality control mechanisms, resulting in its simultaneous potential for doing more harm than good to information seekers. Web users must balance the problems and potentials of accepting Web content and do so in an environment for which traditional, familiar ways of evaluating credibility do not always apply. Web credibility research aims to better understand this delicate balance and the resulting evaluation processes employed by Web users. This article reviews credibility conceptualizations utilized in the field, unique characteristics of the Web relevant to credibility, theoretical perspectives on Web credibility evaluation processes, factors influencing Web credibility assessments, and future trends.


2018 ◽  
Vol 49 (1) ◽  
pp. 29-31
Author(s):  
Timo Prusti

Gaia is an operational satellite in the ESA science programme. It is gathering data for more than a billion objects. Gaia measures positions and motions of stars in our Milky Way Galaxy, but captures many asteroids and extragalactic sources as well. The first data release has already been made and exploitation by the world-wide scientific community is underway. Further data releases will follow with steadily increasing accuracy. Gaia is well on its way to providing its promised set of fundamental astronomical data.


2005 ◽  
Vol 18 (2) ◽  
pp. 46-66
Author(s):  
Heiner M. Fangerau

During the 1920s, the world-wide eugenics movement reached a peak level of popularity. Historians have stressed the key role of the textbook “Human Heredity and Racial Hygiene” in the popularisation of eugenic thinking in Germany. In this textbook the well-known scientists Erwin Baur (1875-1933), Eugen Fischer (1874-1967) and Fritz Lenz (1887-1976) tried to combine genetics, anthropology and racial hygiene to form a “Magna Carta” of eugenics. This paper aims at quantitatively reconstructing the book’s development into a standard work. 325 contemporary reviews of the book were analysed. More than 80% of the reviewers evaluated the book positively, recommending it to a variety of readers. Most of the reviewers were medical doctors concentrating on the eugenic aspects of the book. The reception study makes evident the reciprocity between eugenics as an accepted science and the academics who shaped it into a science. Explanations for the uniform reaction of the scientific community are discussed. *Key words*: reception study, interwar years, eugenics


2008 ◽  
pp. 2688-2696
Author(s):  
Edilberto Casado

Business intelligence (BI) is a key topic in business today, since it is focused on strategic decision making and on the search for value from business activities through empowering a “forward-thinking” view of the world. From this perspective, one of the most valuable concepts within BI is “knowledge discovery in databases” or “data mining,” defined as “the process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques” (SPSS, 1997).


Author(s):  
Christos Makris ◽  
Nikos Tsirakis

The World Wide Web has rapidly become the dominant Internet tool, overwhelming us with a combination of rich hypertext information, multimedia data and various resources of dynamic information. This evolution, in conjunction with the immense amount of available information, creates the need for new computational methods and techniques that can systematically provide useful information from among billions of Web pages. In other words, this situation poses great challenges for providing knowledge from Web-based information. The area of data mining has arisen over the last decade to address this type of issue. There are many methods, techniques and algorithms that accomplish different tasks in this area. All these efforts examine the data and try to find a model that fits its characteristics. Data can be either typical information from files, databases and so forth, or take the form of a stream. Streams constitute a data model where information is an undifferentiated, byte-by-byte flow that passes over time. The area of algorithms for processing data streams and associated applications has become an emerging area of interest, especially when all this is done over the Web. Generally, there are many data mining functions (Tan, Steinbach, & Kumar, 2006) that can be applied to data streams. Among them one can single out clustering, which belongs to the descriptive data mining models. Clustering is a useful and ubiquitous tool in data analysis.
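
To make the idea of clustering over a stream concrete, the following is a minimal sketch, not the authors' method: a sequential (online) k-means that sees each point exactly once and never stores the stream. The function name `online_kmeans`, the seeding rule (the first k points become the initial centroids) and the toy data are assumptions made for illustration only.

```python
from typing import Iterable, List, Sequence


def online_kmeans(stream: Iterable[Sequence[float]], k: int) -> List[List[float]]:
    """Sequential k-means: update centroids one point at a time.

    Assumption for this sketch: the first k points seed the centroids.
    """
    centroids: List[List[float]] = []
    counts: List[int] = []
    for point in stream:
        p = [float(x) for x in point]
        if len(centroids) < k:
            # Still seeding: take the point itself as a new centroid.
            centroids.append(p)
            counts.append(1)
            continue
        # Find the nearest centroid by squared Euclidean distance.
        nearest = min(
            range(k),
            key=lambda i: sum((c - x) ** 2 for c, x in zip(centroids[i], p)),
        )
        # Move that centroid toward the point by a step of 1 / (points seen),
        # which keeps it equal to the running mean of its assigned points.
        counts[nearest] += 1
        step = 1.0 / counts[nearest]
        centroids[nearest] = [
            c + step * (x - c) for c, x in zip(centroids[nearest], p)
        ]
    return centroids


if __name__ == "__main__":
    toy_stream = [(0, 0), (10, 10), (0, 2), (10, 12)]
    print(online_kmeans(toy_stream, k=2))
```

Because only the k centroids and their counts are kept in memory, the sketch fits the stream model described above, where data passes by once and cannot be revisited.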


Author(s):  
Yongjian Fu

With the rapid development of the World Wide Web or the Web, many organizations now put their information on the Web and provide Web-based services such as online shopping, user feedback, technical support, and so on. Understanding Web usage through data mining techniques is recognized as an important area.
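
As a small illustration of what Web usage mining can look like in practice, the sketch below counts page-to-page transitions within user sessions. It is a hedged example under stated assumptions: the session format (an ordered list of page names per visit), the function name `transition_counts` and the sample pages are all hypothetical and do not come from the chapter.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def transition_counts(
    sessions: Iterable[Sequence[str]],
) -> "Counter[Tuple[str, str]]":
    """Count how often users move directly from one page to another.

    Assumption for this sketch: each session is an ordered list of page names.
    """
    counts: "Counter[Tuple[str, str]]" = Counter()
    for pages in sessions:
        # Pair each page with the page visited immediately after it.
        for src, dst in zip(pages, pages[1:]):
            counts[(src, dst)] += 1
    return counts


if __name__ == "__main__":
    sample_sessions = [
        ["home", "shop", "cart"],
        ["home", "shop", "support"],
        ["home", "feedback"],
    ]
    for (src, dst), n in transition_counts(sample_sessions).most_common(3):
        print(f"{src} -> {dst}: {n}")
```

Frequent transitions of this kind are one simple usage pattern; real analyses would typically start from server logs and need session reconstruction first.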


Author(s):  
Dan Zhu

With the explosive growth of information available on the World Wide Web, users must increasingly use automated tools to find, extract, filter, and evaluate desired information and resources. Companies are investing significant amounts of time and money on creating, developing, and enhancing individualized customer relationships, a process called customer relationship management, or CRM (Berry & Linoff, 1999; Buttle, 2003; Rud, 2000). Based on a report by the Aberdeen Group, worldwide CRM spending reached $13.7 billion in 2002 and should be close to $20 billion by 2006.

