Analytical Techniques Help Enhance the Results of Data Mining: Why Filtering Out Higher Harmonics Makes It Easier to Carry a Tune

Author(s):  
Griselda Acosta ◽  
Eric Smith ◽  
Vladik Kreinovich
Web Services ◽  
2019 ◽  
pp. 618-638
Author(s):  
Goran Klepac ◽  
Kristi L. Berg

This chapter proposes a new analytical approach that consolidates the traditional analytical approach to problems such as churn detection, fraud detection, predictive modeling, and segmentation modeling with data sources and analytical techniques from the big data area. The solutions presented offer a structured approach for integrating these different concepts into one, which helps analysts as well as managers use the potential of different areas in a systematic way. By using this concept, companies have the opportunity to introduce big data potential into everyday data mining projects. As the chapter shows, neglecting big data potential often yields incomplete analytical results, which means incomplete information for business decisions and can lead to bad business decisions. The chapter also provides suggestions on how to recognize useful data sources from the big data area and how to analyze them along with traditional data sources to obtain higher-quality information for business decisions.


Author(s):  
Karim K. Hirji

In contrast to the Industrial Revolution, the Digital Revolution is happening much more quickly. For example, in 1946, the world’s first programmable computer, the Electronic Numerical Integrator and Computer (ENIAC), stood 10 feet tall, stretched 150 feet wide, cost millions of dollars, and could execute up to 5,000 operations per second. Twenty-five years later, Intel packed 12 times ENIAC’s processing power into a 12-square-millimeter chip. Today’s personal computers with Pentium processors perform in excess of 400 million instructions per second. Database systems, a subfield of computer science, have also seen notably accelerated advances. A major strength of database systems is their ability to store volumes of complex, hierarchical, heterogeneous, and time-variant data and to provide rapid access to information while correctly capturing and reflecting database updates. Together with the advances in database systems, our relationship with data has evolved from the prerelational and relational period to the data-warehouse period. Today, we are in the knowledge-discovery and data-mining (KDDM) period, where the emphasis is not so much on identifying ways to store data or on consolidating and aggregating data to provide a single, unified perspective. Rather, the emphasis of KDDM is on sifting through large volumes of historical data for new and valuable information that will lead to competitive advantage. The evolution to KDDM is natural since our capabilities to produce, collect, and store information have grown exponentially. Debit cards, electronic banking, e-commerce transactions, the widespread introduction of bar codes for commercial products, and advances in both mobile technology and remote sensing data-capture devices have all contributed to the mountains of data stored in business, government, and academic databases.
Traditional analytical techniques, especially standard query and reporting and online analytical processing, are ineffective in situations involving large amounts of data and where the exact nature of information one wishes to extract is uncertain. Data mining has thus emerged as a class of analytical techniques that go beyond statistics and that aim at examining large quantities of data; data mining is clearly relevant for the current KDDM period. According to Hirji (2001), data mining is the analysis and nontrivial extraction of data from databases for the purpose of discovering new and valuable information, in the form of patterns and rules, from relationships between data elements. Data mining is receiving widespread attention in the academic and public press literature (Berry & Linoff, 2000; Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Kohavi, Rothleder, & Simoudis, 2002; Newton, Kendziorski, Richmond, & Blattner, 2001; Venter, Adams, & Myers, 2001; Zhang, Wang, Ravindranathan, & Miles, 2002), and case studies and anecdotal evidence to date suggest that organizations are increasingly investigating the potential of data-mining technology to deliver competitive advantage.


2011 ◽  
pp. 1323-1331
Author(s):  
Jeffrey W. Seifert

A significant amount of attention appears to be focusing on how to better collect, analyze, and disseminate information. In doing so, technology is commonly and increasingly looked upon as both a tool, and, in some cases, a substitute, for human resources. One such technology that is playing a prominent role in homeland security initiatives is data mining. Similar to the concept of homeland security, while data mining is widely mentioned in a growing number of bills, laws, reports, and other policy documents, an agreed upon definition or conceptualization of data mining appears to be generally lacking within the policy community (Relyea, 2002). While data mining initiatives are usually purported to provide insightful, carefully constructed analysis, at various times data mining itself is alternatively described as a technology, a process, and/or a productivity tool. In other words, data mining, or factual data analysis, or predictive analytics, as it also is sometimes referred to, means different things to different people. Regardless of which definition one prefers, a common theme is the ability to collect and combine, virtually if not physically, multiple data sources, for the purposes of analyzing the actions of individuals. In other words, there is an implicit belief in the power of information, suggesting a continuing trend in the growth of “dataveillance,” or the monitoring and collection of the data trails left by a person’s activities (Clarke, 1988). More importantly, it is clear that there are high expectations for data mining, or factual data analysis, being an effective tool. Data mining is not a new technology but its use is growing significantly in both the private and public sectors. Industries such as banking, insurance, medicine, and retailing commonly use data mining to reduce costs, enhance research, and increase sales. 
In the public sector, data mining applications initially were used as a means to detect fraud and waste, but have grown to also be used for purposes such as measuring and improving program performance. While not completely without controversy, these types of data mining applications have gained greater acceptance. However, some national defense/homeland security data mining applications represent a significant expansion in the quantity and scope of data to be analyzed. Moreover, due to their security-related nature, the details of these initiatives (e.g., data sources, analytical techniques, access and retention practices, etc.) are usually less transparent.


Author(s):  
Zu-Hsu Lee ◽  
Richard L. Peterson ◽  
Chen-Fu Chien ◽  
Ruben Xing

The rapid growth and advances of information technology enable data to be accumulated faster and in much larger quantities (i.e., data warehousing). Faced with vast new information resources, scientists, engineers, and business people need efficient analytical techniques to extract useful information and effectively uncover new, valuable knowledge patterns.


High volumes and varieties of data pile up every day from healthcare and related fields. These big data sources, if managed and analysed properly, will provide vital knowledge. Data mining and data analytics have been playing an important role in extracting useful information from healthcare and related data sources. The knowledge extracted from these data sources guides patients and healthcare personnel towards improved health conditions. Analytical techniques from statistics and functionalities from data mining and machine learning have already proved their capability with significant contributions to the healthcare industry. The dominant functionality of data mining is classification, which has been used in mining healthcare data. Though classification is a good learning technique, it may not provide a causation model that is reliable enough for better decision making, particularly in the medical field. The present models for causality have limitations in terms of scalability and reliability. The present study examines causal models for causal relationship mining. It concludes with some proposals for causal relationship discovery that are efficient, reliable, and scalable. The proposed model makes use of some qualities of decision trees along with statistical tests and analytics. It is proposed to build the learning models on healthcare big data sources.


Author(s):  
Stephan Kudyba ◽  
Richard Hoptroff

Over the years, the term data mining has been connected to various types of analytical approaches. In fact, just a few years ago, let’s say prior to 1995, many individuals in the software industry, and business users as well, often referred to OLAP as a main component of data mining technology. More recently, however, this term has taken on a new meaning, and one which will most likely prevail for years to come. As we mentioned in the previous chapter, data mining technology encompasses such methodologies as clustering, classification and segmentation, association, neural networks, and regression as the main players in this space. Other analytical processes which are related to mining, as defined in this work, include such methodologies as Linear Programming, Monte Carlo analysis, and Bayesian methodologies. In fact, depending on who you ask, these techniques may actually be considered part of the data mining spectrum since they are grounded in mathematical techniques applied to historical data. The focus of this work, however, revolves around the former, more core approaches. Regardless of the type of methodology, data mining has its roots in traditional analytical techniques. Enhancements in computer processing (e.g., speed and processing power) have enabled a wider diffusion of more complex techniques, which have become more automated and user friendly and have evolved into the current state of data mining.


2020 ◽  
Vol 8 (2) ◽  
pp. 602-609
Author(s):  
Vetukuri Sivaramaraju ◽  
Nilambar Sethi ◽  
Renugunta Rajender

Cricket is popularly known as the game of gentlemen. The game was introduced to the world by England and has since become the second most popular game. In this context, a few data mining and analytical techniques have been proposed for it. In this work, two different scenarios are considered for predicting the winning team based on several parameters. These scenarios correspond to two standard formats of the game, namely one day international (ODI) cricket and twenty-twenty (T-20) cricket. The prediction approaches differ from each other in the types of parameters considered and the corresponding functional strategies. The strategies proposed here adopt two different approaches: one predicts the winner of a one-day match, and the other predicts the winner of a T-20 match. The approaches have been proposed separately for the two versions of the game because the strategies adopted by teams and individuals vary between them. The proposed strategies for each of the two scenarios have been individually evaluated against existing benchmark works, and in each case the proposed approach has outperformed the rest in terms of prediction accuracy. The novel heuristics proposed here reflect efficiency and accuracy in prediction on cricket data.

