Putting Data Before Theory

Author(s):  
Gary Smith ◽  
Jay Cordes

The traditional statistical analysis of data follows what has come to be known as the scientific method: collecting reliable data to test plausible theories. Data mining goes in the other direction, analyzing data without being motivated or encumbered by theories. The fundamental problem with data mining is simple: We think that data patterns are unusual and therefore meaningful. Patterns are, in fact, inevitable and therefore meaningless. This is why data mining is not usually knowledge discovery, but noise discovery. Finding correlations is easy. Good data scientists are not seduced by discovered patterns because they don’t put data before theory. They do not commit Texas Sharpshooter Fallacies or fall into the Feynman Trap.
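The claim that patterns are inevitable can be illustrated with a small simulation. The sketch below is not from the chapter; it assumes independent Gaussian noise series, and the function names, the number of series, the series length, and the 0.5 cutoff are all illustrative choices. It counts how many pairs of unrelated series nonetheless look strongly correlated.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_strong_pairs(n_series=50, n_obs=20, threshold=0.5, seed=0):
    """Count pairs of pure-noise series whose |r| meets the threshold.

    All parameters are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)
    # Every series is independent noise, so any correlation found is spurious.
    series = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_series)]
    pairs = list(combinations(series, 2))
    strong = sum(1 for a, b in pairs if abs(pearson(a, b)) >= threshold)
    return strong, len(pairs)


if __name__ == "__main__":
    strong, total = count_strong_pairs()
    print(f"{strong} of {total} pairs of pure-noise series have |r| >= 0.5")
```

With 50 series there are 1,225 pairs to search, so some pairs are likely to clear the cutoff by chance alone, even though no series carries any information about another.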

2002 ◽  
Vol 01 (02) ◽  
pp. 141-154
Author(s):  
Satheesh Ramachandran

This paper presents a framework for the integrated use of formal knowledge engineering methods and data-mining-based knowledge discovery methods. Knowledge is a key enterprise asset, and organizations are adopting both knowledge engineering and knowledge discovery paradigms for better knowledge management and enhanced decision support capability. Although a useful interdependence exists between these endeavors, little effort has been focused on using the full potential of one for the other. The framework combines established formal knowledge engineering methods with knowledge discovery processes, with the ultimate intent of better managing the enterprise knowledge life cycle. The paper provides a brief overview of the knowledge discovery processes, and introduces a class of formal knowledge engineering methods and the perceived role of these methods in supporting the integration between the two worlds of knowledge discovery and knowledge engineering.


2021 ◽  
Vol 66 (3) ◽  
pp. 549-559
Author(s):  
Marcin Milewski ◽  
Karolina Milewska ◽  
Anna Justyna Milewska

Preventive vaccination is one of the greatest successes of modern medicine. The SARS-CoV-2 epidemic, during which vaccination is the main method of prevention against death and severe disease, gave rise to a resurgence of anti-vaccine (anti-vax) movements. The aim of this study was to analyse the attitudes of students towards vaccination and the COVID-19 pandemic. The statistical analysis was performed using the following data-mining methods: correspondence analysis and basket analysis. The results show that students of medicine are characterized by the highest level of knowledge. Students of other medical faculties, on the other hand, have significantly less uniform views, as do students of non-medical faculties.


2017 ◽  
Vol 8 (3) ◽  
pp. 13-52 ◽  
Author(s):  
Dharmpal Singh

The main objective of this work is to develop an integrated system capable of extracting precise information (knowledge) from any stored information using the techniques of data mining and soft computing. Earlier research on knowledge discovery has applied a particular data mining or soft computing model to certain information, but the performance of that model has not been reviewed against other models. Comparisons of the performance of various models across the soft computing, statistical, and data mining domains have remained unattended, which limits existing surveys. This absence motivates research on effective knowledge discovery from a particular set of information, utilizing the versatility and view-generation potential of soft computing tools. A modified harmony search technique is proposed in this paper, and it is observed to outperform the other soft computing techniques on both training and test data. The result of the modified harmony search technique has also been cross-checked using the residual error. The concept of harmony search is also applied to other data sets to check the optimality of the models.


Data Mining ◽  
2013 ◽  
pp. 1936-1959
Author(s):  
Fernando Alonso ◽  
Loïc Martínez ◽  
Aurora Pérez ◽  
Juan Pedro Valente

Although expert elicited knowledge and data mining discovered knowledge appear to be completely opposite and competing solutions to the same problems, they are actually complementary concepts. Besides, together they maximize their individual qualities. This chapter highlights how each one profits from the other and illustrates their cooperation in existing systems developed in the medical domain. The authors have identified different types of cooperation that combine elicitation and data mining for knowledge acquisition, use expert knowledge to enact the knowledge discovery, use discovered knowledge to validate expert knowledge, and use discovered knowledge to improve the usability of an expert system. The chapter also describes their experience in combining expert and discovered knowledge in the development of a system for processing medical isokinetics data.


2013 ◽  
Vol 385-386 ◽  
pp. 1362-1365
Author(s):  
Wei Min Ouyang ◽  
Qin Hua Huang

Sequential pattern mining is an important research topic in data mining and knowledge discovery. Traditional algorithms for mining sequential patterns focus on frequent sequences and consider neither infrequent sequences nor the lifespan of each sequence. On the one hand, some infrequent patterns can provide very useful insight into the data set; on the other hand, without taking the lifespan of each sequence into account, some discovered patterns may be invalid and some useful patterns may not be discovered. We therefore extend sequential patterns to indirect temporal sequential patterns and put forward an algorithm to discover them.


Clustering is a main data mining approach for handling data and extracting useful patterns and knowledge from it, and it is part of the data mining process. Data mining is the extraction of knowledge, information, useful patterns, and reliable data from a huge amount of raw data according to the needs of the targeted sector. In technical terms, data mining finds useful patterns in raw data by using suitable techniques from statistics, machine learning, and databases. Data mining targets two major aspects of extracting meaningful patterns: large-scale extraction, for a better understanding of the shapes and profitable patterns of data that have global impact, and small-scale extraction, which deals with patterns of lesser global impact. This paper gives a brief overview of clustering techniques within the data mining process, along with their features and functionality. It concentrates mainly on clustering techniques and their algorithms, with their pros and cons, and on the need for clustering and its importance in the data mining process. The data mining principle is also explained briefly, to build a base for understanding the techniques discussed and their importance.


Author(s):  
Beatriz Alicia Leyva-Osuna ◽  
Carlos Armando Jacobo-Hernández ◽  
Ricardo Aguirre-Choix

A quantitative study was carried out on 140 small companies in the commerce sector of Ciudad Obregón, Sonora, with the objective of analyzing the impact of company resources and transformational leadership on the strategic management process; this allowed an analysis of the strategic situation of these companies. To meet the objective, the scientific method was followed: a 36-question instrument with a six-level Likert scale was applied, and the SMART-PLS structural equation model was used for the statistical analysis. The Resources variable was observed to have a positive, direct, and highly significant effect on the Strategic Management variable. Transformational Leadership, on the other hand, has a positive and direct but not statistically significant effect on the Strategic Management variable (.218, p < .090, n.s.). The research concluded that the entrepreneur is unclear about his leadership, and that when a strategy is formulated and executed efficiently through the correct allocation of resources at each stage, the company's strategy succeeds.


Author(s):  
Nithya C ◽  
Saravanan V

Data mining is a process that finds useful patterns in large amounts of data. The paper discusses a few data mining techniques and algorithms, and some of the organizations that have adopted data mining technology to improve their businesses with excellent results. Most data mining methods can handle different data types. Data mining may be defined as the science of extracting useful information from databases; it is also called knowledge discovery. Using a combination of machine learning, statistical analysis, modeling techniques, and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results.

