A Framework for Organizational Data Analysis and Organizational Data Mining

2008 ◽  
pp. 449-468 ◽  
Author(s):  
Bernd Knobloch

This chapter introduces a framework for organizational data analysis suited for data-driven and hypotheses-driven problems. It shows why knowledge discovery and hypothesis verification are complementary approaches and how they can be chained together. It presents a methodology for organizational data analysis including a comprehensive processing scheme. Employing a plug-in metaphor, data analysis process engineering is introduced as a way to set up data analysis processes based on taxonomies of tasks that have to be performed during data analysis and on the idea of re-using experience from past data analysis projects. The framework aims at increasing the benefits of data mining and other data analysis approaches, by allowing a wider range of business problems to be tackled and by providing the users with structured guidance for planning and running analyses.

2011 ◽  
pp. 334-356 ◽  
Author(s):  
Bernd Knobloch

This chapter introduces a framework for organizational data analysis suited for data-driven and hypotheses-driven problems. It shows why knowledge discovery and hypothesis verification are complementary approaches and how they can be chained together. It presents a methodology for organizational data analysis including a comprehensive processing scheme. Employing a plug-in metaphor, data analysis process engineering is introduced as a way to set up data analysis processes based on taxonomies of tasks that have to be performed during data analysis and on the idea of re-using experience from past data analysis projects. The framework aims at increasing the benefits of data mining and other data analysis approaches, by allowing a wider range of business problems to be tackled and by providing the users with structured guidance for planning and running analyses.


2020 ◽  
Author(s):  
Alessandra Maciel Paz Milani ◽  
Fernando V. Paulovich ◽  
Isabel Harb Manssour

Analyzing and managing raw data are still a challenging part of the data analysis process, mainly regarding data preprocessing. Although we can find studies proposing design implications or recommendations for visualization solutions in the data analysis scope, they do not focus on challenges during the preprocessing phase. Likewise, the current Visual Analytics processes do not consider preprocessing an equally important stage in their process. Thus, with this study, we aim to contribute to the discussion of how we can use and combine methods of visualization and data mining to assist data analysts during the preprocessing activities. To achieve that, we introduce the Preprocessing Profiling Model for Visual Analytics, which contemplates a set of features to inspire the implementation of new solutions. In turn, these features were designed considering a list of insights we obtained during an interview study with thirteen data analysts. Our contributions can be summarized as offering resources to promote a shift to a visual preprocessing.


2008 ◽  
pp. 2734-2748
Author(s):  
Henry Dillon ◽  
Beverley Hope

Knowledge discovery in databases (KDD) is a field of research that studies the development and use of various data analysis tools and techniques. KDD research has produced an array of models, theories, functions and methodologies for producing knowledge from data. However, despite these advances, nearly two thirds of information technology (IT) managers say that data mining products are too difficult to use in a business context. This chapter discusses how advances in data mining translate into the business context. It highlights the art of business implementation rather than the science of KDD.


2011 ◽  
Vol 50 (06) ◽  
pp. 536-544 ◽  
Author(s):  
M. Diomidous ◽  
I. N. Sarkar ◽  
K. Takabayashi ◽  
A. Ziegler ◽  
A. T. McCray ◽  
...  

Summary. Background: Medicine and biomedical sciences have become data-intensive fields, which, at the same time, enable the application of data-driven approaches and require sophisticated data analysis and data mining methods. Biomedical informatics provides a proper interdisciplinary context to integrate data and knowledge when processing available information, with the aim of giving effective decision-making support in clinics and translational research. Objectives: To reflect on different perspectives related to the role of data analysis and data mining in biomedical informatics. Methods: On the occasion of the 50th year of Methods of Information in Medicine a symposium was organized, which reflected on opportunities, challenges and priorities of organizing, representing and analysing data, information and knowledge in biomedicine and health care. The contributions of experts with a variety of backgrounds in the area of biomedical data analysis have been collected as one outcome of this symposium, in order to provide a broad, though coherent, overview of some of the most interesting aspects of the field. Results: The paper presents sections on data accumulation and data-driven approaches in medical informatics, data and knowledge integration, statistical issues for the evaluation of data mining models, translational bioinformatics and bioinformatics aspects of genetic epidemiology. Conclusions: Biomedical informatics represents a natural framework to properly and effectively apply data analysis and data mining methods in a decision-making context. In the future, it will be necessary to preserve the inclusive nature of the field and to foster an increasing sharing of data and methods between researchers.


2021 ◽  
Vol 4 ◽  
Author(s):  
Shailesh Tripathi ◽  
David Muhr ◽  
Manuel Brunner ◽  
Herbert Jodlbauer ◽  
Matthias Dehmer ◽  
...  

The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data- and model development-related issues. These issues need to be carefully addressed by allowing a flexible, customized and industry-specific knowledge discovery framework. For this reason, extensions of CRISP-DM are needed. In this paper, we provide a detailed review of CRISP-DM and summarize extensions of this model into a novel framework we call Generalized Cross-Industry Standard Process for Data Science (GCRISP-DS). This framework is designed to allow dynamic interactions between different phases to adequately address data- and model-related issues for achieving robustness. Furthermore, it also emphasizes the need for a detailed business understanding and the interdependencies with the developed models and data quality for fulfilling higher business objectives. Overall, such a customizable GCRISP-DS framework provides an enhancement for model improvements and reusability by minimizing robustness issues.


Author(s):  
Wenhao Shu ◽  
Wenbin Qian ◽  
Yonghong Xie ◽  
Zhaoping Tang

Attribute reduction plays an important role in knowledge discovery and data mining. Because many data analysis tasks involve data characterized by interval values and missing values, it is worthwhile to study attribute reduction for interval-valued data with missing values. Uncertainty measures can supply efficient viewpoints, which help us to disclose the substantive characteristics of such data. Therefore, this paper addresses the attribute reduction problem based on an uncertainty measure for interval-valued data with missing values. First, an uncertainty measure is provided for measuring candidate attributes, and then an efficient attribute reduction algorithm is developed for interval-valued data with missing values. To improve the efficiency of attribute reduction, the objects that fall within the positive region are deleted from the whole object set in the process of selecting attributes. Finally, experimental results demonstrate that the proposed algorithm can find a subset of attributes in much shorter time than existing attribute reduction algorithms without losing classification performance.
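
The idea of deleting positive-region objects while selecting attributes can be illustrated with a minimal sketch. It is not the algorithm from the paper: as a loudly stated assumption, it swaps in the classical rough-set positive region on complete categorical data in place of the paper's uncertainty measure for interval-valued data with missing values. The function names `partition`, `positive_region` and `greedy_reduct`, and the toy data, are hypothetical.

```python
from collections import defaultdict


def partition(rows, attrs, universe):
    """Group the object indices in `universe` by their values on `attrs`."""
    blocks = defaultdict(list)
    for i in universe:
        blocks[tuple(rows[i][a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region(rows, labels, attrs, universe):
    """Objects whose equivalence class under `attrs` carries a single label."""
    pos = set()
    for block in partition(rows, attrs, universe):
        if len({labels[i] for i in block}) == 1:
            pos.update(block)
    return pos


def greedy_reduct(rows, labels):
    """Greedy forward selection (assumed stand-in for the paper's measure).

    After each pick, objects already in the positive region are removed,
    so later candidates are scored on a shrinking object set.
    """
    n_attrs = len(rows[0])
    universe = set(range(len(rows)))
    # Objects that the full attribute set classifies consistently.
    target = positive_region(rows, labels, list(range(n_attrs)), universe)
    selected = []
    while universe & target:
        best_attr, best_pos = None, set()
        for a in range(n_attrs):
            if a in selected:
                continue
            pos = positive_region(rows, labels, selected + [a], universe)
            if best_attr is None or len(pos) > len(best_pos):
                best_attr, best_pos = a, pos
        if best_attr is None:
            break
        selected.append(best_attr)
        universe -= best_pos  # delete objects that are already decided
    return selected


# Toy data: the label is "B" only when attribute 0 is 1 and attribute 2 is 0.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1), (1, 0, 1), (0, 1, 1)]
labels = ["A", "A", "B", "A", "A", "A"]
reduct = greedy_reduct(rows, labels)
print(reduct)  # [0, 2]
```

In this sketch the second round scores candidates on only three remaining objects instead of six, which is the source of the speed-up the abstract describes; handling interval values and missing values would require a different similarity relation than the exact-match grouping used here.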


2021 ◽  
Vol 5 (4) ◽  
pp. 23-26
Author(s):  
Ning Yang

An enterprise Business Intelligence (BI) system mines the enterprise's existing databases and, through comprehensive processing, analyzes the data according to customer requirements. Such a system offers efficient data analysis and convenient operation. This paper mainly analyzes the application of enterprise BI data analysis systems in enterprises.


Author(s):  
Tomas Ruzgas ◽  
Kristina Jakubėlienė ◽  
Aistė Buivytė

The article deals with exploration methods and tools for big data. It defines the notion of big data, identifies the challenges encountered in the analysis of big data, and describes the technology for big data analysis. It also provides an overview of tools designed for big data analytics.


2014 ◽  
Vol 513-517 ◽  
pp. 1672-1675
Author(s):  
Ding Li

To address the inefficiency of manual DM modeling and the difficulty of knowledge reuse, this paper studies DM application characteristics, technical characteristics and business data characteristics, explores an automatic DM modeling method and designs an evaluation system for DM modeling. Based on the research on automatic DM modeling and combined with MAS, we set up a framework for MAS-based automatic selection of DM modeling, and assess the feasibility of the severity selection process by data analysis.

