Process-Based Data Mining

Author(s):  
Karim K. Hirji

In contrast to the Industrial Revolution, the Digital Revolution is happening much more quickly. For example, in 1946, the world’s first programmable computer, the Electronic Numerical Integrator and Computer (ENIAC), stood 10 feet tall, stretched 150 feet wide, cost millions of dollars, and could execute up to 5,000 operations per second. Twenty-five years later, Intel packed 12 times ENIAC’s processing power into a 12–square-millimeter chip. Today’s personal computers with Pentium processors perform in excess of 400 million instructions per second. Database systems, a subfield of computer science, have also met with notably accelerated advances. A major strength of database systems is their ability to store volumes of complex, hierarchical, heterogeneous, and time-variant data and to provide rapid access to information while correctly capturing and reflecting database updates. Together with the advances in database systems, our relationship with data has evolved from the prerelational and relational period to the data-warehouse period. Today, we are in the knowledge-discovery and data-mining (KDDM) period where the emphasis is not so much on identifying ways to store data or on consolidating and aggregating data to provide a single, unified perspective. Rather, the emphasis of KDDM is on sifting through large volumes of historical data for new and valuable information that will lead to competitive advantage. The evolution to KDDM is natural since our capabilities to produce, collect, and store information have grown exponentially. Debit cards, electronic banking, e-commerce transactions, the widespread introduction of bar codes for commercial products, and advances in both mobile technology and remote sensing data-capture devices have all contributed to the mountains of data stored in business, government, and academic databases.
Traditional analytical techniques, especially standard query and reporting and online analytical processing, are ineffective in situations involving large amounts of data and where the exact nature of information one wishes to extract is uncertain. Data mining has thus emerged as a class of analytical techniques that go beyond statistics and that aim at examining large quantities of data; data mining is clearly relevant for the current KDDM period. According to Hirji (2001), data mining is the analysis and nontrivial extraction of data from databases for the purpose of discovering new and valuable information, in the form of patterns and rules, from relationships between data elements. Data mining is receiving widespread attention in the academic and public press literature (Berry & Linoff, 2000; Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Kohavi, Rothleder, & Simoudis, 2002; Newton, Kendziorski, Richmond, & Blattner, 2001; Venter, Adams, & Myers, 2001; Zhang, Wang, Ravindranathan, & Miles, 2002), and case studies and anecdotal evidence to date suggest that organizations are increasingly investigating the potential of data-mining technology to deliver competitive advantage.

2008 ◽  
pp. 343-349
Author(s):  
Karim K. Hirji

In contrast to the Industrial Revolution, the Digital Revolution is happening much more quickly. For example, in 1946, the world’s first programmable computer, the Electronic Numerical Integrator and Computer (ENIAC), stood 10 feet tall, stretched 150 feet wide, cost millions of dollars, and could execute up to 5,000 operations per second. Twenty-five years later, Intel packed 12 times ENIAC’s processing power into a 12–square-millimeter chip. Today’s personal computers with Pentium processors perform in excess of 400 million instructions per second. Database systems, a subfield of computer science, have also met with notably accelerated advances. A major strength of database systems is their ability to store volumes of complex, hierarchical, heterogeneous, and time-variant data and to provide rapid access to information while correctly capturing and reflecting database updates.


Author(s):  
Stephan Kudyba ◽  
Richard Hoptroff

Over the years, the term data mining has been connected to various types of analytical approaches. In fact, just a few years ago, say prior to 1995, many individuals in the software industry, and business users as well, often referred to OLAP as a main component of data mining technology. More recently, however, the term has taken on a new meaning, one which will most likely prevail for years to come. As we mentioned in the previous chapter, data mining technology encompasses such methodologies as clustering, classification and segmentation, association, neural networks, and regression as the main players in this space. Other analytical processes which are related to mining, as defined in this work, include such methodologies as linear programming, Monte Carlo analysis, and Bayesian methodologies. In fact, depending on whom you ask, these techniques may actually be considered part of the data mining spectrum since they are grounded in mathematical techniques applied to historical data. The focus of this work, however, revolves around the former, more core approaches. Regardless of the type of methodology, data mining has its roots in traditional analytical techniques. Enhancements in computer processing (e.g., speed and processing power) have enabled a wider diffusion of more complex techniques, which have become more automated and user friendly and have evolved into the current state of data mining.


Author(s):  
Aastha Gupta ◽  
Himanshu Sharma ◽  
Anas Akhtar

Clustering is the process of arranging comparable data elements into groups. Clustering analysis is one of the most frequently used data mining techniques, and the strategy of the clustering algorithm has a direct influence on the clustering results. This study examines several types of algorithms, such as k-means clustering, and compares their advantages and disadvantages. It also highlights concerns with clustering algorithms, such as time complexity and accuracy, with the aim of giving better outcomes in a variety of environments. The outcomes are described in terms of big datasets. The focus of this study is on clustering algorithms in the WEKA data mining tool. Clustering divides a big data set into small groups, or clusters. It is an unsupervised approach that may be used to analyze big datasets with many characteristics, and a data-modeling technique that provides a clear picture of the data. Two clustering methods, k-means and hierarchical clustering, are explained in this survey and analyzed using the WEKA tool on different data sets. KEYWORDS: data clustering, WEKA, k-means, hierarchical clustering
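
As an illustration of the k-means idea mentioned in this abstract, the following is a minimal sketch in plain Python. It is not the WEKA implementation the survey evaluates; the function names (`squared_distance`, `k_means`), the two-dimensional points, the random initialization from the data, and the fixed iteration cap are all assumptions made for the example.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def k_means(points: List[Point], k: int, iterations: int = 100,
            seed: int = 0) -> Tuple[List[Point], List[int]]:
    """Hypothetical helper: basic k-means (Lloyd's algorithm) on 2-D points.

    Returns the final centroids and, for each input point, the index of
    the centroid it was assigned to.
    """
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels


if __name__ == "__main__":
    sample = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
              (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
    found_centroids, found_labels = k_means(sample, k=2)
    print(found_centroids, found_labels)
```

The sketch shows why the abstract stresses that the algorithm's strategy influences the result: the outcome depends on the choice of k and on the initial centroids, and each iteration costs time proportional to the number of points times k.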


2017 ◽  
Vol 4 (2) ◽  
pp. 87-93
Author(s):  
Immanuel Luigi Da Gusta ◽  
Johan Setiawan

The aims of this paper are to create a data visualization that can assist the Government in evaluating the return on the development of health facilities at the regional and provincial level in terms of human resources for medical personnel, to help the community learn how hospitals and medical personnel are distributed across the regions, and to map disease indicators in Indonesia. Health remains a major problem that the Government of Indonesia has not resolved. There are three big problems in the health sector in Indonesia: infrastructure is unevenly distributed and inadequate, professional health workers are in short supply, and the number of deaths in outbreaks of infectious diseases is still high. Data for the research are taken from BPS, 10,600 records in total after the Extract, Transform and Loading process. Converting several publications from PDF to CSV and then to MS Excel took 3 weeks. The method used is the Eight-step Data Visualization and Data Mining methodology. Tableau was chosen as the tool to create the data visualization because it can combine the dashboards into an interactive story, which makes it easier for the user to analyze the data. The result is a story with 3 dashboards that fulfills the requirements of BPS staff and was tested with a satisfactory result in the UAT (User Acceptance Test). Index Terms: Dashboard, data visualization, disease, malaria, Tableau


Author(s):  
Gary Smith

We live in an incredible period in history. The Computer Revolution may be even more life-changing than the Industrial Revolution. We can do things with computers that could never be done before, and computers can do things for us that could never be done before. But our love of computers should not cloud our thinking about their limitations. We are told that computers are smarter than humans and that data mining can identify previously unknown truths, or make discoveries that will revolutionize our lives. Our lives may well be changed, but not necessarily for the better. Computers are very good at discovering patterns, but are useless in judging whether the unearthed patterns are sensible because computers do not think the way humans think. We fear that super-intelligent machines will decide to protect themselves by enslaving or eliminating humans. But the real danger is not that computers are smarter than us, but that we think computers are smarter than us and, so, trust computers to make important decisions for us. The AI Delusion explains why we should not be intimidated into thinking that computers are infallible, that data-mining is knowledge discovery, and that black boxes should be trusted.


2021 ◽  
Author(s):  
Pang william panggantara

The fourth wave of the industrial revolution is marked by the use of information technology, artificial intelligence (AI), and automated machines. Competitive advantage has become a necessity for every business actor that wants to compete in the global market. Current conditions encourage massive transformation at all business levels and units, because every business actor can easily enter the markets of other countries. As a result, the professionalism of every business actor is highly prioritized, particularly in business decision making and continuous innovation.


2019 ◽  
Vol 9 (2) ◽  
Author(s):  
Ika Purwanti ◽  
Muhammad Dzikri Abadi ◽  
Umar Yeni Suyanto

This study explains the concept of green marketing and its role as a source of sustainable competitive advantage in industrial revolution 4.0. Environmental issues are a hot topic nowadays, as almost every country’s government and society has become more aware of them. In addition, the industrial revolution 4.0 phenomenon demands that business practices be more consumer-oriented. Public concern over environmental damage has made marketers recognize the need for and value of environmentally friendly marketing, namely green marketing, which is a new strength for creating a sustainable competitive advantage. This study is library research that gathers and analyzes information from related references and theories, which form the foundation and sources for analyzing the problems in this research. This study seeks to offer green marketing ideas as the latest approach to dealing with various business threats. The results show that green marketing is able to encourage companies to prepare themselves faster and better, and that the definition of green marketing has changed over time according to the growing relevance of environmental sustainability.


Author(s):  
R. Neni Kusumadewi ◽  
Otong Karyono

The current competitive environment induced by the 4.0 industrial revolution has forced companies to focus on managing customer service by providing added value to customers, thereby increasing competitiveness. This study aims to identify and analyze the impact of service quality and service innovation on competitive advantage. The analysis methods are descriptive statistics and Structural Equation Modeling with AMOS software. The sampling technique was purposive sampling combined with cluster proportional stratified random sampling, and the data collection instrument was a questionnaire given to retail managers, supervisors, or employees. The results indicate that service quality and service innovation have an impact on competitive advantage.

