A Review of Data Mining Techniques and Applications

Author(s):  
Ratchakoon Pruengkarn ◽  
Kok Wai Wong ◽  
Chun Che Fung

Data mining is the analytics and knowledge discovery process of analyzing large volumes of data from various sources and transforming the data into useful information. Various disciplines have contributed to its development, and it is becoming increasingly important in the scientific and industrial world. This article presents a review of data mining techniques and applications from 1996 to 2016. Techniques are divided into two main categories: predictive methods and descriptive methods. Due to the huge number of publications available on this topic, only a selected number are used in this review to highlight the developments of the past 20 years. Applications are included to provide some insights into how each data mining technique has evolved over the last two decades. Recent research trends focus more on large data sets and big data. There have also been more applications in the area of health informatics with the advent of newer algorithms.

Author(s):  
Ana Cristina Bicharra Garcia ◽  
Inhauma Ferraz ◽  
Adriana S. Vivacqua

Most past approaches to data mining have been based on association rules. However, the simple application of association rules usually only changes the user's problem from dealing with millions of data points to dealing with thousands of rules. Although this may somewhat reduce the scale of the problem, it is not a completely satisfactory solution. This paper presents a new data mining technique, called knowledge cohesion (KC), which takes into account a domain ontology and the user's interest in exploring certain data sets to extract knowledge, in the form of semantic nets, from large data sets. The KC method has been successfully applied to mine causal relations from oil platform accident reports. In a comparison with association rule techniques for the same domain, KC has shown a significant improvement in the extraction of relevant knowledge, using processing complexity and knowledge manageability as the evaluation criteria.
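The association-rule baseline that KC is compared against works by finding frequent itemsets and deriving rules from them. A minimal Apriori-style sketch in Python may help make the baseline concrete; the transactions below are invented keyword sets for illustration, not the paper's oil-platform corpus:

```python
from itertools import combinations

# Toy "accident report" keyword sets (hypothetical, for illustration only)
transactions = [
    {"gas_leak", "valve_failure", "alarm"},
    {"gas_leak", "valve_failure"},
    {"valve_failure", "alarm"},
    {"gas_leak", "alarm"},
]

def frequent_itemsets(transactions, min_support=0.5):
    """Return itemsets whose support meets min_support (naive Apriori-style scan)."""
    items = set().union(*transactions)
    n = len(transactions)
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for cand in combinations(sorted(items), k):
            support = sum(set(cand) <= t for t in transactions) / n
            if support >= min_support:
                result[cand] = support
                found = True
        if not found:  # no frequent k-itemsets -> no larger ones exist
            break
    return result

def rules(itemsets, min_confidence=0.6):
    """Generate rules A -> B with confidence = support(A ∪ B) / support(A)."""
    out = []
    for itemset, support in itemsets.items():
        if len(itemset) < 2:
            continue
        for i in range(1, len(itemset)):
            for ante in combinations(itemset, i):
                conf = support / itemsets[ante]
                if conf >= min_confidence:
                    cons = tuple(x for x in itemset if x not in ante)
                    out.append((ante, cons, conf))
    return out

fi = frequent_itemsets(transactions, 0.5)
for ante, cons, conf in rules(fi, 0.6):
    print(ante, "->", cons, f"confidence={conf:.2f}")
```

Even on four toy transactions this produces six rules, which illustrates the paper's point: association rules trade millions of data points for thousands of rules rather than producing manageable knowledge.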


2020 ◽  
Vol 1 (1) ◽  
pp. 31-40
Author(s):  
Hina Afzal ◽  
Arisha Kamran ◽  
Asifa Noreen

The market nowadays, due to the rapid changes happening in technology, requires a high level of interaction between educators and newcomers entering the job market. The demand for IT-related jobs is higher than in all other fields. In this paper, we discuss a survival analysis in the market of two parallel programming languages, Python and R. Data sets are growing large and traditional methods are not capable enough of handling them; therefore, we applied the latest data mining techniques through the Python and R programming languages. It took several months of effort to gather such an amount of data and process it with data mining techniques using Python and R, but the results showed that both languages have had the same rate of growth over the past years.


2021 ◽  
pp. 1826-1839
Author(s):  
Sandeep Adhikari ◽  
Sunita Chaudhary

The exponential growth in the use of computers over networks, as well as the proliferation of applications that operate on different platforms, has drawn attention to network security. This paradigm takes advantage of security flaws in all operating systems that are both technically difficult and costly to fix. As a result, intrusion poses a worldwide threat to a computer resource's credibility, availability, and confidentiality. The Intrusion Detection System (IDS) is critical in detecting network anomalies and attacks. In this paper, the data mining principle is combined with IDS to efficiently and quickly identify important, secret data of interest to the user. The proposed algorithm addresses four issues: data classification, high levels of human interaction, lack of labeled data, and the effectiveness of distributed denial-of-service attacks. We also develop a decision tree classifier with a variety of parameters. The previous algorithm achieved classification accuracy of up to 90% and was not appropriate for large data sets. Our proposed algorithm was designed to accurately classify large data sets. In addition, we quantify several more decision tree classifier parameters.
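The core step of any decision tree classifier, including those used in IDS, is choosing the split that maximizes information gain. A minimal sketch in Python, using made-up connection features (packets_per_sec and failed_logins are assumptions for illustration, not the paper's feature set):

```python
import math

# Toy connection records: (packets_per_sec, failed_logins), label 1 = attack.
# Entirely invented data, for illustration only.
data = [
    ((900, 0), 1), ((850, 1), 1), ((30, 5), 1),
    ((20, 0), 0), ((35, 1), 0), ((25, 0), 0),
]

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    out = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        out -= p * math.log2(p)
    return out

def best_split(data):
    """Pick the (feature_index, threshold) pair that maximizes information gain."""
    labels = [y for _, y in data]
    base = entropy(labels)
    best = (None, None, -1.0)
    for f in range(len(data[0][0])):
        for x, _ in data:  # candidate thresholds: observed feature values
            t = x[f]
            left = [y for xi, y in data if xi[f] <= t]
            right = [y for xi, y in data if xi[f] > t]
            if not left or not right:
                continue
            gain = (base
                    - len(left) / len(data) * entropy(left)
                    - len(right) / len(data) * entropy(right))
            if gain > best[2]:
                best = (f, t, gain)
    return best

f, t, gain = best_split(data)
print(f"split on feature {f} at <= {t} (information gain {gain:.2f})")
```

A full tree would recurse on the two partitions until leaves are pure or a depth limit is reached; the paper's additional classifier parameters (depth, pruning thresholds, and so on) tune exactly this recursion.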


Author(s):  
Scott Nicholson ◽  
Jeffrey Stanton

Most people think of a library as the little brick building in the heart of their community or the big brick building in the center of a campus. These notions greatly oversimplify the world of libraries, however. Most large commercial organizations have dedicated in-house library operations, as do schools, non-governmental organizations, and local, state, and federal governments. With the increasing use of the Internet and the World Wide Web, digital libraries have burgeoned, and these serve a huge variety of different user audiences.

With this expanded view of libraries, two key insights arise. First, libraries are typically embedded within larger institutions. Corporate libraries serve their corporations, academic libraries serve their universities, and public libraries serve taxpaying communities who elect overseeing representatives. Second, libraries play a pivotal role within their institutions as repositories and providers of information resources. In the provider role, libraries represent in microcosm the intellectual and learning activities of the people who comprise the institution. This fact provides the basis for the strategic importance of library data mining: by ascertaining what users are seeking, bibliomining can reveal insights that have meaning in the context of the library’s host institution.

Use of data mining to examine library data might be aptly termed bibliomining. With widespread adoption of computerized catalogs and search facilities over the past quarter century, library and information scientists have often used bibliometric methods (e.g., the discovery of patterns in authorship and citation within a field) to explore patterns in bibliographic information. During the same period, various researchers have developed and tested data mining techniques—advanced statistical and visualization methods to locate non-trivial patterns in large data sets. Bibliomining refers to the use of these bibliometric and data mining techniques to explore the enormous quantities of data generated by the typical automated library.


2017 ◽  
Vol 7 (1.1) ◽  
pp. 286
Author(s):  
B. Sekhar Babu ◽  
P. Lakshmi Prasanna ◽  
P. Vidyullatha

In current days, the World Wide Web has grown into a familiar medium for investigating new information, business trends, trading strategies, and so on. Several organizations and companies are also embracing the web in order to present their products or services across the world. E-commerce is a kind of business or commercial transaction that comprises the transfer of information across the web or internet. In this situation a huge amount of data is obtained and dumped into web services. This data overload makes it difficult to determine accurate and valuable information; hence web data mining is used as a tool to discover and mine knowledge from the web. Web data mining technology can be applied by e-commerce organizations to offer personalized e-commerce solutions and better meet the desires of customers. Data mining algorithms such as ontology-based association rule mining using the Apriori algorithm extract various useful information from large data sets. We implement the above data mining technique in Java; data sets are dynamically generated while transactions are processed, and various patterns are extracted.
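The first pass of Apriori-style pattern extraction over e-commerce transactions amounts to counting which products are bought together and keeping the pairs above a support threshold. A hedged sketch in Python (the paper's implementation is in Java with an ontology layer; the baskets below are invented product IDs):

```python
from collections import Counter
from itertools import combinations

# Hypothetical e-commerce baskets (product IDs), for illustration only
baskets = [
    {"phone", "case", "charger"},
    {"phone", "case"},
    {"laptop", "mouse"},
    {"phone", "charger"},
    {"laptop", "mouse", "bag"},
]

# Count how often each unordered product pair is bought together
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep pairs meeting a minimum support count, as Apriori does in its early passes
min_count = 2
frequent_pairs = {p: c for p, c in pair_counts.items() if c >= min_count}
for pair, count in sorted(frequent_pairs.items(), key=lambda kv: -kv[1]):
    print(pair, count)
```

Frequent pairs found this way feed the "customers who bought X also bought Y" personalization the abstract describes; an ontology layer would additionally let rules generalize over product categories rather than individual items.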


2001 ◽  
Vol 79 (7) ◽  
pp. 1209-1231 ◽  
Author(s):  
Rich Mooi

The fossil record of the Echinodermata is relatively complete, and is represented by specimens retaining an abundance of features comparable to that found in extant forms. This yields a half-billion-year record of evolutionary novelties unmatched in any other major group, making the Echinodermata a primary target for studies of biological change. Not all of this change can be understood by studying the rocks alone, leading to synthetic research programs. Study of literature from the past 20 years indicates that over 1400 papers on echinoderm paleontology appeared in that time, and that overall productivity has remained almost constant. Analysis of papers appearing since 1990 shows that research is driven by new finds including, but not restricted to, possible Precambrian echinoderms, bizarre new edrioasteroids, early crinoids, exquisitely preserved homalozoans, echinoids at the K-T boundary, and Antarctic echinoids, stelleroids, and crinoids. New interpretations of echinoderm body wall homologies, broad-scale syntheses of embryological information, the study of developmental trajectories through molecular markers, and the large-scale ecological and phenotypic shifts being explored through morphometry and analyses of large data sets are integrated with study of the fossils themselves. Therefore, recent advances reveal a remarkable and continuing synergistic expansion in our understanding of echinoderm evolutionary history.

