Self-reproducing learning, data mining and intelligent predictive systems

Author(s):  
J.K. Huang
Author(s):  
Manmohan Singh ◽  
Rajendra Pamula ◽  
Alok Kumar

Clustering has various applications in machine learning, data mining, data compression, and pattern recognition. Existing techniques such as Lloyd's algorithm (sometimes called k-means) suffer from convergence to a local optimum with no approximation guarantee. To overcome these shortcomings, this paper offers an efficient k-means clustering approach for stream data mining. The coreset is a popular and fundamental concept for k-means clustering in stream data. In each step, a reduction determines a coreset of the inputs and represents the error, where P represents the number of input points, according to the nested property of coresets. Hence, a small reduction in the error of the final coreset makes the result n times more accurate. This motivated the authors to propose a new coreset-reduction algorithm. The proposed algorithm was executed on the Covertype, Spambase, Census 1990, Bigcross, and Tower datasets. It outperforms competitive algorithms such as StreamKM++, BICO (BIRCH meets Coresets for k-means clustering), and BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies).
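The local-optimum issue mentioned in this abstract can be illustrated with a minimal Python sketch of Lloyd's algorithm. This is not the paper's coreset-reduction algorithm; the function names (`sq_dist`, `lloyd_kmeans`), the optional `weights` argument (included because coreset points are usually weighted), and the toy data are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_kmeans(
    points: List[Point],
    centers: List[Point],
    weights: Optional[List[float]] = None,
    max_iter: int = 100,
) -> Tuple[List[Point], float]:
    """Run Lloyd's iterations from the given initial centers.

    Returns the final centers and the weighted k-means cost.
    """
    if weights is None:
        weights = [1.0] * len(points)
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the weighted mean of its points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [
                (p, w) for p, w, lab in zip(points, weights, labels) if lab == j
            ]
            total = sum(w for _, w in members)
            if total == 0:
                new_centers.append(old)  # keep an empty cluster's center
                continue
            new_centers.append(
                tuple(
                    sum(w * p[d] for p, w in members) / total
                    for d in range(len(old))
                )
            )
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    cost = sum(
        w * min(sq_dist(p, c) for c in centers)
        for p, w in zip(points, weights)
    )
    return centers, cost


if __name__ == "__main__":
    # Four corners of a 4 x 1 rectangle, clustered with k = 2.
    pts = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
    # Left/right initialisation reaches the better clustering.
    print(lloyd_kmeans(pts, [(0.0, 0.5), (4.0, 0.5)]))
    # Bottom/top initialisation is already a fixed point, but a worse one.
    print(lloyd_kmeans(pts, [(2.0, 0.0), (2.0, 1.0)]))
```

On this toy input both runs converge immediately, but the second initialisation stays at a cost of 16.0 while the first reaches 1.0. This shows that the result depends on the starting centers and that plain Lloyd iterations carry no approximation guarantee.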


Web Services ◽  
2019 ◽  
pp. 105-126
Author(s):  
N. Nawin Sona

This chapter aims to give an overview of the wide range of Big Data approaches and technologies today. The data features of Volume, Velocity, and Variety are examined against new database technologies. It explores the complexity of data types, methodologies of storage, access and computation, current and emerging trends of data analysis, and methods of extracting value from data. It aims to address the need for clarity regarding the future of RDBMS and the newer systems. It also highlights methods, such as Machine Learning, Data Mining, and Predictive Analytics, by which Actionable Insights can be built into public sector domains.


Author(s):  
Ayan Chatterjee ◽  
Mahendra Rong

Today, in the age of artificial intelligence and machine learning, data mining and image processing are two important platforms. Genetic algorithms (GA) and genetic programming (GP) are value-based and program-based randomized search tools, respectively, and both are very useful for handling different issues in the fields of data mining and image processing. This chapter reviews the ability of GA and GP in some applications of these two fields. The selected subfields of data mining are market analysis, fraud detection, risk management, sports analysis, protein interaction, classification of data, drug discovery, and feature construction. The corresponding subfields of image processing are enhancement and segmentation of images, face recognition, photo mosaic generation, data embedding, image pattern classification, object detection, and Graphics Processing Unit (GPU) development. The efficiencies of GA and GP in these applications are analyzed with corresponding parameters and compared with other non-GA and non-GP approaches in the corresponding subfields.


2011
Author(s):  
Bruce Ratner ◽  
Stephen Day ◽  
Christopher Davies

Author(s):  
Divya Chaudhary ◽  
Er. Richa Vasuja

In today's scenario, data is being generated by every one of us, so it becomes vital to handle this data. To do so, new technologies such as machine learning and data mining are being developed. This paper presents a study of machine learning (ML). Machine learning algorithms repeatedly produce precise approximations. A machine learning system effectively "learns" how to make predictions from a training set of completed jobs. The main purpose of the review is to give a rough overview of the most commonly used algorithms in machine learning.


2010 ◽  
Vol 1 (1) ◽  
pp. 28-47
Author(s):  
Guénaël Cabanes ◽  
Younès Bennani ◽  
Dominique Fresneau

Radio Frequency IDentification (RFID) is an advanced tracking technology that can be used to study the spatial organization of individuals' spatio-temporal activity. The aim of this work is firstly to build a new RFID-based autonomous system that can follow individuals' spatio-temporal activity, a tool not currently available. Secondly, the authors aim to develop new tools for automatic data mining. In this paper, they study how to transform these data to investigate the division of labor and the intra-colonial cooperation and conflict in an ant colony. They also develop a new unsupervised learning data mining method (DS2L-SOM: Density based Simultaneous Two-Level - Self Organizing Map) to find homogeneous clusters (i.e., sets of individuals that share a similar behavior). According to the experimental results, this method is very fast and efficient. It also allows a very useful visualization of the results.


2021 ◽  
Vol 8 (32) ◽  
pp. 22-38
Author(s):  
José Manuel Amigo

Concepts like Machine Learning, Data Mining or Artificial Intelligence have become part of our daily life. This is mostly due to the incredible advances made in computation (hardware and software), the increasing capabilities of generating and storing all types of data and, especially, the societal and economic benefits that the analysis of such data generates. Simultaneously, Chemometrics has played an important role since the late 1970s, analyzing data within natural science (and especially in Analytical Chemistry). Despite the strong parallels between all of the abovementioned terms, and although most of us are familiar with them, it is still difficult to clearly define or differentiate the meaning of Machine Learning, Data Mining, Artificial Intelligence, Deep Learning and Chemometrics. This manuscript sheds some light on the definitions of Machine Learning, Data Mining, Artificial Intelligence and Big Data Analysis, defines their application ranges and seeks an application space within the field of analytical chemistry (a.k.a. Chemometrics). The manuscript is full of personal, sometimes probably subjective, opinions and statements. Therefore, all opinions here are open for constructive discussion with the only purpose of Learning (like the Machines do nowadays).

