Self-reproducing learning, data mining and intelligent predictive systems

Clustering has many applications in machine learning, data mining, data compression, and pattern recognition. Existing techniques such as Lloyd's algorithm (sometimes called k-means) suffer from convergence to a local optimum and offer no approximation guarantee. To overcome these shortcomings, this paper offers an efficient k-means clustering approach for stream data mining. The coreset is a popular and fundamental concept for k-means clustering in stream data. In each step, reduction determines a coreset of the inputs and represents the error, where P represents the number of input points, according to the nested property of coresets. Hence, a small reduction in error makes the final coreset n times more accurate. This motivated the authors to propose a new coreset-reduction algorithm. The proposed algorithm was executed on the Covertype, Spambase, Census 1990, Bigcross, and Tower datasets. It outperforms competing algorithms such as StreamKM++, BICO (BIRCH meets Coresets for k-means clustering), and BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies).
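The local-optimum problem mentioned in this abstract can be illustrated with a minimal sketch of Lloyd's algorithm. This is not the paper's coreset-reduction method. The function names `lloyd_kmeans` and `kmeans_cost`, the explicit initial centers, and the toy points are assumptions made for illustration. Per-point weights are included because a coreset is typically a small weighted point set, but that use is likewise an assumption here.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, weights, centers, iters=100):
    """Weighted Lloyd's algorithm started from the given initial centers."""
    centers = list(centers)
    k = len(centers)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p, w in zip(points, weights):
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append((p, w))
        # Update step: each center moves to the weighted mean of its cluster.
        new_centers = []
        for j, members in enumerate(clusters):
            if not members:  # keep an empty cluster's center where it is
                new_centers.append(centers[j])
                continue
            total = sum(w for _, w in members)
            dim = len(members[0][0])
            new_centers.append(tuple(
                sum(w * p[d] for p, w in members) / total for d in range(dim)
            ))
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers


def kmeans_cost(points, weights, centers):
    """Weighted sum of squared distances to the nearest center."""
    return sum(w * min(sq_dist(p, c) for c in centers)
               for p, w in zip(points, weights))


# Two well-separated pairs of points; the outcome depends on initialization.
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
w = [1, 1, 1, 1]
bad = lloyd_kmeans(pts, w, [(0, 0), (0, 1)])    # both seeds in the same pair
good = lloyd_kmeans(pts, w, [(0, 0), (10, 0)])  # one seed per pair
print(kmeans_cost(pts, w, bad))   # 100.0 (stuck in a local optimum)
print(kmeans_cost(pts, w, good))  # 1.0
```

Both runs converge, but the first stops at a clustering whose cost is 100 times that of the second, which is the behavior the abstract describes as convergence to a local optimum with no approximation guarantee.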

Big Data Models and the Public Sector

Web Services ◽

10.4018/978-1-5225-7501-6.ch007 ◽

2019 ◽

pp. 105-126

Author(s):

N. Nawin Sona

Keyword(s):

Machine Learning ◽

Data Mining ◽

Big Data ◽

Public Sector ◽

Predictive Analytics ◽

Data Types ◽

The Public ◽

Emerging Trends ◽

Wide Range ◽

Learning Data

This chapter gives an overview of the wide range of Big Data approaches and technologies available today. The data features of Volume, Velocity, and Variety are examined against new database technologies. The chapter explores the complexity of data types; methodologies of storage, access, and computation; current and emerging trends in data analysis; and methods of extracting value from data. It aims to address the need for clarity regarding the future of RDBMS and the newer systems. It also highlights methods, such as Machine Learning, Data Mining, and Predictive Analytics, by which actionable insights can be built into public sector domains.

Efficiency Analysis of Genetic Algorithm and Genetic Programming in Data Mining and Image Processing

Handbook of Research on Manufacturing Process Modeling and Optimization Strategies - Advances in Logistics, Operations, and Management Science ◽

10.4018/978-1-5225-2440-3.ch016 ◽

2017 ◽

pp. 334-360

Author(s):

Ayan Chatterjee ◽

Mahendra Rong

Keyword(s):

Data Mining ◽

Image Processing ◽

Mosaic Generation ◽

Processor Unit ◽

Photo Mosaic ◽

Segmentation Of Images ◽

Graphics Processor Unit ◽

Image Pattern ◽

Learning Data

Today, in the age of artificial intelligence and machine learning, data mining and image processing are two important platforms. GA and GP are value-based and program-based randomized search tools, respectively, and both are very useful for handling different issues in the fields of data mining and image processing. This chapter reviews the ability of GA and GP in some applications of these two fields. The selected subfields of data mining are market analysis, fraud detection, risk management, sports analysis, protein interaction, classification of data, drug discovery, and feature construction. The corresponding subfields of image processing are enhancement and segmentation of images, face recognition, photo mosaic generation, data embedding, image pattern classification, object detection, and Graphics Processor Unit (GPU) development. The efficiencies of GA and GP in these applications are analyzed with corresponding parameters and compared with non-GA and non-GP approaches in the same subfields.
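To make the "value-based randomized search" description of GA concrete, here is a minimal sketch of a genetic algorithm over fixed-length bit strings. It is not taken from the chapter. The function name `genetic_algorithm`, the parameter values, tournament selection, one-point crossover, elitism, and the toy "count the ones" fitness are all assumptions chosen for illustration. GP, which evolves programs rather than value vectors, is not covered.

```python
import random


def tournament(pop, fitness, rng):
    """Pick two distinct individuals at random and return the fitter one."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def genetic_algorithm(fitness, n_bits, pop_size=30, generations=60,
                      mutation_rate=0.02, seed=0):
    """Maximize `fitness` over bit lists of length n_bits (n_bits >= 2)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    pop = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            mum = tournament(pop, fitness, rng)
            dad = tournament(pop, fitness, rng)
            cut = rng.randrange(1, n_bits)   # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit independently with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best


# Toy problem (an assumption): fitness is the number of 1 bits.
best = genetic_algorithm(sum, n_bits=20)
print(best, sum(best))
```

Because the best individual is copied into every new generation, the best fitness never decreases from one generation to the next. Real applications such as those listed in the abstract would replace the toy fitness with a domain-specific objective.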

The Elements of Statistical Learning, Data Mining, Inference, and Prediction. Trevor Hastie, Robert Tibshirani and Jerome Friedman, Springer, New York, 2001. No. of pages: xvi+533. ISBN 0-387-95284-5

Statistics in Medicine ◽

10.1002/sim.1616 ◽

2004 ◽

Vol 23 (3) ◽

pp. 528-529 ◽

Cited By ~ 2

Author(s):

Hans C. Van Houwelingen

Keyword(s):

Data Mining ◽

New York ◽

Statistical Learning ◽

Jerome Friedman ◽

Learning Data

Statistical and Machine-Learning Data Mining

10.1201/b11508 ◽

2011 ◽

Cited By ~ 12

Author(s):

Bruce Ratner ◽

Stephen Day ◽

Christopher Davies

Keyword(s):

Machine Learning ◽

Data Mining ◽

Learning Data

A Review on Various Algorithms used in Machine Learning

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1952248 ◽

2019 ◽

pp. 915-920

Author(s):

Divya Chaudhary ◽

Er. Richa Vasuja

Keyword(s):

Machine Learning ◽

Data Mining ◽

New Technologies ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Learning System ◽

Training Set ◽

Learning Data

In today's scenario, a lot of data is being generated by every one of us, so it becomes vital for us to handle this data. To do so, new technologies such as machine learning and data mining are being developed. This paper presents a study of machine learning (ML). Machine learning algorithms repeatedly produce precise approximations. A machine learning system effectively "learns" how to make predictions from a training set of completed jobs. The main purpose of the review is to give a rough overview of the most commonly used algorithms in machine learning.

Using Tic-Tac-Toe for Learning Data Mining Classifications and Evaluations

International Journal of Information and Education Technology ◽

10.7763/ijiet.2013.v3.314 ◽

2013 ◽

pp. 437-441

Author(s):

Chen-Huei Chou

Keyword(s):

Data Mining ◽

Learning Data

Mining RFID Behavior Data using Unsupervised Learning

International Journal of Applied Logistics ◽

10.4018/jal.2010090203 ◽

2010 ◽

Vol 1 (1) ◽

pp. 28-47 ◽

Cited By ~ 4

Author(s):

Guénaël Cabanes ◽

Younès Bennani ◽

Dominique Fresneau

Keyword(s):

Data Mining ◽

Unsupervised Learning ◽

Radio Frequency Identification ◽

Spatial Organization ◽

Self Organizing Map ◽

Automatic Data ◽

Frequency Identification ◽

Spatio Temporal ◽

Tracking Technology ◽

Learning Data

Radio Frequency IDentification (RFID) is an advanced tracking technology that can be used to study the spatial organization of individuals' spatio-temporal activity. The aim of this work is, first, to build a new RFID-based autonomous system that can follow individuals' spatio-temporal activity, a tool not currently available. Second, the authors aim to develop new tools for automatic data mining. In this paper, they study how to transform these data to investigate the division of labor and the intra-colonial cooperation and conflict in an ant colony. They also develop a new unsupervised learning data mining method (DS2L-SOM: Density based Simultaneous Two-Level - Self Organizing Map) to find homogeneous clusters (i.e., sets of individuals that share a similar behavior). According to the experimental results, this method is very fast and efficient. It also allows a very useful visualization of the results.

Sentiment Analysis of College Reviews using Machine Learning & Data Mining

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2019.7041 ◽

2019 ◽

Vol 7 (7) ◽

pp. 277-284

Author(s):

Mr. Ghanshyam Gupta

Keyword(s):

Machine Learning ◽

Data Mining ◽

Sentiment Analysis ◽

Learning Data

Data Mining, Machine Learning, Deep Learning, Chemometrics. Definitions, common points and Trends (Spoiler Alert: VALIDATE your models!)

Brazilian Journal of Analytical Chemistry ◽

10.30744/brjac.2179-3425.ar-38-2021 ◽

2021 ◽

Vol 8 (32) ◽

pp. 22-38

Author(s):

José Manuel Amigo

Keyword(s):

Analytical Chemistry ◽

Artificial Intelligence ◽

Machine Learning ◽

Data Mining ◽

Deep Learning ◽

Daily Life ◽

Big Data Analysis ◽

Mining Machine ◽

Learning Data

Concepts like Machine Learning, Data Mining or Artificial Intelligence have become part of our daily life. This is mostly due to the incredible advances made in computation (hardware and software), the increasing capabilities of generating and storing all types of data and, especially, the benefits (societal and economic) that the analysis of such data generates. Simultaneously, Chemometrics has played an important role since the late 1970s, analyzing data within natural science (and especially in Analytical Chemistry). Despite the strong parallels between all of the abovementioned terms, and although they are familiar to most of us, it is still difficult to clearly define or differentiate the meaning of Machine Learning, Data Mining, Artificial Intelligence, Deep Learning and Chemometrics. This manuscript sheds some light on the definitions of Machine Learning, Data Mining, Artificial Intelligence and Big Data Analysis, defines their application ranges and seeks an application space within the field of analytical chemistry (a.k.a. Chemometrics). The manuscript is full of personal, sometimes probably subjective, opinions and statements. Therefore, all opinions here are open for constructive discussion with the sole purpose of Learning (like the Machines do nowadays).
