An online classification algorithm for large scale data streams: iGNGSVM

2017 ◽  
Vol 262 ◽  
pp. 67-76 ◽  
Author(s):  
Andrés L. Suárez-Cetrulo ◽  
Alejandro Cervantes
2020 ◽  
Vol 204 ◽  
pp. 106186 ◽  
Author(s):  
Fang Liu ◽  
Yanwei Yu ◽  
Peng Song ◽  
Yangyang Fan ◽  
Xiangrong Tong

2016 ◽  
Vol 194 ◽  
pp. 107-116 ◽  
Author(s):  
Jingsong Shan ◽  
Jianxin Luo ◽  
Guiqiang Ni ◽  
Zhaofeng Wu ◽  
Weiwei Duan

Author(s):  
Jon R. Wright ◽  
Gregg T. Vesonder ◽  
Tamraparni Dasu

In an enterprise setting, a major challenge for any data mining operation is managing data streams or feeds, both data and metadata, to ensure a stable and certifiably accurate flow of data. Data feeds in this environment can be complex, numerous and opaque. The management of frequently changing data and metadata presents a considerable challenge. In this paper, we articulate the technical issues involved in the task of managing enterprise data and propose a multi-disciplinary solution, derived from fields such as knowledge engineering and statistics, to understand, standardize, and automate information acquisition and quality management in preparation for enterprise mining.


Author(s):  
Bing Xu

E-commerce transactions generate large amounts of data, and classifying that data effectively is an active research topic. An improved feature selection method was proposed based on the characteristics of the Bayesian classification algorithm. Because training and testing a classifier on modern large-scale data takes a long time on a single computer, a data classification algorithm based on Naive Bayes was designed and implemented on the Hadoop distributed platform. The experimental results showed that the improved algorithm raised classification accuracy and that the parallel Bayesian classification algorithm ran efficiently, making it suitable for processing and analyzing massive data.
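As a rough illustration of why Naive Bayes parallelizes well on a MapReduce-style platform, the sketch below trains a multinomial Naive Bayes classifier by counting per partition (map) and merging the counts (reduce). It is not the paper's implementation: it runs in a single Python process rather than on Hadoop, it omits the improved feature selection step (the abstract does not describe it), and the function names, the add-one smoothing, and the toy records are all assumptions made for illustration.

```python
import math
from collections import Counter
from functools import reduce

def map_partition(records):
    """Map step: count classes and (class, token) pairs in one data partition."""
    class_counts, token_counts = Counter(), Counter()
    for label, tokens in records:
        class_counts[label] += 1
        for tok in tokens:
            token_counts[(label, tok)] += 1
    return class_counts, token_counts

def reduce_counts(a, b):
    """Reduce step: merge the partial counts of two partitions."""
    return a[0] + b[0], a[1] + b[1]

def train(partitions):
    """Combine per-partition counts into one model (hypothetical helper)."""
    class_counts, token_counts = reduce(reduce_counts, map(map_partition, partitions))
    vocab = {tok for (_, tok) in token_counts}
    totals = Counter()  # total token occurrences per class
    for (label, _), n in token_counts.items():
        totals[label] += n
    return class_counts, token_counts, totals, vocab

def predict(model, tokens):
    """Return the class with the highest log-posterior for the given tokens."""
    class_counts, token_counts, totals, vocab = model
    n_records = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / n_records)  # log prior
        for tok in tokens:
            # Add-one (Laplace) smoothing; an assumption, not taken from the paper.
            score += math.log((token_counts[(label, tok)] + 1)
                              / (totals[label] + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data split into two partitions, standing in for blocks of a distributed file.
partitions = [
    [("electronics", ["phone", "charger"]), ("clothing", ["shirt", "cotton"])],
    [("electronics", ["phone", "case"]), ("clothing", ["shirt", "denim"])],
]
model = train(partitions)
print(predict(model, ["phone"]))
print(predict(model, ["shirt", "denim"]))
```

Because the model depends only on sums of counts, each partition can be processed independently and the partial results merged in any order, which is the property a distributed platform exploits.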

