Research and application on algorithms of data mining for EMU malfunction’s data under a cloud computing environment

Author(s):  
C. Zhang ◽  
H. Hu
2015 ◽  
Vol 713-715 ◽  
pp. 2447-2450
Author(s):  
Zhan Kun Zhao

This paper studies the design of an efficient data mining model for large databases in a cloud computing environment. To address the problem of mining large databases efficiently, it proposes a model based on an improved manifold learning algorithm. Nonlinear manifold learning is used to reduce the dimensionality of the data feature vectors, and a feature extraction module preprocesses the data. The improved classical manifold learning algorithm increases the distance between data points in densely sampled regions and shortens the distance between data points in sparsely sampled regions, which makes the overall distribution of the sample database more even and allows the data to be mined accurately. The experimental results show that the proposed method can accurately mine target data in cloud computing environments, with high efficiency and precision.
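The abstract does not say which manifold learning algorithm is improved, so the following is only an illustrative sketch of one building block common to classical methods such as Isomap: estimating distances along the manifold from a k-nearest-neighbor graph. The function name `geodesic_distances`, the half-circle sample data, and the choice `k=2` are assumptions made for this example, not details from the paper.

```python
import math


def geodesic_distances(points, k):
    """Approximate along-the-manifold distances with a k-nearest-neighbor graph."""
    n = len(points)
    # Start with no edges: zero on the diagonal, infinity everywhere else.
    graph = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        for j in nearest:
            d = math.dist(points[i], points[j])
            graph[i][j] = graph[j][i] = d  # keep the graph symmetric
    # Floyd-Warshall: shortest path length between every pair of points.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph


# Hypothetical data: 11 points on a half circle of radius 1 (a 1-D manifold in 2-D).
points = [(math.cos(math.pi * t / 10), math.sin(math.pi * t / 10)) for t in range(11)]
geo = geodesic_distances(points, k=2)
straight = math.dist(points[0], points[-1])  # Euclidean distance between the ends
along_curve = geo[0][-1]                     # graph estimate of the arc length
```

Here the two end points are close in a straight line but far apart along the curve, which is the kind of structure a nonlinear method can preserve and a linear projection cannot. A full method would follow this step with an embedding stage, which is omitted here.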


2015 ◽  
Vol 719-720 ◽  
pp. 924-928 ◽  
Author(s):  
Xiao Chun Sheng ◽  
Xiao Feng Xue ◽  
Yan Ping Cheng

Cloud computing distributes computing tasks across the resources of a large number of computers in a subnet to provide users with cheap and efficient computing power, storage capacity, and service capabilities. Data mining finds useful information in large data repositories. Quickly and accurately finding frequent items in large volumes of data provides an important basis for forecasting and decision making. A parallelized frequent-item data mining strategy in the cloud computing environment, which offers an efficient way to store and analyze vast amounts of data, therefore has important theoretical significance and application value.
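The abstract gives no algorithmic detail, so the following is a minimal sketch of one common way to parallelize frequent-item counting, assuming a map step that counts itemsets per data partition and a reduce step that merges the partial counts. The names `map_partition`, `reduce_counts`, and `frequent_itemsets`, the sample transactions, and the limit of two items per itemset are all hypothetical.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_partition(transactions, max_size=2):
    """Map step: count every itemset of up to max_size items in one partition."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, duplicates removed
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial count tables."""
    return left + right


def frequent_itemsets(partitions, min_support, max_size=2):
    # Each map_partition call is independent, so in a real cluster these
    # calls could run on different workers; here they run sequentially.
    partial = [map_partition(p, max_size) for p in partitions]
    total = reduce(reduce_counts, partial, Counter())
    return {itemset: n for itemset, n in total.items() if n >= min_support}


# Hypothetical data: four transactions split across two partitions.
partitions = [
    [["milk", "bread"], ["milk", "eggs"]],
    [["milk", "bread", "eggs"], ["bread"]],
]
result = frequent_itemsets(partitions, min_support=2)
```

Because the partial counts are simply added, the result does not depend on how the transactions are split across partitions. This brute-force enumeration grows quickly with `max_size`; production systems typically prune candidates first.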


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Jin Gao ◽  
Jiaquan Liu ◽  
Sihua Guo ◽  
Qi Zhang ◽  
Xinyang Wang

To address the slow training speed, poor prediction performance, and unstable detection results of traditional anomaly detection algorithms, a data mining method for anomaly detection based on a deep variational dimensionality reduction model and MapReduce (DMAD-DVDMR) in a cloud computing environment is proposed. First, the data are preprocessed by a dimensionality reduction model based on deep variational learning, which reduces the dimensionality of the data, and with it the computational load, while preserving as much of the data's information as possible. Second, the data set stored on the Hadoop Distributed File System (HDFS) is logically divided into several data blocks, and the blocks are processed in parallel with MapReduce, so that the k-distance and LOF value of each data point are calculated only within its own block. Third, based on stochastic gradient descent, the concept of k-neighboring distance is redefined, which avoids the situation where the data set contains k or more repeated points and the local density becomes infinite. Finally, compared with the CNN, DeepAnt, and SVM-IDS algorithms, the accuracy of the scheme is higher by 10.3%, 18.0%, and 17.2%, respectively. Experiments on the data set verify the effectiveness and scalability of the proposed DMAD-DVDMR algorithm.
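The block-wise step can be illustrated with a minimal sketch that computes the standard local outlier factor (LOF) independently inside each block. This is not the DMAD-DVDMR implementation: it uses the textbook k-distance rather than the redefined one, omits the dimensionality reduction stage, and runs the blocks sequentially instead of through MapReduce. The names `lof_scores` and `lof_by_block` and the sample blocks are assumptions for this example.

```python
import math


def lof_scores(points, k):
    """Textbook LOF by brute force, using exactly k neighbors per point.

    Assumes the block has more than k points and fewer than k duplicates of
    any point; otherwise the local density below would be infinite.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        neighbors.append(order)
        k_distance.append(dist[i][order[-1]])  # distance to the k-th neighbor
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbors relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


def lof_by_block(blocks, k):
    """Score each block on its own, as an independent map task would."""
    return [lof_scores(block, k) for block in blocks]


# Hypothetical data: two blocks, the first containing one obvious outlier.
blocks = [
    [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)],
    [(10, 10), (10, 11), (11, 10), (11, 11)],
]
scores = lof_by_block(blocks, k=2)
```

Scores near 1 indicate a point about as dense as its neighbors, while larger values flag likely outliers. Since each block sees only its own points, neighbors that fall in another block are ignored, which is the trade-off of computing LOF per block.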

