A distributed computing framework based on lightweight variance reduction method to accelerate machine learning training on blockchain

2020 ◽  
Vol 17 (9) ◽  
pp. 77-89
Zhen Huang ◽  
Feng Liu ◽  
Mingxing Tang ◽  
Jinyan Qiu ◽  
Yuxing Peng
2020 ◽  
Vol 32 (1) ◽  
pp. 188-202 ◽  
Fanhua Shang ◽  
Kaiwen Zhou ◽  
Hongying Liu ◽  
James Cheng ◽  
Ivor W. Tsang ◽  

2019 ◽  
Vol 20 (4) ◽  
Ahmed Hussein Ali ◽  
Mahmood Zaki Abdullah

The big data concept has elicited studies on how to accurately and efficiently extract valuable information from such huge datasets. A major problem in big data mining is high data dimensionality, i.e., the large number of dimensions in such datasets. High dimensionality degrades the accuracy of machine learning (ML) classifiers and wastes computation time because of the many redundant features in the dataset. This problem can be addressed with a fast feature reduction method. Hence, this study presents HP-PL, a new hybrid parallel feature reduction framework that uses Spark to perform feature reduction on shared/distributed-memory clusters. Evaluation of the proposed HP-PL on the KDD99 dataset showed the algorithm to be significantly faster than conventional feature reduction techniques: it required 1 minute to select 4 features from over 79 features and 3,000,000 samples on a 3-node cluster (21 cores in total), whereas the comparative algorithm needed more than 2 hours to achieve the same result. In the proposed system, the Hadoop Distributed File System (HDFS) provides distributed storage while Apache Spark serves as the computing engine. The model was developed as a parallel model with full consideration of the high performance and throughput of distributed computing. In conclusion, the proposed HP-PL method achieves good accuracy with less memory and time than conventional feature reduction methods. The tool is publicly available at https://github.com/ahmed/Fast-HP-PL.
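The abstract does not spell out HP-PL's selection criterion, but the core idea of scoring each feature independently and keeping only the most informative ones can be illustrated with a minimal single-machine sketch. The variance-threshold criterion and all function names below are illustrative assumptions, not the paper's actual algorithm or Spark pipeline; the per-feature statistics are exactly the kind of computation a Spark job would distribute across a cluster.

```python
# Hypothetical sketch: rank features by variance and keep the top k.
# Constant (zero-variance) columns carry no information for a classifier
# and are the most obviously redundant features to drop.

def feature_variances(rows):
    """Compute the population variance of each column (feature)."""
    n = len(rows)
    n_feats = len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(n_feats)]
    return [sum((r[i] - means[i]) ** 2 for r in rows) / n
            for i in range(n_feats)]

def select_features(rows, k):
    """Return the indices of the k highest-variance features, in order."""
    variances = feature_variances(rows)
    ranked = sorted(range(len(variances)), key=lambda i: -variances[i])
    return sorted(ranked[:k])

rows = [
    [1.0, 5.0, 0.0, 2.0],
    [2.0, 5.0, 0.0, 8.0],
    [3.0, 5.0, 0.0, 5.0],
]
print(select_features(rows, 2))  # columns 1 and 2 are constant and get dropped
```

In a Spark version, `feature_variances` would become a map-reduce over partitions of the sample rows, which is what makes the approach scale to millions of samples.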

2019 ◽  
Vol 214 ◽  
pp. 00001
Alessandra Forti ◽  
Latchezar Betev ◽  
Maarten Litmaath ◽  
Oxana Smirnova ◽  
Petya Vasileva ◽  

The 23rd International Conference on Computing in High Energy and Nuclear Physics (CHEP) took place in the National Palace of Culture, Sofia, Bulgaria, from 9 to 13 July 2018. A total of 575 participants joined the plenary and the eight parallel sessions, dedicated to: online computing; offline computing; distributed computing; data handling; software development; machine learning and physics analysis; clouds, virtualisation and containers; and networks and facilities. The conference hosted 35 plenary presentations, 323 parallel presentations and 188 posters.

2016 ◽  
Vol 129 (972) ◽  
pp. 024001 ◽  
Shoulin Wei ◽  
Feng Wang ◽  
Hui Deng ◽  
Cuiyin Liu ◽  
Wei Dai ◽  

2019 ◽  
Vol 35 (2) ◽  
pp. 157-170
Mohd Khairul Bazli Mohd Aziz ◽  
Fadhilah Yusof ◽  
Zalina Mohd Daud ◽  
Zulkifli Yusop ◽  
Mohammad Afif Kasno

The well-known geostatistical variance-reduction method is commonly used to determine the optimal rain gauge network. The main problem in this method is determining the best semivariogram model to use in estimating the variance. An optimal choice of semivariogram model is an important point for a good data evaluation process. Three different semivariogram models, Spherical, Gaussian and Exponential, are used and their performances compared in this study. A cross-validation technique is applied to compute the errors of the semivariograms. Rainfall data for the period 1975–2008 from the 84 existing rain gauge stations covering the state of Johor are used. The results show that the Exponential model is the best semivariogram model and is therefore chosen to determine the optimal number and locations of rain gauge stations.
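The three candidate models compared in the study have standard closed forms. The sketch below gives those textbook formulas; the nugget `c0`, partial sill `c`, and range `a` values used here are illustrative placeholders, not the parameters fitted to the Johor rainfall data.

```python
import math

# Standard theoretical semivariogram models gamma(h) for lag distance h,
# with nugget c0, partial sill c, and range a. All three approach the
# sill c0 + c as h grows large.

def spherical(h, c0, c, a):
    if h >= a:
        return c0 + c
    r = h / a
    return c0 + c * (1.5 * r - 0.5 * r ** 3)

def exponential(h, c0, c, a):
    return c0 + c * (1.0 - math.exp(-h / a))

def gaussian(h, c0, c, a):
    return c0 + c * (1.0 - math.exp(-(h / a) ** 2))

# Illustrative parameters: nugget 0.1, partial sill 0.9, range 10.
for model in (spherical, exponential, gaussian):
    print(model.__name__, round(model(100.0, 0.1, 0.9, 10.0), 4))
```

Cross-validation, as used in the study, would fit each model's parameters to the empirical semivariogram and compare prediction errors at held-out stations to pick the best model.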

2020 ◽  
Vol 7 (4) ◽  
pp. 3640-3649
Mingjun Dai ◽  
Ziying Zheng ◽  
Shengli Zhang ◽  
Hui Wang ◽  
Xiaohui Lin

2020 ◽  
Vol 8 (3) ◽  
pp. 1139-1188
Aaron R. Dinner ◽  
Erik H. Thiede ◽  
Brian Van Koten ◽  
Jonathan Weare
