Comparing Apache Spark and Map Reduce with Performance Analysis using K-Means

2015, Vol 113 (1), pp. 8-11
Author(s): Satish Gopalani, Rohan Arora
Author(s): Armin Moharrer, Stratis Ioannidis

We identify structural properties under which a convex optimization problem over the simplex can be massively parallelized via map-reduce operations using the Frank-Wolfe (FW) algorithm. A broad class of problems, e.g., Convex Approximation, Experimental Design, and AdaBoost, can be tackled this way. We implement FW over Apache Spark and solve problems with 20 million variables using 350 cores in 79 minutes; the same operation takes 165 hours when executed serially.
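To illustrate how a Frank-Wolfe step over the simplex decomposes into map and reduce operations, the following is a minimal sketch in plain Python, not the authors' Spark implementation. It assumes a hypothetical separable objective f(x) = 0.5 * ||x - b||^2, so that each partition can compute its gradient coordinates locally; the names `partition`, `local_min_gradient`, and `frank_wolfe_simplex` are illustrative, and the built-in `map` and `functools.reduce` stand in for the distributed operations.

```python
from functools import reduce


def partition(indices, n_parts):
    """Split coordinate indices into at most n_parts non-empty chunks."""
    chunks = [indices[p::n_parts] for p in range(n_parts)]
    return [c for c in chunks if c]


def local_min_gradient(x, b, chunk):
    """Map step: return (gradient value, index) of the smallest gradient
    coordinate in this chunk. For the assumed objective
    f(x) = 0.5 * ||x - b||^2, the gradient is x - b."""
    return min((x[i] - b[i], i) for i in chunk)


def objective(x, b):
    """Assumed example objective, used only to check progress."""
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))


def frank_wolfe_simplex(b, n_parts=4, n_iters=200):
    """Frank-Wolfe over the probability simplex with the standard
    step size 2 / (k + 2)."""
    n = len(b)
    x = [1.0 / n] * n  # start at the center of the simplex
    chunks = partition(list(range(n)), n_parts)
    for k in range(n_iters):
        # Map: each chunk reports its best (most negative) gradient coordinate.
        candidates = map(lambda c: local_min_gradient(x, b, c), chunks)
        # Reduce: the global minimum picks the simplex vertex e_i to move toward.
        _, i_star = reduce(min, candidates)
        gamma = 2.0 / (k + 2.0)
        # Convex combination x <- (1 - gamma) * x + gamma * e_i keeps x feasible.
        x = [(1.0 - gamma) * xi for xi in x]
        x[i_star] += gamma
    return x


if __name__ == "__main__":
    b = [0.5, 0.3, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0]
    x = frank_wolfe_simplex(b)
    print(objective(x, b))
```

The linear subproblem over the simplex reduces to finding the coordinate with the smallest gradient entry, which is why a single min-reduce over per-partition results is enough in this sketch; objectives whose gradients are not coordinate-separable would need additional shared state per iteration.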

