Performance Evaluation of Map Reduce vs. Spark framework on Amazon Machine Image for TeraSort Algorithm
2021 ◽ Vol 9 (VI) ◽ pp. 2728-2732
TeraSort is one of Hadoop’s most widely used benchmarks. The Hadoop distribution contains both the input generator and the sorting implementation: TeraGen generates the input and TeraSort performs the sort. We compare the TeraSort algorithm across distributed platforms under different resource configurations. We measure efficiency by Compute Time, Data Read, Data Write, and Speedup. We conducted experiments using Hadoop MapReduce and Spark (Java). We empirically evaluate the performance of the TeraSort algorithm on Amazon EC2 Machine Images and demonstrate that it achieves a 3.95× to 2.4× speedup, compared with TeraSort, for typical settings of interest.
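The abstract does not define Speedup. A common convention, assumed here, is the ratio of the baseline compute time to the compute time of the compared framework. The sketch below illustrates that ratio; the function name and the timings are hypothetical and are not measurements from the paper.

```python
def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate run is than the baseline.

    Assumed definition: baseline compute time / candidate compute time.
    """
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / candidate_seconds


# Hypothetical wall-clock times in seconds, NOT results from the paper.
baseline_run_seconds = 600.0
candidate_run_seconds = 200.0

print(f"speedup: {speedup(baseline_run_seconds, candidate_run_seconds):.2f}x")
```

A value above 1 means the candidate run finished faster than the baseline under this definition.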
2019 ◽ Vol 887 ◽ pp. 641-649
2017 ◽ Vol 13 (08) ◽ pp. 121