Implementation of Network Cards Optimizations in Hadoop Cluster Data Transmissions

2017 ◽  
Vol 4 (12) ◽  
pp. 153506 ◽  
Author(s):  
Okta Nurika ◽  
Mohd Fadzil Hassan ◽  
Nordin Zakaria
2018 ◽  
Vol 6 (1) ◽  
pp. 41-48
Author(s):  
Santoso Setiawan

Abstract   Inaccurate stock management leads to high and uneconomical storage costs, as there may be a shortage or surplus of certain products. This is clearly very dangerous for any business. The K-Means method is one technique that can be used to help design an effective inventory strategy by making use of the sales transaction data already available in the company. The K-Means algorithm groups the products sold into several clusters of large transactional data, which is expected to help entrepreneurs design stock inventory strategies.   Keywords: inventory, k-means, product transaction data, rapidminer, data mining
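A minimal sketch of the clustering step this abstract describes, using a plain-NumPy implementation of K-Means (Lloyd's algorithm) rather than RapidMiner; the two product features (units sold, transaction count) and all values are invented for illustration, not taken from the paper's data:

```python
import numpy as np

def kmeans(X, k, iters=100):
    # deterministic farthest-point initialization, then Lloyd's algorithm
    centers = [X[0]]
    for _ in range(k - 1):
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        # assign every product to its nearest cluster centre
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute each centre as the mean of its assigned products
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# synthetic per-product features: [units sold, number of transactions]
rng = np.random.default_rng(1)
fast_movers = rng.normal([500.0, 80.0], [40.0, 8.0], size=(20, 2))
slow_movers = rng.normal([50.0, 10.0], [10.0, 3.0], size=(20, 2))
products = np.vstack([fast_movers, slow_movers])

labels, centers = kmeans(products, k=2)
```

With two well-separated groups like these, the cluster labels recover the fast-moving versus slow-moving split, which is the kind of grouping an inventory strategy would be built on.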


2020 ◽  
Vol 13 (4) ◽  
pp. 790-797
Author(s):  
Gurjit Singh Bhathal ◽  
Amardeep Singh Dhiman

Background: In the current internet scenario, large amounts of data are generated and processed. The Hadoop framework is widely used to store and process big data in a highly distributed manner, yet it is argued that it is not mature enough to deal with current cyberattacks on the data. Objective: The main objective of the proposed work is to provide a complete security approach comprising authorisation and authentication for the users and the Hadoop cluster nodes, and to secure the data at rest as well as in transit. Methods: The proposed algorithm uses the Kerberos network authentication protocol to authenticate and authorise the users and the cluster nodes. Ciphertext-Policy Attribute-Based Encryption (CP-ABE) is used for data at rest and data in transit. Users encrypt their files under their own set of attributes and store them on the Hadoop Distributed File System; only users with matching attributes can decrypt those files. Results: The proposed algorithm was implemented with data sets of different sizes, processed with and without encryption. The results show little difference in processing time: performance was affected in the range of 0.8% to 3.1%, a figure that also includes the impact of other factors such as the system configuration, the number of parallel jobs running, and the virtual environment. Conclusion: The solutions available for handling the big data security problems faced in the Hadoop framework are inefficient or incomplete. A complete security framework is proposed for the Hadoop environment, and the solution is experimentally shown to have little effect on system performance for datasets of different sizes.
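CP-ABE itself relies on pairing-based cryptography and is not sketched here; the toy below only illustrates the access pattern the abstract describes, namely a file bound to an attribute policy that a user's attribute set must satisfy before decryption succeeds. The XOR "cipher", the key derivation, and every name are hypothetical stand-ins for illustration, not real cryptography and not the authors' implementation:

```python
import hashlib

def _keystream(key: bytes, n: int) -> bytes:
    # SHA-256 in counter mode as a toy keystream (NOT secure crypto)
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def encrypt(data: bytes, policy: frozenset, master: bytes) -> dict:
    # bind the key to the policy: a stand-in for CP-ABE embedding the
    # access policy in the ciphertext itself
    key = hashlib.sha256(master + repr(sorted(policy)).encode()).digest()
    ct = bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))
    return {"policy": policy, "ct": ct}

def decrypt(blob: dict, attributes: set, master: bytes) -> bytes:
    # the user's attribute set must satisfy the ciphertext's policy
    if not blob["policy"] <= attributes:
        raise PermissionError("attributes do not satisfy policy")
    key = hashlib.sha256(master + repr(sorted(blob["policy"])).encode()).digest()
    return bytes(a ^ b for a, b in
                 zip(blob["ct"], _keystream(key, len(blob["ct"]))))

master = b"cluster-master-secret"   # hypothetical shared secret
blob = encrypt(b"hdfs block payload",
               frozenset({"dept:hr", "role:analyst"}), master)
plain = decrypt(blob, {"dept:hr", "role:analyst", "site:eu"}, master)
```

A user holding only a subset of the required attributes (say, `{"role:analyst"}`) is refused, which mirrors the abstract's claim that only users with matching attributes can read a stored file.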


2017 ◽  
Vol 108 ◽  
pp. 1622-1631
Author(s):  
Marco Strutz ◽  
Hermann Heßling ◽  
Achim Streit

2021 ◽  
Vol 15 ◽  
pp. 174830262110249
Author(s):  
Cong-Zhe You ◽  
Zhen-Qiu Shu ◽  
Hong-Hui Fan

Recently, subspace clustering of multi-view data has become a research hotspot in artificial intelligence and machine learning. The goal is to divide data samples from different sources into their respective groups. In this paper we propose a new subspace clustering method for multi-view data, termed Non-negative Sparse Laplacian-regularized Latent Multi-view Subspace Clustering (NSL2MSC). The proposed method learns a latent-space representation of the multi-view data samples and performs the data reconstruction in that latent space, so the algorithm can cluster the data in the latent representation space while exploiting the relationships among the different views. However, traditional representation-based methods do not consider the non-linear geometry inside the data, and may lose local similarity information between the data points during learning. By using graph regularization, we capture not only the global low-dimensional structural features of the data but also its non-linear geometric structure. The experimental results show that the proposed method is effective and that its performance is better than most of the existing alternatives.
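A minimal sketch of the graph-regularization ingredient, assuming a Gaussian-weighted k-nearest-neighbour graph: the Laplacian L = D - W penalizes latent representations that differ for nearby samples via the quadratic form tr(Z L Z^T). The full NSL2MSC objective (latent multi-view reconstruction, sparsity, non-negativity constraints) is not reproduced here; the data and parameters are invented:

```python
import numpy as np

def knn_laplacian(X, k=3, sigma=1.0):
    # pairwise squared Euclidean distances between samples
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]        # skip the point itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                       # symmetrize the graph
    D = np.diag(W.sum(axis=1))                   # degree matrix
    return D - W                                 # graph Laplacian L

X = np.random.default_rng(0).normal(size=(10, 4))
L = knn_laplacian(X)
```

Because L is symmetric positive semidefinite with zero row sums, adding tr(Z L Z^T) to a clustering objective smooths the learned representation Z along graph edges, which is how the local geometric structure mentioned in the abstract is preserved.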


2018 ◽  
Vol 615 ◽  
pp. A12 ◽  
Author(s):  
Steffi X. Yen ◽  
Sabine Reffert ◽  
Elena Schilbach ◽  
Siegfried Röser ◽  
Nina V. Kharchenko ◽  
...  

Context. Open clusters have long been used to gain insights into the structure, composition, and evolution of the Galaxy. With the large amount of stellar data available for many clusters in the Gaia era, new techniques must be developed for analyzing open clusters, as visual inspection of cluster color-magnitude diagrams is no longer feasible. An automatic tool will be required to analyze large samples of open clusters. Aims. We seek to develop an automatic isochrone-fitting procedure to consistently determine cluster membership and the fundamental cluster parameters. Methods. Our cluster characterization pipeline first determined cluster membership with precise astrometry, primarily from TGAS and HSOY. With initial cluster members established, isochrones were fitted, using a χ2 minimization, to the cluster photometry in order to determine cluster mean distances, ages, and reddening. Cluster membership was also refined based on the stellar photometry. We used multiband photometry, which includes ASCC-2.5 BV, 2MASS JHKs, and Gaia G band. Results. We present parameter estimates for all 24 clusters closer than 333 pc as determined by the Catalogue of Open Cluster Data and the Milky Way Star Clusters catalog. We find that our parameters are consistent with those in the Milky Way Star Clusters catalog. Conclusions. We demonstrate that it is feasible to develop an automated pipeline that determines cluster parameters and membership reliably. After additional modifications, our pipeline will be able to use Gaia DR2 as input, leading to better cluster memberships and more accurate cluster parameters for a much larger number of clusters.
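The χ2-minimization step can be sketched as a grid search over model isochrones: for each candidate model, compute χ2 = Σ((obs - model)² / err²) against the cluster photometry and keep the minimum. The "isochrone" below is a toy magnitude-colour relation with an invented age parameter, not a real stellar model and not the authors' pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
color = np.linspace(0.0, 1.5, 40)          # e.g. a B - V colour axis

def toy_isochrone(color, age_param):
    # hypothetical magnitude-colour relation parameterized by "age"
    return 5.0 - 4.0 * color + age_param * color ** 2

# synthetic "observed" cluster photometry with Gaussian errors
true_age = 1.2
err = 0.05
obs_mag = toy_isochrone(color, true_age) + rng.normal(0.0, err, size=color.size)

# chi-square over a grid of candidate age parameters
grid = np.linspace(0.0, 3.0, 301)
chi2 = np.array([(((obs_mag - toy_isochrone(color, a)) / err) ** 2).sum()
                 for a in grid])
best = grid[chi2.argmin()]
```

The same pattern extends to a multi-dimensional grid over distance modulus, age, and reddening, with the multiband photometry contributing one χ2 term per band.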


Author(s):  
Abdulrahman Alhamali ◽  
Nibal Salha ◽  
Raghid Morcel ◽  
Mazen Ezzeddine ◽  
Omar Hamdan ◽  
...  
