Implementation of Network Cards Optimizations in Hadoop Cluster Data Transmissions

2017 ◽  
Vol 4 (12) ◽  
pp. 153506 ◽  
Author(s):  
Okta Nurika ◽  
Mohd Fadzil Hassan ◽  
Nordin Zakaria
2018 ◽  
Vol 6 (1) ◽  
pp. 41-48
Author(s):  
Santoso Setiawan

Abstract   Inaccurate stock management leads to high and uneconomical storage costs, as there may be a shortage or surplus of certain products. This is clearly very dangerous for any business. The K-Means method is one technique that can be used to help design an effective inventory strategy by making use of the sales transaction data already available in the company. The K-Means algorithm groups the products sold into several clusters of large transactional data, which is expected to help entrepreneurs design stock inventory strategies.   Keywords: inventory, k-means, product transaction data, rapidminer, data mining
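A minimal sketch of the clustering step this abstract describes, using a plain-NumPy implementation of K-Means (Lloyd's algorithm) rather than RapidMiner; the two product features (units sold, transaction count) and all values are invented for illustration, not taken from the paper's data:

```python
import numpy as np

def kmeans(X, k, iters=100):
    # deterministic farthest-point initialization, then Lloyd's algorithm
    centers = [X[0]]
    for _ in range(k - 1):
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        # assign every product to its nearest cluster centre
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute each centre as the mean of its assigned products
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# synthetic per-product features: [units sold, number of transactions]
rng = np.random.default_rng(1)
fast_movers = rng.normal([500.0, 80.0], [40.0, 8.0], size=(20, 2))
slow_movers = rng.normal([50.0, 10.0], [10.0, 3.0], size=(20, 2))
products = np.vstack([fast_movers, slow_movers])

labels, centers = kmeans(products, k=2)
```

With two well-separated groups like these, the cluster labels recover the fast-moving versus slow-moving split, which is the kind of grouping an inventory strategy would be built on.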


2020 ◽  
Vol 13 (4) ◽  
pp. 790-797
Author(s):  
Gurjit Singh Bhathal ◽  
Amardeep Singh Dhiman

Background: In the current internet scenario, large amounts of data are generated and processed. The Hadoop framework is widely used to store and process big data in a highly distributed manner, yet it is argued that it is not mature enough to deal with current cyberattacks on the data. Objective: The main objective of the proposed work is to provide a complete security approach comprising authorisation and authentication for the users and the Hadoop cluster nodes, and to secure the data at rest as well as in transit. Methods: The proposed algorithm uses the Kerberos network authentication protocol to authenticate and authorise the users and the cluster nodes. Ciphertext-Policy Attribute-Based Encryption (CP-ABE) is used for data at rest and data in transit. Users encrypt their files under their own set of attributes and store them on the Hadoop Distributed File System; only users with matching attributes can decrypt those files. Results: The proposed algorithm was implemented with data sets of different sizes, processed with and without encryption. The results show little difference in processing time: performance was affected in the range of 0.8% to 3.1%, a figure that also includes the impact of other factors such as the system configuration, the number of parallel jobs running, and the virtual environment. Conclusion: The solutions available for handling the big data security problems faced in the Hadoop framework are inefficient or incomplete. A complete security framework is proposed for the Hadoop environment, and the solution is experimentally shown to have little effect on system performance for datasets of different sizes.
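CP-ABE itself relies on pairing-based cryptography and is not sketched here; the toy below only illustrates the access pattern the abstract describes, namely a file bound to an attribute policy that a user's attribute set must satisfy before decryption succeeds. The XOR "cipher", the key derivation, and every name are hypothetical stand-ins for illustration, not real cryptography and not the authors' implementation:

```python
import hashlib

def _keystream(key: bytes, n: int) -> bytes:
    # SHA-256 in counter mode as a toy keystream (NOT secure crypto)
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def encrypt(data: bytes, policy: frozenset, master: bytes) -> dict:
    # bind the key to the policy: a stand-in for CP-ABE embedding the
    # access policy in the ciphertext itself
    key = hashlib.sha256(master + repr(sorted(policy)).encode()).digest()
    ct = bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))
    return {"policy": policy, "ct": ct}

def decrypt(blob: dict, attributes: set, master: bytes) -> bytes:
    # the user's attribute set must satisfy the ciphertext's policy
    if not blob["policy"] <= attributes:
        raise PermissionError("attributes do not satisfy policy")
    key = hashlib.sha256(master + repr(sorted(blob["policy"])).encode()).digest()
    return bytes(a ^ b for a, b in
                 zip(blob["ct"], _keystream(key, len(blob["ct"]))))

master = b"cluster-master-secret"   # hypothetical shared secret
blob = encrypt(b"hdfs block payload",
               frozenset({"dept:hr", "role:analyst"}), master)
plain = decrypt(blob, {"dept:hr", "role:analyst", "site:eu"}, master)
```

A user holding only a subset of the required attributes (say, `{"role:analyst"}`) is refused, which mirrors the abstract's claim that only users with matching attributes can read a stored file.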


2017 ◽  
Vol 108 ◽  
pp. 1622-1631
Author(s):  
Marco Strutz ◽  
Hermann Heßling ◽  
Achim Streit

2021 ◽  
Vol 15 ◽  
pp. 174830262110249
Author(s):  
Cong-Zhe You ◽  
Zhen-Qiu Shu ◽  
Hong-Hui Fan

Recently, subspace clustering of multi-view data has become a research hotspot in artificial intelligence and machine learning. The goal is to divide data samples from different sources into their respective groups. In this paper we propose a new subspace clustering method for multi-view data, termed Non-negative Sparse Laplacian-regularized Latent Multi-view Subspace Clustering (NSL2MSC). The proposed method learns a latent-space representation of the multi-view data samples and performs the data reconstruction in that latent space, so the algorithm can cluster the data in the latent representation space while exploiting the relationships among the different views. However, traditional representation-based methods do not consider the non-linear geometry inside the data, and may lose local similarity information between the data points during learning. By using graph regularization, we capture not only the global low-dimensional structural features of the data but also its non-linear geometric structure. The experimental results show that the proposed method is effective and that its performance is better than most of the existing alternatives.
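A minimal sketch of the graph-regularization ingredient, assuming a Gaussian-weighted k-nearest-neighbour graph: the Laplacian L = D - W penalizes latent representations that differ for nearby samples via the quadratic form tr(Z L Z^T). The full NSL2MSC objective (latent multi-view reconstruction, sparsity, non-negativity constraints) is not reproduced here; the data and parameters are invented:

```python
import numpy as np

def knn_laplacian(X, k=3, sigma=1.0):
    # pairwise squared Euclidean distances between samples
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]        # skip the point itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                       # symmetrize the graph
    D = np.diag(W.sum(axis=1))                   # degree matrix
    return D - W                                 # graph Laplacian L

X = np.random.default_rng(0).normal(size=(10, 4))
L = knn_laplacian(X)
```

Because L is symmetric positive semidefinite with zero row sums, adding tr(Z L Z^T) to a clustering objective smooths the learned representation Z along graph edges, which is how the local geometric structure mentioned in the abstract is preserved.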


2018 ◽  
Vol 615 ◽  
pp. A12 ◽  
Author(s):  
Steffi X. Yen ◽  
Sabine Reffert ◽  
Elena Schilbach ◽  
Siegfried Röser ◽  
Nina V. Kharchenko ◽  
...  

Context. Open clusters have long been used to gain insights into the structure, composition, and evolution of the Galaxy. With the large amount of stellar data available for many clusters in the Gaia era, new techniques must be developed for analyzing open clusters, as visual inspection of cluster color-magnitude diagrams is no longer feasible. An automatic tool will be required to analyze large samples of open clusters. Aims. We seek to develop an automatic isochrone-fitting procedure to consistently determine cluster membership and the fundamental cluster parameters. Methods. Our cluster characterization pipeline first determined cluster membership with precise astrometry, primarily from TGAS and HSOY. With initial cluster members established, isochrones were fitted, using a χ2 minimization, to the cluster photometry in order to determine cluster mean distances, ages, and reddening. Cluster membership was also refined based on the stellar photometry. We used multiband photometry, which includes ASCC-2.5 BV, 2MASS JHKs, and Gaia G band. Results. We present parameter estimates for all 24 clusters closer than 333 pc as determined by the Catalogue of Open Cluster Data and the Milky Way Star Clusters catalog. We find that our parameters are consistent with those in the Milky Way Star Clusters catalog. Conclusions. We demonstrate that it is feasible to develop an automated pipeline that determines cluster parameters and membership reliably. After additional modifications, our pipeline will be able to use Gaia DR2 as input, leading to better cluster memberships and more accurate cluster parameters for a much larger number of clusters.
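The χ2-minimization step can be sketched as a grid search over model isochrones: for each candidate model, compute χ2 = Σ((obs - model)² / err²) against the cluster photometry and keep the minimum. The "isochrone" below is a toy magnitude-colour relation with an invented age parameter, not a real stellar model and not the authors' pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
color = np.linspace(0.0, 1.5, 40)          # e.g. a B - V colour axis

def toy_isochrone(color, age_param):
    # hypothetical magnitude-colour relation parameterized by "age"
    return 5.0 - 4.0 * color + age_param * color ** 2

# synthetic "observed" cluster photometry with Gaussian errors
true_age = 1.2
err = 0.05
obs_mag = toy_isochrone(color, true_age) + rng.normal(0.0, err, size=color.size)

# chi-square over a grid of candidate age parameters
grid = np.linspace(0.0, 3.0, 301)
chi2 = np.array([(((obs_mag - toy_isochrone(color, a)) / err) ** 2).sum()
                 for a in grid])
best = grid[chi2.argmin()]
```

The same pattern extends to a multi-dimensional grid over distance modulus, age, and reddening, with the multiband photometry contributing one χ2 term per band.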


Author(s):  
Abdulrahman Alhamali ◽  
Nibal Salha ◽  
Raghid Morcel ◽  
Mazen Ezzeddine ◽  
Omar Hamdan ◽  
...  
