small file
Recently Published Documents

TOTAL DOCUMENTS: 63 (FIVE YEARS: 12)
H-INDEX: 7 (FIVE YEARS: 1)

2021 ◽  
Author(s):  
Vijay Shankar Sharma ◽  
N.C Barwar

Nowadays, data is growing exponentially with advances in data science. Every digital footprint generates enormous amounts of data, which are then processed by various tasks to produce information for different end-user applications. A number of technologies are available to handle data at this scale; Hadoop/HDFS is one of them. HDFS handles large files easily, but its performance degrades when it must deal with a massive number of small files. In this paper we propose a novel technique, Hash Based Archive File (HBAF), to solve the small file problem of HDFS. The proposed technique can read the final index files partially, which reduces the memory load on the NameNode, and it supports appending files to the archive after its creation.
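The abstract does not include an implementation, but the partial-index idea can be sketched roughly as follows: hash the file name to pick one of several small index partitions, so a lookup deserializes a single partition instead of the whole index. This is a hypothetical illustration, not the authors' code; the names (IndexEntry, NUM_PARTITIONS, loadPartition) and the partition count are all assumptions.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a hash-partitioned archive index (not the paper's code).
// The file name is hashed to select one of NUM_PARTITIONS small index files, so a
// lookup loads a single partition rather than the entire index into memory.
public class HashPartitionedIndex {
    static final int NUM_PARTITIONS = 64; // assumed partition count

    // Maps a file name to its (offset, length) inside the archive block.
    record IndexEntry(long offset, long length) {}

    // Cache of index partitions already read from HDFS.
    private final Map<Integer, Map<String, IndexEntry>> loaded = new HashMap<>();

    int partitionOf(String fileName) {
        return (fileName.hashCode() & Integer.MAX_VALUE) % NUM_PARTITIONS;
    }

    IndexEntry lookup(String fileName) {
        int p = partitionOf(fileName);
        // Only this partition's index file is deserialized, on demand.
        Map<String, IndexEntry> part = loaded.computeIfAbsent(p, this::loadPartition);
        return part.get(fileName);
    }

    private Map<String, IndexEntry> loadPartition(int p) {
        // Placeholder: a real system would read index file p from HDFS here.
        return new HashMap<>();
    }
}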


2021 ◽  
Vol 10 (1) ◽  
pp. 57
Author(s):  
Daniel Cuevas-González ◽  
Juan Pablo García-Vázquez ◽  
Miguel Bravo-Zanoguera ◽  
Roberto López-Avitia ◽  
Marco A. Reyna ◽  
...  

In this paper, we investigate the feasibility of integrating a portable electrocardiogram (ECG) device with commercial cloud platforms to analyze and visualize information hosted in the cloud. Our ECG system, based on the AD8232 front-end chip, was evaluated by recording a synthetic ECG signal for periods of 1, 2, 12, 24, and 36 h on six different cloud services, to determine whether each service maintains reliable ECG records. Our results show that few cloud services can sustain ECG recordings of 24 h or longer. Some existing services are limited to small file sizes of less than 1,000,000 lines or 100 MB, roughly 45 min of an ECG recording at a sampling rate of 360 Hz, which makes extended monitoring difficult. Cloud platforms still show storage and visualization limitations when it comes to giving health-care specialists access to a patient's information at any time.
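As a quick sanity check on the ~45 min figure, assuming one ECG sample per line (which the 1,000,000-line limit implies):

// Arithmetic check of the file-size limits quoted above, assuming one
// ECG sample per line. Values are from the abstract; the rest is arithmetic.
public class EcgStorageCheck {
    public static void main(String[] args) {
        double fs = 360.0;          // sampling rate in Hz
        long lineLimit = 1_000_000; // per-file line cap of some services

        double minutesAtLimit = lineLimit / fs / 60.0;
        System.out.printf("1,000,000 lines at 360 Hz = %.1f min%n", minutesAtLimit);
        // -> about 46.3 min, consistent with the ~45 min figure above

        long linesPerDay = (long) (fs * 3600 * 24);
        System.out.printf("24 h at 360 Hz = %,d lines%n", linesPerDay);
        // -> 31,104,000 lines, far beyond the 1,000,000-line cap
    }
}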


Author(s):  
Anisha P Rodrigues ◽  
Roshan Fernandes ◽  
P. Vijaya ◽  
Satish Chander

Hadoop Distributed File System (HDFS) is designed to efficiently store and handle vast quantities of files in a distributed environment over a cluster of computers. The Hadoop cluster is formed from commodity hardware, which is inexpensive and easily available. Storing a large number of small files in HDFS consumes extra memory and degrades performance, because small files place a heavy load on the NameNode. The efficiency of indexing and accessing small files on HDFS is therefore improved by several techniques, such as archive files, New Hadoop Archive (New HAR), CombineFileInputFormat (CFIF), and sequence file generation. The archive file combines the small files into single blocks. The New HAR file combines the smaller files into a single large file. The CFIF module merges multiple files into a single split using the NameNode, and the sequence file combines all the small files into a single sequence. Indexing and accessing of small files in HDFS are evaluated using performance metrics such as processing time and memory usage. The experiments show that the sequence file generation approach is efficient compared to the other approaches: file access time is 1.5 s, memory usage is 20 KB in the multi-node setting, and processing time is 0.1 s.
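For concreteness, here is a minimal sketch of the sequence-file approach using the standard Hadoop SequenceFile API: each small file becomes one key-value record, with the file name as key and the raw bytes as value. The paths are made up for illustration and error handling is omitted.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Packs a directory of small files into one SequenceFile:
// key = original file name, value = the file's bytes.
public class SmallFilePacker {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path src = new Path("/user/demo/small-files"); // example input directory
        Path dst = new Path("/user/demo/merged.seq");  // example output file

        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(dst),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {
            for (FileStatus st : fs.listStatus(src)) {
                byte[] buf = new byte[(int) st.getLen()];
                try (FSDataInputStream in = fs.open(st.getPath())) {
                    in.readFully(0, buf); // positional read of the whole file
                }
                writer.append(new Text(st.getPath().getName()),
                              new BytesWritable(buf));
            }
        }
    }
}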


Author(s):  
M. B. Masadeh ◽  
M. S. Azmi ◽  
S. S. S. Ahmad

Hadoop has been an optimal solution for big data processing and storage since its release in late 2006. Hadoop processes data in a master-slave manner [1]: a large job is split into many small tasks that are processed separately, instead of pushing one large file through a costly super machine to extract useful information. Hadoop performs very well on large files, but big data stored as small files can cause performance problems: slow processing, delayed data access, high latency, and even a complete cluster shutdown [2]. In this paper we highlight one of Hadoop's limitations that affects data processing performance, the "big data in small files" problem, which occurs when a massive number of small files is pushed into a Hadoop cluster and can drive the cluster to a total shutdown. We also review native and proposed solutions to the big data in small files problem, how they reduce the negative effects on a Hadoop cluster, and how they improve the storage and access mechanisms.
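The cluster pressure described above comes largely from NameNode metadata. It can be estimated with the widely cited rule of thumb that each file, directory, and block object occupies roughly 150 bytes of NameNode heap; the figure below is a heuristic illustration, not an exact measurement.

// Back-of-the-envelope NameNode heap estimate, using the common ~150 bytes
// per metadata object heuristic (file, directory, or block). Illustrative only.
public class NameNodeHeapEstimate {
    public static void main(String[] args) {
        long files = 10_000_000L;   // hypothetical count of small files
        long bytesPerObject = 150L; // rough heap cost per metadata object
        // Each small file costs at least one file object plus one block object.
        long heapBytes = files * 2 * bytesPerObject;
        System.out.printf("~%.1f GB of NameNode heap for %,d small files%n",
                heapBytes / 1e9, files);
        // -> ~3.0 GB of heap just for metadata, before any data is read
    }
}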


Cloud computing is the technology used most extensively across the globe for data sharing. The shared data may range from a small file to a highly confidential file containing sensitive information. Since cloud services are provided by third-party vendors, users are very concerned about the security and privacy of their data and of the data-access details; they want their traceability to be hidden from the cloud vendors. The biggest challenge is to share data in the most secure way, by encrypting it, while also preserving the anonymity of cloud users from the vendors. This paper addresses the issue by proposing multiple attribute authorities for user key generation, where small subsets of attributes are handled by multiple attribute authorities at random, thereby masking the selection of attributes across authorities and providing a mechanism for efficient data distribution in the cloud while preserving user anonymity.
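The abstract gives no construction details; purely as an illustration of the partitioning idea, the sketch below shuffles a user's attribute set and deals subsets to the available authorities at random, so no single authority sees the full set. All names here are hypothetical, and the real scheme would involve cryptographic key issuance per share.

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of splitting a user's attributes across several
// attribute authorities at random, so no single authority learns the full set.
public class AttributePartitioner {
    public static Map<Integer, List<String>> partition(List<String> attrs,
                                                       int authorities) {
        List<String> shuffled = new ArrayList<>(attrs);
        Collections.shuffle(shuffled); // random assignment hides the mapping
        Map<Integer, List<String>> plan = new HashMap<>();
        for (int i = 0; i < shuffled.size(); i++) {
            plan.computeIfAbsent(i % authorities, k -> new ArrayList<>())
                .add(shuffled.get(i));
        }
        return plan; // each authority later issues key components for its share
    }

    public static void main(String[] args) {
        List<String> attrs = List.of("role:doctor", "dept:cardio", "org:hospitalA");
        System.out.println(partition(attrs, 2));
    }
}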


2020 ◽  
pp. 38-43
Author(s):  
V. V. Pankov ◽  
D. S. Pomerantsev

This article discusses the current state of standardization in the Russian Federation in the field of ultrasonic testing with phased arrays. Attention is drawn to the lack of national standards (especially terminological ones), which clearly hinders the active development and practical adoption of the PA UT method. The basic principles of the classical phased-array technique are analyzed: electronic linear scanning, electronic sector scanning, and combined scanning. Combined scanning has several advantages: a high probability of detecting defects, high scanning speed, quick setup and calibration, quick data analysis, and small file size in comparison with conventional sector scanning. Some difficulties were noted when trying to work with classical DGS curves using the synthetic focusing methods SAFT or FMC/TFM, which focus the beam at each image-reconstruction point.
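As background on the FMC/TFM focusing mentioned above (a textbook description, not this article's procedure): TFM forms each image pixel by summing, over all transmit-receive element pairs, the FMC A-scan sample whose arrival time equals the transmit-path plus receive-path travel time to that pixel. A minimal sketch, assuming a linear array, constant sound speed, and nearest-sample delays:

// Minimal textbook sketch of the Total Focusing Method (TFM) on Full Matrix
// Capture (FMC) data: linear array, constant sound speed, nearest-sample delays.
public class TfmSketch {
    /**
     * @param fmc   fmc[tx][rx][t] = A-scan sample for transmitter tx, receiver rx
     * @param elemX element x-positions in meters
     * @param c     sound speed in m/s
     * @param fs    sampling rate in Hz
     */
    public static double pixel(double[][][] fmc, double[] elemX,
                               double c, double fs, double px, double pz) {
        int n = elemX.length;
        double sum = 0.0;
        for (int tx = 0; tx < n; tx++) {
            double dTx = Math.hypot(px - elemX[tx], pz); // transmit path length
            for (int rx = 0; rx < n; rx++) {
                double dRx = Math.hypot(px - elemX[rx], pz); // receive path length
                int t = (int) Math.round((dTx + dRx) / c * fs); // round-trip delay
                if (t < fmc[tx][rx].length) {
                    sum += fmc[tx][rx][t]; // coherent delay-and-sum
                }
            }
        }
        return Math.abs(sum); // focused pixel amplitude at (px, pz)
    }
}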


2019 ◽  
Vol 32 (20) ◽  
Author(s):  
John Fragalla ◽  
Bill Loewe ◽  
Torben Kling Petersen

2019 ◽  
Vol 56 (5) ◽  
pp. 725-731 ◽  
Author(s):  
Conor J. K. Blanchet ◽  
Eric J. Fish ◽  
Amy G. Miller ◽  
Laura A. Snyder ◽  
Julia D. Labadie ◽  
...  

Digital microscopy (DM) has been employed for primary diagnosis in human medicine and for research and teaching applications in veterinary medicine, but there are few veterinary DM validation studies. Region of interest (ROI) digital cytology is a subset of DM that uses image-stitching software to create a low-magnification image of a slide, capture selected ROIs at higher magnification, and stitch the images into a relatively small file containing the embedded magnifications. This study evaluated the concordance of ROI-DM with traditional light microscopy (LM) between 2 blinded clinical pathologists. Sixty canine and feline cytology samples from a variety of anatomic sites were evaluated, including 31 cases of malignant neoplasia, 15 hyperplastic or benign neoplastic lesions, and 14 infectious/inflammatory lesions. Two separate nonblinded adjudicating clinical pathologists evaluated the reports and diagnoses and scored each paired case as fully concordant, partially concordant, or discordant. The average overall concordance (full plus partial) for both pathologists was 92%. Full concordance was significantly higher for malignant lesions than for benign ones. For the 40 neoplastic lesions, ROI-DM and LM agreed on the general category of tumor type in 78 of 80 cases (98%). ROI-DM cytology showed robust concordance with the current gold standard of LM cytology and is potentially a viable alternative to current LM techniques.

