Single Vector Large Data Cardinality Structure to Handle Compressed Database in a Distributed Environment

Author(s):
B. M. Monjurul Alom
Frans Henskens
Michael Hannaford

2017, Vol 7 (1.1), pp. 237
Author(s):
MD. A R Quadri
B. Sruthi
A. D. SriRam
B. Lavanya

Java is one of the finest languages for big data because of its write-once, run-anywhere nature. The Java 8 release introduced features such as lambda expressions and streams, which are helpful for parallel computing. Although these features help with extracting, sorting, and filtering data from collections and arrays, problems remain. Streams cannot properly process very large data sets such as big data, and further problems arise when executing in a distributed environment: the streams introduced in Java 8 are restricted to computation inside a single system, with no mechanism for distributed computing over multiple systems, and streams hold their data in memory and therefore cannot support huge data sets. This paper addresses using Java 8 for massive data and operating in a distributed environment by extending the programming model with distributed streams. Distributed computing over large data under this programming model can be accomplished by introducing distributed stream frameworks.
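To make the single-system limitation concrete, here is a minimal Java 8 sketch of the stream model the paper extends; the data set and filtering predicate are illustrative assumptions, not taken from the paper. The parallel stream spreads work across the cores of one JVM only, and the whole collection must fit in that JVM's memory.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ParallelStreamDemo {
    public static void main(String[] args) {
        // Illustrative in-memory data set; a genuinely big data set would
        // not fit in a single JVM heap, which is the limitation at issue.
        List<Integer> values = Arrays.asList(42, 7, 19, 3, 88, 56, 21);

        // Lambda expression + parallel stream: extraction, filtering and
        // sorting run across the cores of ONE machine only.
        List<Integer> result = values.parallelStream()
                .filter(v -> v > 10)
                .sorted()
                .collect(Collectors.toList());

        System.out.println(result); // [19, 21, 42, 56, 88]
    }
}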


2020, Vol 8 (6), pp. 1697-1706

In today’s scenario, large enterprises are spread across different locations and continents, with a diverse presence over the globe; data is enormous, and handling such large data over distributed computing becomes critical in a real-time database system. The transaction management system for a distributed environment must ensure that a sequence of updates to stable warehouses in different locations is committed or cancelled safely as a single complete unit of work. Working with a Real-Time Database System (RTDBS), and doing so on a distributed computing system, is a tough task. When working in a distributed environment over a large database, we need to take care of the transaction time period as well as the number of transactions that are actually executed (committed) and the number of transactions that fail. Applications on a dynamic RTDBS become more complex when certain deadlines need to be met. In this paper, we carry out tests of CRUD (Create, Read, Update, and Delete) operations on transactional real-time databases in a dynamic distributed environment using the existing EDF and GEDF algorithms, and we compare these algorithms with our proposed GMTDS algorithm in standalone and distributed environments using a dynamic, self-adaptive approach to the management of transactions.
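For context, here is a minimal Java sketch of the Earliest Deadline First (EDF) ordering that the compared algorithms build on; the Txn descriptor and its fields are hypothetical, and neither GEDF nor the proposed GMTDS algorithm is reproduced here.

import java.util.Comparator;
import java.util.PriorityQueue;

public class EdfScheduler {
    // Hypothetical transaction descriptor: id plus an absolute deadline (ms).
    static final class Txn {
        final String id;
        final long deadlineMillis;
        Txn(String id, long deadlineMillis) {
            this.id = id;
            this.deadlineMillis = deadlineMillis;
        }
    }

    public static void main(String[] args) {
        // EDF: the transaction with the earliest deadline is dispatched first.
        PriorityQueue<Txn> ready =
                new PriorityQueue<>(Comparator.comparingLong(t -> t.deadlineMillis));

        ready.add(new Txn("update-stock", 5_000));
        ready.add(new Txn("read-balance", 1_000));
        ready.add(new Txn("create-order", 3_000));

        while (!ready.isEmpty()) {
            Txn next = ready.poll();
            // A real RTDBS would execute or abort here depending on
            // whether the deadline can still be met.
            System.out.println("dispatch " + next.id
                    + " (deadline " + next.deadlineMillis + " ms)");
        }
    }
}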


Author(s):  
John A. Hunt

Spectrum-imaging is a useful technique for comparing different processing methods on very large data sets which are identical for each method. This paper is concerned with comparing methods of electron energy-loss spectroscopy (EELS) quantitative analysis on the Al-Li system. The spectrum-image analyzed here was obtained from an Al-10at%Li foil aged to produce δ' precipitates that can span the foil thickness. Two 1024-channel EELS spectra offset in energy by 1 eV were recorded and stored at each pixel in the 80x80 spectrum-image (25 Mbytes). An energy range of 39-89 eV (20 channels/eV) is represented. During processing the spectra are either subtracted to create an artifact-corrected difference spectrum, or the energy offset is numerically removed and the spectra are added to create a normal spectrum. The spectrum-images are processed into 2D floating-point images using methods and software described in [1].
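The two processing modes described above come down to per-channel arithmetic; the following Java sketch illustrates them on synthetic arrays. The channel count and the 20-channel (1 eV) offset follow the text, while the data, shift direction, and method names are assumptions.

public class SpectrumImageDemo {
    static final int CHANNELS = 1024;
    static final int OFFSET = 20; // 1 eV offset at 20 channels/eV, per the text

    // Artifact-corrected difference spectrum: channel-wise subtraction
    // of the two energy-offset acquisitions.
    static double[] difference(double[] a, double[] b) {
        double[] d = new double[CHANNELS];
        for (int i = 0; i < CHANNELS; i++) d[i] = a[i] - b[i];
        return d;
    }

    // Normal spectrum: numerically remove the 1 eV offset by shifting
    // one spectrum 20 channels, then add the overlapping channels.
    static double[] sum(double[] a, double[] b) {
        double[] s = new double[CHANNELS - OFFSET];
        for (int i = 0; i < s.length; i++) s[i] = a[i + OFFSET] + b[i];
        return s;
    }

    public static void main(String[] args) {
        double[] a = new double[CHANNELS], b = new double[CHANNELS];
        for (int i = 0; i < CHANNELS; i++) { a[i] = i; b[i] = i + 5.0; } // synthetic data
        System.out.println(difference(a, b)[0] + " " + sum(a, b)[0]);
    }
}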


Author(s):  
Thomas W. Shattuck
James R. Anderson
Neil W. Tindale
Peter R. Buseck

Individual particle analysis involves the study of tens of thousands of particles using automated scanning electron microscopy and elemental analysis by energy-dispersive, x-ray emission spectroscopy (EDS). EDS produces large data sets that must be analyzed using multi-variate statistical techniques. A complete study uses cluster analysis, discriminant analysis, and factor or principal components analysis (PCA). The three techniques are used in the study of particles sampled during the FeLine cruise to the mid-Pacific ocean in the summer of 1990. The mid-Pacific aerosol provides information on long range particle transport, iron deposition, sea salt ageing, and halogen chemistry.

Aerosol particle data sets suffer from a number of difficulties for pattern recognition using cluster analysis. There is a great disparity in the number of observations per cluster and the range of the variables in each cluster. The variables are not normally distributed, they are subject to considerable experimental error, and many values are zero, because of finite detection limits. Many of the clusters show considerable overlap, because of natural variability, agglomeration, and chemical reactivity.
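One routine remedy for the range disparity noted above is to standardize each variable before computing cluster distances; the following Java sketch shows a generic z-score transform, a common preprocessing choice rather than necessarily the one used in this study.

public class Standardize {
    // Z-score each column so that variables with very different ranges
    // contribute comparably to the cluster distance metric.
    static double[][] zScore(double[][] data) {
        int rows = data.length, cols = data[0].length;
        double[][] out = new double[rows][cols];
        for (int j = 0; j < cols; j++) {
            double mean = 0.0, var = 0.0;
            for (int i = 0; i < rows; i++) mean += data[i][j];
            mean /= rows;
            for (int i = 0; i < rows; i++) {
                double d = data[i][j] - mean;
                var += d * d;
            }
            double sd = Math.sqrt(var / rows);
            for (int i = 0; i < rows; i++)
                out[i][j] = sd > 0.0 ? (data[i][j] - mean) / sd : 0.0;
        }
        return out;
    }

    public static void main(String[] args) {
        // Two variables with disparate ranges, as in the aerosol data.
        double[][] data = { {1, 1000}, {2, 2000}, {3, 3000} };
        System.out.println(java.util.Arrays.deepToString(zScore(data)));
    }
}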


Author(s):  
Hakan Ancin

This paper presents methods for performing detailed quantitative automated three dimensional (3-D) analysis of cell populations in thick tissue sections while preserving the relative 3-D locations of cells. Specifically, the method disambiguates overlapping clusters of cells, and accurately measures the volume, 3-D location, and shape parameters for each cell. Finally, the entire population of cells is analyzed to detect patterns and groupings with respect to various combinations of cell properties. All of the above is accomplished with zero subjective bias.

In this method, a laser-scanning confocal light microscope (LSCM) is used to collect optical sections through the entire thickness (100-500 μm) of fluorescently-labelled tissue slices. The acquired stack of optical slices is first subjected to axial deblurring using the expectation maximization (EM) algorithm. The resulting isotropic 3-D image is segmented using a spatially-adaptive Poisson-based image segmentation algorithm with region-dependent smoothing parameters. Extracting the voxels that were labelled as "foreground" into an active voxel data structure results in a large data reduction.
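The data reduction in the final step comes from storing only the voxels labelled foreground instead of the full dense volume; a minimal Java sketch of such an active-voxel structure follows, with field and method names that are illustrative rather than taken from the paper.

import java.util.ArrayList;
import java.util.List;

public class ActiveVoxels {
    // One foreground voxel: its 3-D grid location plus intensity.
    static final class Voxel {
        final int x, y, z;
        final float intensity;
        Voxel(int x, int y, int z, float intensity) {
            this.x = x; this.y = y; this.z = z; this.intensity = intensity;
        }
    }

    // Keep only voxels the segmentation labelled "foreground"; for
    // sparsely labelled tissue this is a large reduction compared
    // with storing the full dense volume.
    static List<Voxel> extract(float[][][] volume, boolean[][][] foreground) {
        List<Voxel> active = new ArrayList<>();
        for (int z = 0; z < volume.length; z++)
            for (int y = 0; y < volume[z].length; y++)
                for (int x = 0; x < volume[z][y].length; x++)
                    if (foreground[z][y][x])
                        active.add(new Voxel(x, y, z, volume[z][y][x]));
        return active;
    }

    public static void main(String[] args) {
        float[][][] vol = new float[2][2][2];
        boolean[][][] fg = new boolean[2][2][2];
        vol[1][0][1] = 7.5f; fg[1][0][1] = true;     // one labelled voxel
        System.out.println(extract(vol, fg).size()); // 1
    }
}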


1980, Vol 19 (04), pp. 187-194
Author(s):
J.-Ph. Berney
R. Baud
J.-R. Scherrer

It is well known that Frame Selection Systems (FSS) have proved both popular and effective in physician-machine and patient-machine dialogue. A formal algorithm for definition of a Frame Selection System for handling man-machine dialogue is presented here. It is also shown how natural medical language can be handled using the approach of a tree branching logic. This logic is based upon ordered series of selections which enclose a syntactic structure. The external specifications are discussed with regard to convenience and efficiency. Since all communication between the user and the application programmes is handled only by FSS software, the FSS contributes to achieving modularity, and therefore also maintainability, in a transaction-oriented system with a large data base and concurrent accesses.
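To make the tree-branching logic concrete, the following minimal Java sketch models a dialogue as a tree of frames, each offering an ordered series of selections; the frame names and API are invented for illustration.

import java.util.LinkedHashMap;
import java.util.Map;

public class FrameTree {
    // A frame offers an ordered set of selections, each leading to a
    // child frame (a leaf ends the dialogue branch).
    static final class Frame {
        final String prompt;
        final Map<String, Frame> selections = new LinkedHashMap<>();
        Frame(String prompt) { this.prompt = prompt; }
        Frame add(String choice, Frame child) {
            selections.put(choice, child);
            return this;
        }
    }

    public static void main(String[] args) {
        Frame dosage = new Frame("Select dosage");
        Frame exam = new Frame("Select examination type");
        Frame root = new Frame("Select entry type")
                .add("prescription", dosage)
                .add("examination", exam);
        // Traversing root -> "prescription" -> dosage mirrors one
        // ordered series of selections in the dialogue.
        System.out.println(root.selections.get("prescription").prompt);
    }
}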


2020, Vol 39 (5), pp. 6419-6430
Author(s):
Dusan Marcek

To forecast time series data, two methodological frameworks of statistical and computational intelligence modelling are considered. The statistical approach is based on the theory of invertible ARIMA (Auto-Regressive Integrated Moving Average) models with the Maximum Likelihood (ML) estimation method. As a competitive tool to statistical forecasting models, we use the popular classic neural network (NN) of perceptron type. To train the NN, the Back-Propagation (BP) algorithm and heuristics such as the genetic and micro-genetic algorithms (GA and MGA) are implemented on the large data set. A comparative analysis of the selected learning methods is performed and evaluated. From the performed experiments we find that the optimal population size is likely 20, with the lowest training time of all NNs trained by the evolutionary algorithms, while the prediction accuracy level is lower but still acceptable to managers.
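As a minimal illustration of the gradient-based training the comparison uses, the following Java sketch fits a single linear neuron to a toy series by the delta rule (the one-neuron special case of back-propagation); the data, learning rate, and epoch count are assumptions. The GA and MGA variants discussed in the paper would instead evolve the weights over a population.

public class PerceptronSketch {
    public static void main(String[] args) {
        // Toy autoregressive task: predict x[t] from x[t-1].
        double[] x = {1.0, 1.2, 1.4, 1.6, 1.8, 2.0};
        double w = 0.0, b = 0.0, lr = 0.05;

        // Plain gradient descent on squared error; a GA/MGA would
        // instead evolve (w, b) over a population of candidates.
        for (int epoch = 0; epoch < 2000; epoch++) {
            for (int t = 1; t < x.length; t++) {
                double pred = w * x[t - 1] + b;
                double err = pred - x[t];
                w -= lr * err * x[t - 1]; // dE/dw
                b -= lr * err;            // dE/db
            }
        }
        System.out.printf("w=%.3f b=%.3f next=%.3f%n", w, b, w * 2.0 + b);
    }
}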


Author(s):  
Vivek Raich
Pankaj Maurya

In the era of Information Technology, big data stores keep growing. As a result, huge amounts of data are available to decision makers, and this has resulted in the progress of information technology and its wide growth in many areas of business, engineering, medicine, and scientific study. Big data means not only data that is large in size, but also data of several types that are not easy to handle and require dedicated technology to manage. Because data continuously increases in this way, it is important to study and manage these data sets by adjusting to their requirements so that the necessary information can be obtained. The aim of this paper is to analyze some of the analytic methods and tools that can be applied to large data. In addition, applications of Big Data are analyzed, considering how decision makers work on big data and use the resulting information for different applications.

