Indexing Blocks to Reduce Space and Time Requirements for Searching Large Data Files

Author(s):
Tzuhsien Wu
Hao Shyng
Jerry Chou
Bin Dong
Kesheng Wu

Author(s):
Bill Trevillion

Abstract Radian Corporation has developed extensive data display capabilities to analyze vibration and acoustic data from structures and rotating equipment. The Machinery Interactive Display and Analysis System (MIDAS) displays data collected through its own acquisition functions. The graphics capabilities include displaying spectra in three-dimensional waterfall and X-Y formats. Both types of plots can relate vibrations to time, equipment speed, or process parameters. Using menu-driven parameter selection, data can be displayed in the formats most useful for analysis. The system runs on a popular mini-computer and can be used with a wide variety of graphics terminals, workstations, and printer/plotters. The software was designed and written for interactive display and plotting, and a batch plotting mode facilitates automatic plotting of large data files. The user can define display formats for the analysis of noise and vibration problems in the electric utility, chemical processing, paper, and automotive industries. This paper describes the history and development of the graphics capabilities of the MIDAS system. As illustrated in the examples, the system has proven efficient and economical for displaying large quantities of data.
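
The waterfall display described above can be approximated in a few lines of plotting code. The following is a minimal sketch, assuming NumPy and matplotlib with synthetic spectra (it is not part of MIDAS itself), of how vibration spectra can be stacked against equipment speed to produce a waterfall view:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: one vibration spectrum per machine speed step.
speeds = np.linspace(1000, 3000, 20)   # rpm
freqs = np.linspace(0, 500, 200)       # Hz
rng = np.random.default_rng(0)
spectra = np.array([
    np.exp(-((freqs - s / 10) ** 2) / 500) + 0.05 * rng.random(freqs.size)
    for s in speeds
])                                      # amplitude vs. frequency

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for spd, spec in zip(speeds, spectra):
    # Each spectrum is one trace, offset along the speed axis (waterfall effect).
    ax.plot(freqs, np.full_like(freqs, spd), spec, color="steelblue")
ax.set_xlabel("Frequency (Hz)")
ax.set_ylabel("Speed (rpm)")
ax.set_zlabel("Amplitude")
plt.show()
```

Substituting time or a process parameter for speed on the offset axis gives the other waterfall formats mentioned in the abstract.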


2016
Vol 13 (1)
pp. 181-186
Author(s):
Dawna M. Drum
Andrew Pulvermacher

ABSTRACT Modern organizations are inundated with data, and they often struggle to organize it efficiently and effectively in order to get the most value from it. The context of this case is thus situated in current business practice. Students are given large data files extracted from an enterprise system. They must use Microsoft Access and Excel to summarize and organize the data to create a dynamic profit and loss statement. Basic Excel skills and general accounting knowledge are assumed, while Access knowledge is not. The Teaching Notes provide solutions and are organized to allow instructors to provide minimal guidance or fully annotated directions.
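
The case itself uses Microsoft Access and Excel; purely as an analogous illustration, the sketch below uses pandas with made-up account names and amounts to show the kind of summarization the students perform, pivoting a general-ledger extract into a simple profit and loss layout:

```python
import pandas as pd

# Hypothetical extract from an enterprise system; column names are illustrative.
gl = pd.DataFrame({
    "account": ["Sales", "Sales", "COGS", "COGS", "SG&A"],
    "period":  ["2016-01", "2016-02", "2016-01", "2016-02", "2016-01"],
    "amount":  [120_000, 135_000, -70_000, -78_000, -25_000],
})

# Pivot into a profit-and-loss layout: accounts as rows, periods as columns.
pl = gl.pivot_table(index="account", columns="period",
                    values="amount", aggfunc="sum", fill_value=0)
pl.loc["Net income"] = pl.sum()   # bottom line per period
print(pl)
```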


1997
Vol 3 (S2)
pp. 931-932
Author(s):
Ian M. Anderson
Jim Bentley

Recent developments in instrumentation and computing power have greatly improved the potential for quantitative imaging and analysis. For example, products are now commercially available that allow the practical acquisition of spectrum images, where an EELS or EDS spectrum can be acquired from a sequence of positions on the specimen. However, such data files typically contain megabytes of information and may be difficult to manipulate and analyze conveniently or systematically. A number of techniques are being explored for the purpose of analyzing these large data sets. Multivariate statistical analysis (MSA) provides a method for analyzing the raw data set as a whole. The basis of the MSA method has been outlined by Trebbia and Bonnet. MSA has a number of strengths relative to other methods of analysis. First, it is broadly applicable to any series of spectra or images. Applications include characterization of grain boundary segregation (position variation), of channeling-enhanced microanalysis (orientation variation), or of beam damage (time variation of spectra).
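
As an illustration of the general MSA idea rather than the authors' specific implementation, the sketch below (assuming NumPy and scikit-learn, with synthetic counts) unfolds a spectrum image into a matrix of spectra and decomposes it with principal component analysis, a common multivariate approach:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical spectrum image: a 64 x 64 map of positions, each with a
# 1024-channel EELS/EDS spectrum (filled here with synthetic Poisson counts).
rng = np.random.default_rng(1)
spectrum_image = rng.poisson(lam=5.0, size=(64, 64, 1024)).astype(float)

# Unfold to a 2-D data matrix: one row per probe position, one column per channel.
n_rows, n_cols, n_channels = spectrum_image.shape
data = spectrum_image.reshape(n_rows * n_cols, n_channels)

# Decompose into a small number of components.
pca = PCA(n_components=5)
scores = pca.fit_transform(data)     # component weight at each position
loadings = pca.components_           # component "spectra"

# Refold the scores into maps that can be displayed like images.
score_maps = scores.reshape(n_rows, n_cols, -1)
print(pca.explained_variance_ratio_)
```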


2019
Vol 16 (9)
pp. 3824-3829
Author(s):
Deepak Ahlawat
Deepali Gupta

Due to advancement in the technological world, there is a great surge in data. The main sources generating such large amounts of data are social websites, internet sites, etc. The large data files are combined to create a big data architecture, and managing data files at such volume is not easy. Therefore, modern techniques have been developed to manage bulk data. To organize and utilize such big data, the Hadoop Distributed File System (HDFS) architecture from Hadoop is used when traditional methods are insufficient to manage the data. In this paper, a novel clustering algorithm is implemented to manage a large amount of data. The concepts and frameworks of big data are studied, and a novel algorithm is developed using K-means clustering with cosine-based similarity. The developed clustering algorithm is evaluated using precision and recall, and the results show that it successfully addresses the big data management issue.
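
The abstract does not reproduce the implementation, but a common way to combine K-means with cosine similarity is spherical K-means: L2-normalize the vectors so that Euclidean K-means effectively clusters by cosine similarity. A minimal sketch with scikit-learn and made-up documents:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

# Hypothetical records extracted from large data files (e.g., social-media text).
documents = [
    "server log error disk failure",
    "disk error on storage server",
    "new social media post about sports",
    "sports highlights posted on social media",
]

# TF-IDF vectors; L2-normalizing them makes Euclidean K-means behave like
# clustering by cosine similarity (spherical K-means).
vectors = normalize(TfidfVectorizer().fit_transform(documents))

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)
print(kmeans.labels_)
```

Precision and recall, as used in the paper, can then be computed by comparing the cluster labels against known categories for a labelled sample.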


1990
Vol 73 (7)
pp. 1945-1955
Author(s):
V. Ducrocq
D. Boichard
B. Bonaiti
A. Barbat
M. Briend

2017
Author(s):
Saima Sultana Tithi
Roderick V. Jensen
Liqing Zhang

Abstract Identifying viruses and phages in a metagenomics sample has important implications for improving human health, preventing viral outbreaks, and developing personalized medicine. With the rapid increase in data files generated by next-generation sequencing, existing tools for identifying and annotating viruses and phages in metagenomics samples suffer from long running times. In this paper, we developed a stand-alone pipeline, FastViromeExplorer, for rapid identification and abundance quantification of viruses and phages in big metagenomic data. Both real and simulated data validated FastViromeExplorer as a reliable tool that accurately identifies viruses and their abundances in large data in a time-efficient manner.
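
The abstract does not detail the pipeline's internals; the sketch below is only a conceptual illustration (with hypothetical read assignments and genome lengths, not FastViromeExplorer's actual method) of the kind of length-normalized abundance quantification such a tool reports:

```python
from collections import Counter

# Hypothetical read-to-reference assignments, e.g., from a fast alignment step;
# the real pipeline, databases, and thresholds differ.
assignments = ["phage_T4", "phage_T4", "crAssphage", "phage_T4", "crAssphage"]
genome_lengths = {"phage_T4": 168_903, "crAssphage": 97_065}  # bp

read_counts = Counter(assignments)

# Length-normalized abundance: reads per kilobase of reference genome.
abundance = {
    virus: count / (genome_lengths[virus] / 1_000)
    for virus, count in read_counts.items()
}
for virus, value in sorted(abundance.items(), key=lambda kv: -kv[1]):
    print(f"{virus}\t{read_counts[virus]} reads\t{value:.2f} reads/kb")
```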

