A New Application for the Management of the MongoDB Servers

2013 ◽ Vol 2 (3) ◽ pp. 58-71
Author(s):  
Tudorica Bogdan George

The application presented in the following subsections aims to fill one of the noticeable gaps in the NoSQL domain, namely the relative lack of working tools for the systems administration of the new large-scale data storage systems. Following a comparative analysis of the NoSQL solutions on the market, the MongoDB system was chosen as the target application for this stage of development, mainly for its proven performance, flexibility, established market presence, and ease of use.
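
As a hedged illustration of the kind of administrative tooling the paper targets (not the authors' actual application), a MongoDB server's health can be polled through the standard pymongo driver; the connection address and the fields printed below are assumptions:

```python
# Illustrative sketch only: polls basic health metrics from a MongoDB server.
# The connection details (localhost:27017) are assumptions, not from the paper.
from pymongo import MongoClient

def server_health(uri="mongodb://localhost:27017/"):
    client = MongoClient(uri, serverSelectionTimeoutMS=5000)
    status = client.admin.command("serverStatus")  # standard admin command
    return {
        "host": status["host"],
        "uptime_seconds": status["uptime"],
        "current_connections": status["connections"]["current"],
        "resident_memory_mb": status["mem"]["resident"],
    }

if __name__ == "__main__":
    for key, value in server_health().items():
        print(f"{key}: {value}")
```

A management tool of the kind described would layer monitoring, user, and collection administration on top of such driver-level commands.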

Author(s):  
Alexander Thomasian

Data storage requirements have consistently increased over time. According to the latest WinterCorp survey (http://www.WinterCorp.com), "The size of the world's largest databases has tripled every two years since 2001." With database sizes in excess of 1 terabyte, there is a clear need for storage systems that are both cost effective and highly reliable. Historically, large databases were implemented on mainframe systems, which are large and expensive to purchase and maintain. In recent years, large data warehouse applications have been deployed on Linux and Windows hosts as replacements for existing mainframe systems; these are significantly less expensive to purchase and require fewer resources to run and maintain. With large databases it is less feasible, and less cost effective, to use tapes for backup and restore: the time required to copy terabytes of data from a database to a serial medium (streaming tape) is measured in hours, which would significantly degrade performance and decrease availability. Alternatives to serial backup include local replication, mirroring, or geoplexing of data. The increasing demands of larger databases must be met by less expensive disk storage systems that are nevertheless highly reliable and less susceptible to data loss. This article is organized into five sections. The first section provides background information that introduces the concepts of disk arrays. The following three sections detail the concepts used to build complex storage systems: (i) Redundant Arrays of Independent Disks (RAID); (ii) multilevel RAID (MRAID); (iii) concurrency control and storage transactions. The conclusion contains a brief survey of modular storage prototypes.
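
To make the "measured in hours" claim concrete, a back-of-the-envelope calculation shows how long a serial copy of a multi-terabyte database takes; the sustained throughput figure below is an assumed, illustrative number, not from the article:

```python
# Back-of-the-envelope estimate of serial (streaming tape) backup time.
# The 160 MB/s sustained throughput is an assumed, illustrative figure.

def backup_hours(database_terabytes: float, tape_mb_per_s: float = 160.0) -> float:
    megabytes = database_terabytes * 1024 * 1024  # TB -> MB
    return megabytes / tape_mb_per_s / 3600       # seconds -> hours

for size_tb in (1, 4, 10):
    print(f"{size_tb} TB at 160 MB/s: {backup_hours(size_tb):.1f} hours")
```

At these rates a 1 TB copy already takes roughly 1.8 hours and a 10 TB copy roughly 18, which is why the article turns to replication, mirroring, and geoplexing as alternatives.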


2019 ◽ Vol 91 ◽ pp. 08018
Author(s):  
Olga V. Mamoutova ◽  
Svetlana V. Shirokova ◽  
Mikhail B. Uspenskij ◽  
Aleksandra V. Loginova

Monitoring and diagnosing the state of data storage systems, as well as assessing reliability and troubleshooting, require a formalized health model. A comparative analysis of existing knowledge representation methods has shown that an ontological approach is well suited for this task. This paper introduces a machine-represented data storage reliability ontology with an expert health model as its baseline data. The ontology's classes cover the key terms of the reliability domain. The stated requirements for data interpretation tools allow further processing of the ontology-based knowledge base. The described ontology-based diagnostic systems have shown their applicability to data storage systems in the construction industry.
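
The paper does not publish its ontology, but a minimal sketch of what a machine-represented reliability ontology can look like, written with the rdflib library, may clarify the idea; all class and property names below are hypothetical, not taken from the paper:

```python
# Minimal, hypothetical sketch of a storage-reliability ontology in RDF.
# Class names (StorageComponent, Disk, FailureEvent, ...) are illustrative,
# not taken from the paper.
from rdflib import Graph, Namespace, RDF, RDFS

REL = Namespace("http://example.org/storage-reliability#")
g = Graph()
g.bind("rel", REL)

# Key reliability-domain terms as ontology classes.
for cls in ("StorageComponent", "Disk", "Controller",
            "FailureEvent", "HealthIndicator"):
    g.add((REL[cls], RDF.type, RDFS.Class))

# Subclass relations: disks and controllers are storage components.
g.add((REL.Disk, RDFS.subClassOf, REL.StorageComponent))
g.add((REL.Controller, RDFS.subClassOf, REL.StorageComponent))

# A property linking components to observed failure events.
g.add((REL.hasFailureEvent, RDF.type, RDF.Property))
g.add((REL.hasFailureEvent, RDFS.domain, REL.StorageComponent))
g.add((REL.hasFailureEvent, RDFS.range, REL.FailureEvent))

print(g.serialize(format="turtle"))
```

A diagnostic system can then query such a knowledge base (e.g. with SPARQL) to trace an observed health indicator back to candidate failing components.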


2019 ◽ Vol 135 ◽ pp. 04076
Author(s):  
Marina Bolsunovskaya ◽  
Svetlana Shirokova ◽  
Aleksandra Loginova

This paper is devoted to the development and application of data storage systems (DSS) and of tools for managing such systems in order to predict failures and meet fault-tolerance specifications. Nowadays, DSS are widely used for collecting data in Smart Home and Smart City management systems; for example, large data warehouses are utilized in traffic management systems. The results of an analysis of the current data storage market are presented, along with a project whose purpose is to develop a hardware and software complex for predicting failures in storage systems.
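
The paper describes a hardware and software complex rather than a specific algorithm; as a purely illustrative sketch (the telemetry features, synthetic data, and classifier choice are all assumptions), disk failure prediction is often framed as supervised classification over drive telemetry:

```python
# Illustrative sketch: failure prediction as supervised classification over
# disk telemetry. The feature names and the model choice are assumptions;
# the paper's actual complex is not described at this level of detail.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for SMART-style telemetry:
# [reallocated_sectors, read_error_rate, temperature_c]
X = rng.normal(loc=[5, 0.01, 40], scale=[10, 0.02, 5], size=(2000, 3))
# Toy label: drives with many reallocated sectors fail more often.
y = (X[:, 0] + rng.normal(0, 5, 2000) > 20).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
```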


Big data applications play an important role in real-time data processing. Apache Spark is a data processing framework with an in-memory data engine that quickly processes large data sets; it can also distribute data processing tasks across multiple computers, either on its own or in tandem with other distributed computing tools. Spark's in-memory processing cannot share data between applications, and RAM alone is insufficient for storing petabytes of data. Alluxio is a virtual distributed storage system that leverages memory for data storage and provides faster access to data held in different underlying storage systems, which helps speed up data-intensive Spark applications. In this work, the performance of applications on Spark, and on Spark running over Alluxio, is studied with respect to several storage formats (Parquet, ORC, CSV, and JSON) and four types of queries from the Star Schema Benchmark (SSB). A benchmark is developed to suggest when the Spark-Alluxio combination is suitable for big data applications. Alluxio is found suitable for applications that use databases larger than 2.6 GB storing data in JSON and CSV formats; Spark alone is found suitable for applications that use formats such as Parquet and ORC with database sizes below 2.6 GB.
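
As a hedged sketch of the kind of measurement the study describes (the Alluxio master address, file paths, and column names are assumptions, chosen to mirror the SSB lineorder table), the same SSB-style query can be timed against Parquet and JSON copies of a table served through Alluxio:

```python
# Illustrative sketch: timing one SSB-style query over Parquet vs. JSON data
# read through Alluxio. The alluxio:// address, the paths, and the column
# names are assumptions patterned on the Star Schema Benchmark.
# (Reading alluxio:// paths requires the Alluxio client jar on Spark's classpath.)
import time
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("ssb-alluxio-sketch").getOrCreate()

def timed_query(df):
    start = time.perf_counter()
    # SSB Q1.1-style aggregation: revenue under discount/quantity filters.
    (df.filter((F.col("lo_discount").between(1, 3)) & (F.col("lo_quantity") < 25))
       .agg(F.sum(F.col("lo_extendedprice") * F.col("lo_discount")))
       .collect())
    return time.perf_counter() - start

parquet_df = spark.read.parquet("alluxio://alluxio-master:19998/ssb/lineorder_parquet")
json_df = spark.read.json("alluxio://alluxio-master:19998/ssb/lineorder_json")

print(f"Parquet: {timed_query(parquet_df):.2f} s")
print(f"JSON:    {timed_query(json_df):.2f} s")
```

Repeating such runs across formats and database sizes is what lets the study locate the roughly 2.6 GB crossover point between plain Spark and Spark over Alluxio.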


Author(s):  
T. A. Dodson ◽  
E. Völkl ◽  
L. F. Allard ◽  
T. A. Nolan

The process of moving to a fully digital microscopy laboratory requires changes in instrumentation, computing hardware, computing software, data storage systems, and data networks, as well as in the operating procedures of each facility. Moving from analog to digital systems in the microscopy laboratory is similar to the instrumentation projects being undertaken in many scientific labs. A central problem of any such project is to create the best combination of hardware and software to effectively control the parameters of data collection and then to actually acquire data from the instrument. This problem is particularly acute for the microscopist who wishes to "digitize" the operation of a transmission or scanning electron microscope. Although the basic physics of each type of instrument and the type of data (images and spectra) each generates are very similar, each manufacturer approaches automation differently: both the communications interfaces and the command languages used to control the instruments vary.
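
Because every manufacturer exposes a different interface and command language, control software typically hides them behind a common abstraction. A minimal, entirely hypothetical sketch of such an adapter layer (the class and method names are illustrative, not from the article or any vendor API):

```python
# Hypothetical sketch of an adapter layer hiding vendor-specific command
# languages behind one instrument-control interface. All names are
# illustrative, not from the article or any real vendor API.
from abc import ABC, abstractmethod

class Microscope(ABC):
    """Vendor-neutral interface for controlling acquisition."""

    @abstractmethod
    def set_magnification(self, mag: int) -> None: ...

    @abstractmethod
    def acquire_image(self) -> bytes: ...

class VendorASerial(Microscope):
    """Adapter for a fictional vendor's serial command language."""

    def set_magnification(self, mag: int) -> None:
        self._send(f"MAG {mag}")

    def acquire_image(self) -> bytes:
        return self._send("ACQ")

    def _send(self, command: str) -> bytes:
        print(f"vendor-A serial command: {command}")  # stand-in for real I/O
        return b""

def capture(scope: Microscope, mag: int) -> bytes:
    scope.set_magnification(mag)
    return scope.acquire_image()

capture(VendorASerial(), 50_000)
```

Acquisition scripts then depend only on the neutral interface, so swapping instruments means writing one new adapter rather than rewriting the laboratory's software.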


Author(s):  
D. V. Gribanov

Introduction. This article is devoted to the legal regulation of digital asset turnover and to the possibilities of using distributed computing and distributed data storage systems in the activities of public authorities and entities of public control. The author notes that some national and foreign scientists who study "blockchain" technology (distributed computing and distributed data storage systems) emphasize its usefulness in different activities. The data validation procedure for digital transactions and the legal regulation of the creation, issuance, and turnover of digital assets need further attention.

Materials and methods. The research is based on general scientific methods (analysis, analogy, comparison) and on particular methods of cognition of legal phenomena and processes (interpretation of legal rules, the technical legal method, the formal legal method, and the formal logical method).

Results of the study. The author's analysis identifies several advantages of using "blockchain" technology in the sphere of public control: a particular validation system; data once entered into the distributed data storage system cannot be erased or forged; full transparency of the succession of actions taken while exercising governing powers; and automatic repetition of recurring actions. The need for fivefold validation of the exercise of governing powers is substantiated: such validation shall ensure complex control over the exercise of powers by civil society, the entities of public control, and the Russian Federation as a federal state holding sovereignty over its territory. The author has also conducted a brief analysis of judicial decisions concerning digital transactions.

Discussion and conclusion. The use of a distributed data storage system makes control easier to exercise by reducing the risks of forgery, replacement, or termination of data. The author suggests defining a digital transaction not only as actions with digital assets, but also as actions toward the modification and addition of information about legal facts, with the purpose of establishing that information in distributed data storage systems. The author also suggests using distributed data storage systems for independent validation of information about the activities of state authorities. In the author's opinion, application of "blockchain" technology may result not only in more efficient public control but also in a new form of public control: automatic control. It is concluded that there is currently no legislative basis for regulating legal relations concerning distributed data storage.
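
The claim that entered data "cannot be erased or forged" rests on hash chaining. A minimal sketch (an illustration of the underlying idea only, not any production blockchain) showing how altering one record invalidates every later link:

```python
# Minimal hash-chain sketch showing why ledger entries are tamper-evident.
# This illustrates the underlying idea only; it is not a real blockchain.
import hashlib

def link_hash(prev_hash: str, record: str) -> str:
    return hashlib.sha256((prev_hash + record).encode()).hexdigest()

def build_chain(records):
    chain, prev = [], "0" * 64  # genesis hash
    for record in records:
        h = link_hash(prev, record)
        chain.append((record, h))
        prev = h
    return chain

def verify(chain):
    prev = "0" * 64
    for record, h in chain:
        if link_hash(prev, record) != h:
            return False
        prev = h
    return True

chain = build_chain(["grant issued", "permit approved", "audit logged"])
print(verify(chain))                        # True: chain is intact
chain[1] = ("permit DENIED", chain[1][1])   # tamper with one record...
print(verify(chain))                        # False: the stored hashes no longer match
```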


2020 ◽ Vol 96 (3s) ◽ pp. 55-59
Author(s):  
А.Г. Зуев ◽  
С.С. Махов

The paper presents a brief survey of the structure of data storage systems as they evolved from traditional approaches to the technologies that define their modern form. It also considers a generalized structure and the varieties of its components, and identifies the main requirements that determine approaches to data storage system design.

