FORMATION OF METASUBJECT COMPETENCIES IN THE COURSE “INFORMATION TECHNOLOGIES” BY MEANS OF THE BIG DATA PROCESSING LANGUAGE R

Author(s):  
D. M. Nazarov

The article describes training methods in the course “Information Technologies” for future bachelors in Economics, Management, Finance, and Business Informatics, aimed at developing students' metasubject competencies through the use of data processing tools in the R language. The metasubject essence of the work is to revisit traditional economic knowledge and skills through different forms of presentation of the same data sets. In the laboratory work described in the article, future bachelors learn to use the basic tools of the R language and acquire specific skills in RStudio using the example of processing currency exchange data. The methods are described in the form of the traditional Key-by-Key technology, which is widely used in teaching information technologies.
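As an illustration of the kind of exercise described (presenting one exchange-rate data set in several forms), the sketch below uses Python with invented quotes; the course itself works in R and RStudio, so this is only an analogy, not the article's material.

```python
import pandas as pd

# Hypothetical daily exchange-rate quotes (currency, date, rate against a base currency).
quotes = pd.DataFrame({
    "date":     ["2021-03-01", "2021-03-01", "2021-03-02", "2021-03-02"],
    "currency": ["USD", "EUR", "USD", "EUR"],
    "rate":     [74.43, 89.91, 74.10, 89.56],
})

# The same data set in three presentation forms:
long_form = quotes                                    # 1) raw long table
wide_form = quotes.pivot(index="date",                # 2) wide table: one column per currency
                         columns="currency",
                         values="rate")
summary = quotes.groupby("currency")["rate"].agg(     # 3) summary statistics per currency
    ["mean", "min", "max"])

print(wide_form)
print(summary)
```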

2021
pp. 000276422110216
Author(s):  
Kazimierz M. Slomczynski
Irina Tomescu-Dubrow
Ilona Wysmulek

This article proposes a new approach to analyze protest participation measured in surveys of uneven quality. Because single international survey projects cover only a fraction of the world’s nations in specific periods, researchers increasingly turn to ex-post harmonization of different survey data sets not a priori designed as comparable. However, very few scholars systematically examine the impact of the survey data quality on substantive results. We argue that the variation in source data, especially deviations from standards of survey documentation, data processing, and computer files—proposed by methodologists of Total Survey Error, Survey Quality Monitoring, and Fitness for Intended Use—is important for analyzing protest behavior. In particular, we apply the Survey Data Recycling framework to investigate the extent to which indicators of attending demonstrations and signing petitions in 1,184 national survey projects are associated with measures of data quality, controlling for variability in the questionnaire items. We demonstrate that the null hypothesis of no impact of measures of survey quality on indicators of protest participation must be rejected. Measures of survey documentation, data processing, and computer records, taken together, explain over 5% of the intersurvey variance in the proportions of the populations attending demonstrations or signing petitions.
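As a rough illustration of the kind of analysis described, the hedged sketch below regresses a protest indicator on survey-quality scores while controlling for questionnaire-item variability; all variable names and data are invented, and the actual Survey Data Recycling models are considerably more elaborate.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 1184  # number of national survey projects analyzed in the article

# Hypothetical survey-level data: one row per national survey project.
df = pd.DataFrame({
    "doc_quality":  rng.normal(size=n),          # survey documentation score (invented)
    "proc_quality": rng.normal(size=n),          # data-processing quality score (invented)
    "file_quality": rng.normal(size=n),          # computer-file/record quality score (invented)
    "item_wording": rng.integers(0, 3, size=n),  # questionnaire-item variant (control)
})
df["pct_demonstrating"] = 10 + 1.5 * df["doc_quality"] + rng.normal(scale=5, size=n)

# Baseline: protest indicator explained by questionnaire-item variability only.
base = smf.ols("pct_demonstrating ~ C(item_wording)", data=df).fit()
# Full model: add the three data-quality measures.
full = smf.ols(
    "pct_demonstrating ~ C(item_wording) + doc_quality + proc_quality + file_quality",
    data=df,
).fit()

# Share of intersurvey variance attributable to the quality measures.
print(f"Added R^2 from quality measures: {full.rsquared - base.rsquared:.3f}")
```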


Author(s):  
David Japikse
Oleg Dubitsky
Kerry N. Oliphant
Robert J. Pelton
Daniel Maynes
...  

In the course of developing advanced data processing and advanced performance models, as presented in companion papers, a number of basic scientific and mathematical questions arose. This paper deals with questions of uniqueness, convergence, statistical accuracy, training, and evaluation methodology. The process of bringing together large data sets and utilizing them, with outside data supplementation, is considered in detail. After these questions have been examined carefully, emphasis is placed on how the new models, based on highly refined data processing, can best be used in the design world. The impact of this work on future designs is discussed. It is expected that this methodology will assist designers in moving beyond contemporary design practices.
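The abstract does not reproduce the models themselves; the toy sketch below, with invented data and a quadratic surrogate standing in for a performance model, only illustrates one of the evaluation questions raised: whether supplementing in-house data with outside data improves accuracy on held-out cases.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n, noise):
    # Invented relationship between a normalized flow coefficient and an efficiency-like quantity.
    x = rng.uniform(0.2, 1.0, size=n)
    y = 0.85 - 0.3 * (x - 0.6) ** 2 + rng.normal(scale=noise, size=n)
    return x, y

x_in, y_in = make_data(40, 0.01)      # in-house test data
x_out, y_out = make_data(120, 0.03)   # noisier outside data used for supplementation
x_val, y_val = make_data(30, 0.01)    # held-out evaluation set

def fit_and_score(x, y):
    # Simple quadratic surrogate "performance model" and its RMSE on held-out cases.
    coeffs = np.polyfit(x, y, deg=2)
    pred = np.polyval(coeffs, x_val)
    return np.sqrt(np.mean((pred - y_val) ** 2))

print("RMSE, in-house data only:", fit_and_score(x_in, y_in))
print("RMSE, with supplementation:", fit_and_score(np.r_[x_in, x_out], np.r_[y_in, y_out]))
```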


Author(s):  
Abou_el_ela Abdou Hussein

Day by day, advanced web technologies have led to tremendous growth in the volume of data generated. This mountain of large, dispersed data sets gives rise to the phenomenon called big data: collections of massive, heterogeneous, unstructured, and complex data sets. The big data life cycle can be represented as collecting (capturing), storing, distributing, manipulating, interpreting, analyzing, investigating, and visualizing big data. Traditional techniques such as relational database management systems (RDBMS) cannot handle big data because of their inherent limitations, so advances in computing architecture are required to handle both the data storage requirements and the heavy processing needed to analyze huge volumes and varieties of data economically. Among the many technologies for manipulating big data, one of the most prominent and well-known is Hadoop, an open-source distributed data processing framework designed to overcome the problems of handling big data. Apache Hadoop is based on the Google File System and the MapReduce programming paradigm. In this paper we survey big data characteristics, starting from the first three V's, which researchers have extended over time to more than fifty-six V's, and we compare the work of different researchers to arrive at the best representation and a precise clarification of all the big data V characteristics. We highlight the challenges facing big data processing and show how they can be overcome using Hadoop and its application to processing big data sets as a solution for various problems in a distributed, cloud-based environment. The paper mainly focuses on the different components of Hadoop, such as Hive, Pig, and HBase. It also gives a full description of Hadoop's pros and cons and of improvements that address Hadoop's problems, including a proposed cost-efficient scheduler algorithm for heterogeneous Hadoop systems.
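The MapReduce paradigm referred to above can be illustrated with a minimal word-count sketch; it is written in Python rather than Hadoop's native Java and runs in a single process instead of on a cluster, so it only mirrors the map, shuffle/sort, and reduce phases conceptually.

```python
from itertools import groupby
from operator import itemgetter

# Toy input standing in for an HDFS block of text.
lines = ["hadoop stores big data", "mapreduce processes big data"]

def mapper(lines):
    # Map phase: emit a (word, 1) pair for every word in the input split.
    for line in lines:
        for word in line.split():
            yield word, 1

def reducer(pairs):
    # Reduce phase: Hadoop sorts and groups pairs by key between the phases;
    # sorted() plays that role in this in-process sketch.
    for word, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield word, sum(count for _, count in group)

for word, total in reducer(mapper(lines)):
    print(word, total)
```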


Author(s):  
V. A. Sizov
A. D. Kirov

The article is devoted to the problem of developing an analytical data processing system (ASOD) for monitoring information security within the information security management system of modern companies that conduct their main activities in cyberspace and use cloud infrastructure. Based on an analysis of modern information technologies for ensuring the information security of cloud infrastructure, of the most popular products for this purpose, and of existing scientific approaches, a formalized approach to the synthesis of an analytical data processing system for monitoring the information security of an informatization object using cloud infrastructure is proposed. The approach takes into account the usefulness of the information technologies employed from the viewpoint of information security. A general model of the structure of the information support of the analytical data processing system for monitoring information security is presented, together with a model of the dependence of the usefulness of an information technology on time and on the ratio of the skill level of an information security specialist to that of an attacker. The quality of the information security monitoring system is used as the criterion in the first optimization model, subject to the following constraints: a limit on the time for making a decision on an incident; a limit on the quality of analysis of information security events by the analytical data processing system; and a requirement that the data analysis functions be compatible with the types of data on information security events. The results obtained for the second model show a logically consistent dependence of the usefulness of an information technology on time and on the ratio of the skill level of the information security specialist to that of the attacker. Particular models of the structure of the information support of the ASOD are also presented; they make it possible to determine a rational structure of the information support of the ASOD according to particular criteria. The following particular criteria are used: the maximin criterion of the usefulness of the information support of the ASOD for monitoring the information security of an informatization object in the cloud infrastructure, and the criterion of maximum relevance of the information support distributed over the nodes of the cloud infrastructure for systems with a low degree of centralization of management.
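A minimal sketch of the maximin selection rule mentioned among the particular criteria, with invented usefulness scores, might look as follows.

```python
# Hypothetical usefulness scores of candidate information-support structures
# (keys) under different attack scenarios (list entries); all values invented.
usefulness = {
    "variant_A": [0.62, 0.55, 0.48],
    "variant_B": [0.70, 0.40, 0.52],
    "variant_C": [0.58, 0.57, 0.56],
}

# Maximin criterion: pick the variant whose worst-case usefulness is largest.
best = max(usefulness, key=lambda v: min(usefulness[v]))
print(best, min(usefulness[best]))   # -> variant_C 0.56
```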


2022
Vol 55 (1)
Author(s):  
Nie Zhao
Chunming Yang
Fenggang Bian
Daoyou Guo
Xiaoping Ouyang

In situ synchrotron small-angle X-ray scattering (SAXS) is a powerful tool for studying dynamic processes during material preparation and application. The processing and analysis of the large data sets generated by in situ X-ray scattering experiments are often tedious and time consuming. However, data processing software for in situ experiments is relatively rare, especially for grazing-incidence small-angle X-ray scattering (GISAXS). This article presents an open-source software suite (SGTools) for data processing and analysis of SAXS and GISAXS experiments. The processing modules in this software include (i) raw data calibration and background correction; (ii) data reduction by multiple methods; (iii) animation generation and intensity mapping for in situ X-ray scattering experiments; and (iv) further data analysis for samples with a degree of order and interface correlation. The article describes the main features and framework of SGTools, and the workflow of the software is elucidated to allow users to develop new features. Three examples are presented to illustrate the use of SGTools for dealing with SAXS and GISAXS data. Finally, the limitations and planned future features of the software are discussed.
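As a generic illustration of two of the reduction steps listed above (background correction and azimuthal averaging of a 2D detector image into a 1D profile), the sketch below uses plain NumPy with invented images and geometry; it does not reproduce SGTools' own API.

```python
import numpy as np

# Invented sample and background detector images (Poisson counting noise).
sample = np.random.poisson(50, size=(512, 512)).astype(float)
background = np.random.poisson(45, size=(512, 512)).astype(float)

corrected = sample - background                      # background correction

# Pixel radii from an assumed beam centre at the image middle.
cy, cx = 256, 256
y, x = np.indices(corrected.shape)
r = np.hypot(y - cy, x - cx).astype(int)

# Azimuthal average: mean corrected intensity in each integer-radius bin.
counts = np.bincount(r.ravel())
sums = np.bincount(r.ravel(), weights=corrected.ravel())
profile = sums / np.maximum(counts, 1)

# 1D intensity vs. radius; convertible to I(q) given the experimental geometry.
print(profile[:10])
```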


2018
Vol 10 (3)
pp. 76-90
Author(s):  
Ye Tao
Xiaodong Wang
Xiaowei Xu

This article describes how rapidly growing data volumes require systems able to handle massive, heterogeneous, unstructured data sets, whereas most existing mature transaction processing systems are built on relational databases with structured data. The authors design a hybrid development framework that offers greater scalability and flexibility for data analysis and reporting while keeping maximum compatibility with, and links to, the legacy platforms on which the transaction business logic runs. Data, service, and user interface layers are implemented as a toolset stack for developing applications with information retrieval, data processing, analysis, and visualization functionality. A use case of healthcare data integration, in which information is collected and aggregated from diverse sources, is presented as an example. The workflow and a simulation of data processing and visualization are also discussed to validate the effectiveness of the proposed framework.
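A hedged sketch of the aggregation step in such a use case, normalizing records from a relational extract and a document store to one schema before analysis, is shown below; all field names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical records from two heterogeneous sources.
relational_rows = pd.DataFrame({
    "patient_id": [101, 102],
    "sys_bp":     [120, 135],
    "visit_date": ["2018-05-01", "2018-05-03"],
})
document_rows = [
    {"patient": {"id": 101}, "vitals": {"systolic": 118}, "date": "2018-06-11"},
    {"patient": {"id": 103}, "vitals": {"systolic": 142}, "date": "2018-06-12"},
]

# Flatten the nested documents and map both sources onto one schema.
docs = pd.json_normalize(document_rows).rename(columns={
    "patient.id": "patient_id",
    "vitals.systolic": "sys_bp",
    "date": "visit_date",
})
unified = pd.concat([relational_rows, docs], ignore_index=True)
print(unified.sort_values(["patient_id", "visit_date"]))
```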


2019
Vol 52 (2)
pp. 472-477
Author(s):  
Feng Yu
Qisheng Wang
Minjun Li
Huan Zhou
Ke Liu
...  

With the popularity of hybrid pixel array detectors, hundreds of diffraction data sets are collected at a biological macromolecular crystallography (MX) beamline every day, so manual processing and record keeping become very time-consuming and error-prone. Aquarium is an automatic data processing and experiment information management system designed for synchrotron radiation MX beamlines. It is composed of a data processing module, a daemon module and a web site module. Before an experiment, the sample information can be registered in a database. The daemon module submits data processing jobs to a high-performance cluster as soon as collection of a data set is completed. The data processing module automatically processes data sets from data reduction through to model building if an anomalous signal is available. The web site module can be used to monitor and inspect the data processing results.
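A hedged sketch of the daemon behaviour described above might look as follows; the watch directory, completion-marker convention and scheduler command are assumptions for illustration, not Aquarium's actual interface.

```python
import subprocess
import time
from pathlib import Path

# Assumed directory where the beamline writes one subdirectory per data set,
# plus a marker file once collection of that data set has finished.
WATCH_DIR = Path("/data/beamline/collections")
SUBMITTED = set()

def submit_processing(dataset_dir: Path) -> None:
    # Hand the completed data set to a cluster-managed processing pipeline
    # (SLURM's sbatch and the script name are assumptions for this sketch).
    subprocess.run(["sbatch", "process_mx.sh", str(dataset_dir)], check=True)

while True:
    for marker in WATCH_DIR.glob("*/collection_complete"):
        dataset = marker.parent
        if dataset not in SUBMITTED:
            submit_processing(dataset)
            SUBMITTED.add(dataset)
    time.sleep(30)   # poll every 30 seconds
```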

