A plugin-based approach to data analysis for the AMS experiment on the ISS

2019 ◽  
Vol 214 ◽  
pp. 05038
Author(s):  
Valerio Formato

In many HEP experiments, a typical data analysis workflow requires each user to read the experiment data in order to extract meaningful information and produce the plots relevant to their analysis. Multiple users accessing the same data results in redundant reads, which could be factorized, improving the CPU efficiency of the analysis jobs and relieving stress on the storage infrastructure. To address this issue we present a modular and lightweight solution in which the users' code is embedded in different "analysis plugins", which are then collected and loaded at runtime for execution, so that the data is read only once and shared between all the plugins. This solution was developed for one of the data analysis groups within the AMS collaboration, but it is easily extendable to all kinds of analyses and workloads that need I/O access to AMS data or custom data formats, and it can even be adapted with little effort to data from another HEP experiment. This framework could then be easily embedded into an "analysis train"; we will discuss a possible implementation and different ways to optimise CPU efficiency and execution time.
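The read-once, share-with-all-plugins idea can be illustrated with a minimal Python sketch. This is not the framework described in the abstract: the plugin classes, the `process` method, the event fields and `run_plugins` are all hypothetical names chosen for illustration, and runtime discovery and loading of plugins is left out.

```python
from typing import Dict, Iterable, List

# Hypothetical event representation: one dictionary of named values per event.
Event = Dict[str, float]


class CountPlugin:
    """Illustrative plugin: counts events whose 'rigidity' passes a threshold."""

    def __init__(self, min_rigidity: float) -> None:
        self.min_rigidity = min_rigidity
        self.count = 0

    def process(self, event: Event) -> None:
        if event["rigidity"] >= self.min_rigidity:
            self.count += 1


class SumPlugin:
    """Illustrative plugin: accumulates the total of one event field."""

    def __init__(self, field: str) -> None:
        self.field = field
        self.total = 0.0

    def process(self, event: Event) -> None:
        self.total += event[self.field]


def run_plugins(events: Iterable[Event], plugins: List[object]) -> int:
    """Read each event once and hand it to every plugin.

    Returns the number of events read, which is independent of the
    number of plugins attached.
    """
    n_read = 0
    for event in events:
        n_read += 1
        for plugin in plugins:
            plugin.process(event)
    return n_read
```

With three events and two plugins attached, the input is still iterated only three times; adding a plugin adds processing work per event but no extra reads.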

2021 ◽  
pp. 104453
Author(s):  
Alexiane Luc ◽  
Sébastien Le ◽  
Mathilde Philippe ◽  
El Mostafa Qannari ◽  
Evelyne Vigneau

Energies ◽  
2020 ◽  
Vol 13 (17) ◽  
pp. 4508
Author(s):  
Xin Li ◽  
Liangyuan Wang ◽  
Jemal H. Abawajy ◽  
Xiaolin Qin ◽  
Giovanni Pau ◽  
...  

Efficient big data analysis is critical to support applications and services in Internet of Things (IoT) systems, especially time-intensive services. Hence, a data center may host heterogeneous big data analysis tasks for multiple IoT systems. This is a challenging problem, since data centers usually need to schedule a large number of periodic or online tasks in a short time. In this paper, we investigate the heterogeneous task scheduling problem with the aim of reducing the global task execution time, which is also an efficient way to reduce the energy consumption of data centers. We model the execution of each type of heterogeneous task based on the data locality feature, which also indicates the relationship among the tasks, data blocks and servers. We propose a heterogeneous task scheduling algorithm with data migration. The core idea of the algorithm is to maximize efficiency by comparing the cost of remote task execution with the cost of data migration, which can improve data locality and reduce task execution time. We conduct extensive simulations, and the experimental results show that our algorithm performs better than the traditional methods and that data migration does reduce the overall task execution time. The algorithm also shows acceptable fairness for the heterogeneous tasks.
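The comparison between remote execution and data migration can be sketched as follows. The cost model here is an assumption made for illustration, not the one used in the paper: remote execution is modelled as a multiplicative slowdown on compute time, and migration as a transfer time followed by local execution. All function and parameter names are hypothetical.

```python
from typing import Tuple


def remote_cost(compute_s: float, slowdown: float) -> float:
    """Assumed cost of running the task away from its data block."""
    return compute_s * slowdown


def migration_cost(compute_s: float, block_mb: float, bandwidth_mb_s: float) -> float:
    """Assumed cost of moving the data block first, then running locally."""
    return block_mb / bandwidth_mb_s + compute_s


def plan(compute_s: float, block_mb: float,
         bandwidth_mb_s: float, slowdown: float) -> Tuple[str, float]:
    """Pick the cheaper option; ties go to remote execution (no data moved)."""
    r = remote_cost(compute_s, slowdown)
    m = migration_cost(compute_s, block_mb, bandwidth_mb_s)
    return ("migrate", m) if m < r else ("remote", r)
```

Under this toy model, a long-running task over a small block favours migration, while a short task favours remote execution because the transfer time dominates.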


2019 ◽  
Vol 1 (2) ◽  
pp. 627-645
Author(s):  
Nisa Umahmudah A ◽  
Sany Dwita ◽  
Nayang Helma Yunita

This study aims to empirically test: 1) the influence of culture on accountants' decisions, and 2) the influence of religiosity on accountants' decisions. The research is a quasi-experiment. Data were collected through questionnaires administered to 200 accounting students from two universities in Padang City and one university in Madura. Data analysis was done using two-way ANOVA. The results show that culture affects accountants' decision making, while religiosity does not. This study focuses on Javanese culture and Minangkabau culture, uses a construal-of-self approach to assess accountants' decisions, and uses accounting students as subjects to examine the influence of culture and religiosity on professional accountants' decisions.


2021 ◽  
Author(s):  
Ivan Efremov ◽  
Roman Veselovskiy

There are many programs for the analysis and visualization of paleomagnetic data, but each of them works well only in a particular use case and does not allow one to perform a full cycle of paleomagnetic operations. Therefore, one has to resort to a number of programs to complete the full path of processing paleomagnetic data. One often has to convert data from one format to another, manually vectorize charts, and generally spend more time and effort than should be necessary. Thus, there is a long overdue need for a universal program capable of fast, convenient and high-quality performance of a full cycle of paleomagnetic operations. A set of programs written by Randy Enkin (Enkin, 1996) for DOS was taken as a time-tested example of such a program. These programs were chosen because, although very outdated, they allow a full cycle of paleomagnetic operations to be performed, and do so as conveniently and efficiently as was possible at the time.

Our goal is to create a program free of all of the above disadvantages and capable of developing indefinitely as modular open-source software through the efforts of everyone interested in it.

The result of our work is PMTools, a cross-platform software for statistical analysis and visualization of paleomagnetic data. PMTools supports all widely used paleomagnetic data formats and allows you to work with them simultaneously. All charts created in PMTools are vector graphics, suitable for direct use in publications and presentations, and can be exported in both vector and raster formats. At the same time, PMTools implements a full cycle of routine paleomagnetic operations: from finding the best-fit directions to calculating the mean paleomagnetic poles. Moreover, all operations can be performed both with a mouse through a graphical user interface and with hotkeys, which significantly speeds up the data analysis process.

In the near future, PMTools will become a modular open-source application, so that each user will be able to add their own modules, thereby expanding the program's functionality.

References

Enkin, R.J., 1996. A Computer Program Package for Analysis and Presentation of Paleomagnetic Data, Pacific Geoscience Center, Geological Survey of Canada, http://www.pgc.nrcan.gc.ca/tectonic/enkin.htm.


2020 ◽  
Author(s):  
Waseem Hussain ◽  
Sankalp Bhosale ◽  
Margaret Catolos ◽  
Mahender Anumalla ◽  
Apurva Khanna ◽  
...  

Phenotypic data analysis is a key component in crop breeding, extracting meaningful insights from data to support better breeding decisions. Each year the rainfed rice breeding (RRB) program at IRRI conducts trials at the national agricultural research and extension systems (NARES) network-partner sites across South Asia, Southeast Asia and Africa. Analyzing the data from the network trials and sharing the results with the partners in the best possible format is a daunting task. It is crucial to demystify data analysis for the NARES partners so that they can make better breeding decisions. Here, we provide an overview of how the RRB program at IRRI has leveraged the computational power of R together with open-source tools such as R Markdown, plotly, LaTeX and HTML to develop a unique data analysis workflow and redesign it as a reproducible document for better interpretation, visualization and seamless sharing with partners. The generated report is a state-of-the-art implementation of the analysis workflow and presents outputs as text, tables or graphics in one unified document. The analysis is highly reproducible and can be regenerated at any time. The plots are built with enhanced dynamic and interactive visualizations to aid understanding and make it easy to extract information. Tables are highly interactive and manageable, and can be exported from the document in numerous formats. The source code and a demo data set are available for download and use at https://github.com/whussain2/Analysis-pipeline. In conclusion, the analysis workflow and document we present are not limited to IRRI's RRB program but are applicable to any organization or institute with a full-fledged breeding program.


2020 ◽  
Vol 52 (8) ◽  
pp. 1049-1066
Author(s):  
Peter Filzmoser ◽  
Mariella Gregorich

Outliers are encountered in all practical situations of data analysis, regardless of the discipline of application. However, the term outlier is not uniformly defined across all these fields, since the differentiation between regular and irregular behaviour is naturally embedded in the subject area under consideration. Generalized approaches for outlier identification have to be modified to allow the diligent search for potential outliers. Therefore, an overview of different techniques for multivariate outlier detection is presented within the scope of selected kinds of data frequently found in the field of geosciences. In particular, three common types of data in geological studies are explored: spatial, compositional and flat data. All of these formats motivate new outlier concepts, such as local outlyingness, where the spatial information of the data is used to define a neighbourhood structure. Another type is compositional data, which nicely illustrates the fact that some kinds of data require not only adaptations to standard outlier approaches, but also transformations of the data itself before conducting the outlier search. Finally, the very recently developed concept of cellwise outlyingness, typically used for high-dimensional data, allows one to identify atypical cells in a data matrix. In practice, the different data formats can be mixed, and it is demonstrated in various examples how to proceed in such situations.
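As an illustration of transforming compositional data before an outlier search, the sketch below applies the centered log-ratio (clr) transform, a widely used log-ratio transformation for compositions. The abstract does not say which transformation the authors use, so the choice of clr is an assumption, and the function name is hypothetical.

```python
import math
from typing import List, Sequence


def clr(parts: Sequence[float]) -> List[float]:
    """Centered log-ratio transform of one composition.

    Each part is replaced by its log minus the mean of the logs, so the
    result sums to zero and does not depend on the overall scale.
    All parts must be strictly positive.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

Because the transform is scale-invariant, a composition expressed as proportions and the same composition expressed as percentages map to the same coordinates, which is what makes a subsequent distance-based outlier search meaningful.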


MRS Advances ◽  
2020 ◽  
Vol 5 (29-30) ◽  
pp. 1577-1584
Author(s):  
Changwoo Do ◽  
Wei-Ren Chen ◽  
Sangkeun Lee

Small angle scattering (SAS) is a widely used technique for characterizing the structures of a wide range of materials. Because SAS is applied so broadly, there are a large number of ways to model the scattering data. While such analysis models are often available in various SAS data analysis software packages, selecting the right model to start with poses a big challenge for beginners. Here, we present machine learning (ML) methods that can assist users by suggesting scattering models for data analysis. A series of one-dimensional scattering curves has been generated using different models to train the algorithms. The performance of the ML method is studied for various types of ML algorithms, dataset resolutions, and numbers of datasets. The degree of similarity among selected scattering models is presented in terms of the confusion matrix. The scattering model suggestions, together with their prediction scores, provide a list of scattering models that are likely to succeed. Therefore, if implemented with extensive libraries of scattering models, this method can speed up the data analysis workflow by reducing the search space for appropriate scattering models.
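A confusion matrix of the kind mentioned above can be computed with a few lines of Python. This is a generic sketch, not the authors' code; the model labels used in the usage note are hypothetical placeholders.

```python
from collections import Counter
from typing import List, Sequence


def confusion_matrix(true_labels: Sequence[str],
                     predicted_labels: Sequence[str],
                     classes: Sequence[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes.

    Entry [i][j] counts samples whose true class is classes[i] and whose
    predicted class is classes[j].
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]
```

For example, with classes `["sphere", "cylinder", "core_shell"]`, a large off-diagonal entry in the "sphere" row and "cylinder" column would indicate that curves generated from the first model are often mistaken for the second, which is how similarity between models shows up in the matrix.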


2014 ◽  
Author(s):  
Dean Keiswetter ◽  
Tom Furuya

2016 ◽  
Vol 140 (4) ◽  
pp. 3408-3408
Author(s):  
Kevin Williams ◽  
Michael L. Boyd ◽  
Alexander G. Soloway ◽  
Eric I. Thorsos ◽  
Steven G. Kargl ◽  
...  
