A Reliable Large Distributed Object Store Based Platform for Collecting Event Metadata

2021 ◽  
Vol 19 (3) ◽  
Author(s):  
Álvaro Fernández Casaní ◽  
Juan M. Orduña ◽  
Javier Sánchez ◽  
Santiago González de la Hoz

Abstract: The Large Hadron Collider (LHC) is about to enter its third run at unprecedented energies. The experiments at the LHC face computational challenges with enormous data volumes that need to be analysed by thousands of physics users. The ATLAS EventIndex project, currently running in production, builds a complete catalogue of particle collisions, or events, for the ATLAS experiment at the LHC. The distributed nature of the experiment data model is exploited by running jobs at over one hundred Grid data centers worldwide. Millions of files with petabytes of data are indexed, extracting a small quantity of metadata per event, which is conveyed by a data collection system in real time to a central Hadoop instance at CERN. After a successful first implementation based on a messaging system, some issues pointed to performance bottlenecks at the higher rates expected in the next runs of the experiment. In this work we characterize the weaknesses of the previous messaging system with regard to complexity, scalability, performance and resource consumption. A new approach based on an object-based storage method was designed and implemented, taking into account the lessons learned and leveraging the ATLAS experience with this kind of system. We present the experiment that we ran for three months in the real worldwide production scenario in order to evaluate the messaging and object store approaches. The results show that the new object-based storage method can efficiently support large-scale data collection for big data environments like the next runs of the ATLAS experiment at the LHC.

Author(s):  
Masato Matsumoto ◽  
Kyle Ruske

Condition ratings of bridge components in the Federal Highway Administration (FHWA)’s Structural Inventory and Appraisal database are determined by bridge inspectors in the field, often by visual confirmation or direct-contact sounding techniques. However, the determination of bridge condition ratings is generally subjective, depending on individual inspectors’ knowledge and experience as well as on varying field conditions. Limited access, unsafe working conditions, and the negative impacts of lane closures must also be taken into account. This paper describes an alternative method for obtaining informative and diagnostic inspection data for concrete bridge decks: mobile nondestructive bridge deck evaluation technology. The technology uses high-definition infrared and visual imaging to monitor bridge conditions over long-term (or desired) intervals. This combination of instruments benefits from rapid and large-scale data acquisition capabilities. Through its implementation in Japan over the course of two decades, the technology is opening new possibilities in a field with much untapped potential. Findings and lessons learned from our experience in the states of Virginia and Pennsylvania are described as examples of highway-speed mobile nondestructive evaluation in action. To validate the accuracy of delamination detection by visual and infrared scanning, the findings were verified by physical sounding of the target deck structures.


2020 ◽  
Vol 28 (4) ◽  
pp. 469-486 ◽  
Author(s):  
Jennifer Bussell

This article offers a description and discussion of “shadowing” as a data collection and analytic tool, highlighting potential research opportunities related to the direct observation of individuals—principally political elites—in their normal daily routine for an extended period of time, often between one day and one week. In contrast with large-scale data collection methods, including surveys, shadowing enables researchers to develop detailed observations of political behavior that are not limited by the availability of administrative data or the constraints of a questionnaire or an interview guide. Unlike more in-depth qualitative methods, such as ethnography, shadowing is scalable in a manner that allows for larger sample sizes and the potential for medium-N inference. I provide a detailed account of how to design and conduct a shadowing study, including sampling strategies, techniques for coding shadowing data, and processes for drawing inferences about the behavior of shadowed subjects, drawing on examples from a completed shadowing-based study. I also discuss ways to mitigate selection and observer biases, presenting results that suggest these can be no more pronounced when shadowing political elites than in other forms of observational research.


2019 ◽  
Vol 44 (3) ◽  
pp. 472-498
Author(s):  
Huy Quan Vu ◽  
Jian Ming Luo ◽  
Gang Li ◽  
Rob Law

Understanding the differences and similarities in the activities of tourists from various cultures is important for tourism managers to develop appropriate plans and strategies that could support urban tourism marketing and management. However, tourism managers still face challenges in obtaining such understanding because the traditional approach to data collection, which relies on surveys and questionnaires, is incapable of capturing tourist activities at a large scale. In this article, we present a method for the study of tourist activities based on a new type of data, venue check-ins. The effectiveness of the presented approach is demonstrated through a case study of a major tourism country, France. Analysis based on a large-scale data set from 19 tourism cities in France reveals interesting differences and similarities in the activities of tourists from 14 markets (countries). Valuable insights are provided for various urban tourism applications.


2015 ◽  
Vol 101 (4) ◽  
pp. 392-397 ◽  
Author(s):  
Trevor Duke ◽  
Edilson Yano ◽  
Adrian Hutchinson ◽  
Ilomo Hwaihwanje ◽  
Jimmy Aipit ◽  
...  

Although the WHO recommends all countries use International Classification of Diseases (ICD)-10 coding for reporting health data, accurate health facility data are rarely available in developing or low and middle income countries. Compliance with ICD-10 is extremely resource intensive, and the lack of real data seriously undermines evidence-based approaches to improving quality of care and to clinical and public health programme management. We developed a simple tool for the collection of accurate admission and outcome data and implemented it in 16 provincial hospitals in Papua New Guinea over 6 years. The programme was low cost and easy to use by ward clerks and nurses. Over 6 years, it gathered data on the causes of 96 998 admissions of children and 7128 deaths. National reports on child morbidity and mortality were produced each year summarising the incidence and mortality rates for 21 common conditions of children and newborns, and the lessons learned for policy and practice. These data informed the National Policy and Plan for Child Health, triggered the implementation of a process of clinical quality improvement and other interventions to reduce mortality in the neediest areas, focusing on diseases with the highest burdens. It is possible to collect large-scale data on paediatric morbidity and mortality, to be used locally by health workers who gather it, and nationally for improving policy and practice, even in very resource-limited settings where ICD-10 coding systems such as those that exist in some high-income countries are not feasible or affordable.


2018 ◽  
Author(s):  
M. Jason de la Cruz ◽  
Michael W. Martynowycz ◽  
Johan Hattne ◽  
Tamir Gonen

Abstract: We developed a procedure for the cryoEM method MicroED using SerialEM. With this approach, SerialEM coordinates stage rotation, microscope operation, and camera functions for automated continuous-rotation MicroED data collection. More than 300 datasets can be collected overnight in this way, facilitating high-throughput MicroED data collection for large-scale data analyses.

