Invited Commentary: Standards, Inputs, and Outputs—Strategies for Improving Data-Sharing and Consortia-Based Epidemiologic Research

American Journal of Epidemiology ◽

10.1093/aje/kwab217 ◽

2021 ◽

Author(s):

James V Lacey, Jr. ◽

Jennifer L Benbow

Keyword(s):

Data Sharing ◽

Large Scale ◽

Data Consistency ◽

Flexible Tool ◽

Epidemiologic Research ◽

Online Tool ◽

Covariate Data ◽

Shared Data ◽

Technical Solutions ◽

Analytical Requirements

Data-sharing improves epidemiologic research, but the sharing of data frustrates epidemiologic researchers. The inefficiencies of current methods and options for data-sharing are increasingly documented and easily understood by any study group that has shared its data and any researcher who has received shared data. In this issue of the Journal, Temprosa et al. (Am J Epidemiol. XXX(XX):XXX–XXX) describe how the Consortium of Metabolomics Studies (COMETS) developed and deployed a flexible analytical platform to eliminate key pain points in large-scale metabolomics research. COMETS Analytics includes an online tool, but its cloud computing and technology are the supporting rather than the leading actors in this script. The COMETS team identified the need to standardize diverse and inconsistent metabolomics and covariate data and models across its many participating cohort studies, and then developed a flexible tool that gave its member studies choices about how they wanted to meet the consortium’s analytical requirements. Different specialties will have different specific research needs and will probably continue to use and develop an array of diverse analytical and technical solutions for their projects. COMETS Analytics shows how important, and how enabling, upstream attention to data standards and data consistency is to producing high-quality metabolomics, consortia-based, and large-scale epidemiology research.

Technical solutions envisaged in managing solids in combined sewer networks

Water Science & Technology ◽

10.2166/wst.1996.0220 ◽

1996 ◽

Vol 33 (9) ◽

pp. 237-244 ◽

Cited By ~ 5

Author(s):

Ghassan Chebbo ◽

Dominique Laplace ◽

André Bachoc ◽

Yves Sanchez ◽

Benoit Le Guennec

Keyword(s):

Large Scale ◽

Suspended Solids ◽

Drainage Area ◽

Bed Load ◽

Urban Drainage ◽

Combined Sewer ◽

Wet Weather ◽

Technical Solutions ◽

Dry Weather Flow ◽

Sewer Networks

Solids in combined sewer networks raise two important technical questions: the clogging of man-entry sewers, and pollution in urban wet weather discharges, whose main vectors are generally suspended solids. In this paper, we first present curative technical solutions that avoid or remove deposits in man-entry sewers. We discuss the partial extraction of the largest solids; the selective trapping of bed load solids, which form deposits; and the displacement of deposits using dry weather flow flushing waves. We then examine technical solutions to control pollution in urban wet weather discharges. This shows that decantation is an efficient means of fighting pollution. However, it is not always feasible because it involves large-scale investments. Complementary methods should therefore be developed and used at different points in the water's passage through an urban drainage area.

Getting Started Creating Data Dictionaries: How to Create a Shareable Data Set

Advances in Methods and Practices in Psychological Science ◽

10.1177/2515245920928007 ◽

2021 ◽

Vol 4 (1) ◽

pp. 251524592092800

Author(s):

Erin M. Buchanan ◽

Sarah E. Crain ◽

Ari L. Cunningham ◽

Hannah R. Johnson ◽

Hannah Stash ◽

...

Keyword(s):

Data Collection ◽

Data Sharing ◽

Search Engine ◽

Web Applications ◽

Data Sets ◽

Data Dictionary ◽

Data Set ◽

Entire Process ◽

Shared Data ◽

Source Data

As researchers embrace open and transparent data sharing, they will need to provide information about their data that effectively helps others understand their data sets’ contents. Without proper documentation, data stored in online repositories such as OSF will often be rendered unfindable and unreadable by other researchers and indexing search engines. Data dictionaries and codebooks provide a wealth of information about variables, data collection, and other important facets of a data set. This information, called metadata, provides key insights into how the data might be further used in research and facilitates search-engine indexing to reach a broader audience of interested parties. This Tutorial first explains terminology and standards relevant to data dictionaries and codebooks. Accompanying information on OSF presents a guided workflow of the entire process from source data (e.g., survey answers on Qualtrics) to an openly shared data set accompanied by a data dictionary or codebook that follows an agreed-upon standard. Finally, we discuss freely available Web applications to assist this process of ensuring that psychology data are findable, accessible, interoperable, and reusable.
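As a rough illustration of what a data dictionary records, the following minimal Python sketch derives one metadata entry per variable from a small CSV data set. It is not the workflow or the standard described in the Tutorial: the function name `build_data_dictionary`, the field names (`variable`, `description`, `type`, `values`, `n_missing`), and the sample data are all assumptions made for this example.

```python
import csv
import io


def build_data_dictionary(csv_text, descriptions):
    """Return one metadata record per column of a CSV data set.

    `descriptions` maps column names to human-written descriptions;
    the record layout is hypothetical, not a published standard.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    dictionary = []
    for name in reader.fieldnames or []:
        # Empty cells are counted as missing values.
        values = [row[name] for row in rows if row[name] != ""]
        try:
            # A column is numeric only if every non-missing value parses.
            numbers = [float(v) for v in values]
            var_type = "numeric"
            summary = f"{min(numbers):g} to {max(numbers):g}"
        except ValueError:
            # Non-numeric (or entirely empty) columns list their distinct values.
            var_type = "text"
            summary = ", ".join(sorted(set(values)))
        dictionary.append({
            "variable": name,
            "description": descriptions.get(name, ""),
            "type": var_type,
            "values": summary,
            "n_missing": len(rows) - len(values),
        })
    return dictionary


if __name__ == "__main__":
    sample = "id,age,condition\n1,23,control\n2,,treatment\n3,31,control\n"
    notes = {"age": "Participant age in years"}
    for entry in build_data_dictionary(sample, notes):
        print(entry)
```

Descriptions still have to be written by the researcher; only the type, observed values, and missing-value counts can be derived from the data itself.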

CHORD: Distributed Data-sharing via Hybrid ROS 1 and 2 for Multi-robot Exploration of Large-scale Complex Environments

IEEE Robotics and Automation Letters ◽

10.1109/lra.2021.3061393 ◽

2021 ◽

pp. 1-1

Author(s):

Muhammad Fadhil Ginting ◽

Kyohei Otsu ◽

Jeffrey Edlund ◽

Jay Gao ◽

Ali-akbar Agha-mohammadi

Keyword(s):

Data Sharing ◽

Large Scale ◽

Distributed Data ◽

Complex Environments ◽

Multi Robot

Toward a Faster Fault Tolerant Consensus to Maintain Data Consistency in Collaborative Environments

International Journal of Cooperative Information Systems ◽

10.1142/s0218843017500022 ◽

2017 ◽

Vol 26 (03) ◽

pp. 1750002

Author(s):

Fouad Hanna ◽

Lionel Droz-Bartholet ◽

Jean-Christophe Lapayre

Keyword(s):

Network Model ◽

Fault Tolerant ◽

Data Consistency ◽

Consensus Algorithm ◽

Simulation Platform ◽

Consensus Problem ◽

Collaborative Environments ◽

Consensus Algorithms ◽

Shared Data ◽

Simultaneous Process

The consensus problem has become a key issue in the field of collaborative telemedicine systems because of the need to guarantee the consistency of shared data. In this paper, we focus on the performance of consensus algorithms. First, we studied the best-known algorithms in the literature. Experiments on these algorithms allowed us to propose a new algorithm that enhances the performance of consensus in different situations. In 2014, we presented our initial thoughts on enhancing the performance of consensus algorithms, but the proposed solution gave very moderate results. The goal of this paper is to present a new enhanced consensus algorithm, named Fouad, Lionel and J.-Christophe (FLC). This new algorithm is built on the architecture of the Mostefaoui-Raynal (MR) consensus algorithm and integrates new features and some known techniques in order to enhance the performance of consensus in situations where process crashes are present in the system. The results from our experiments running on the simulation platform Neko show that the FLC algorithm gives the best performance when using a multicast network model in two scenarios: the first, where there are neither process crashes nor wrong suspicions, and the second, where multiple simultaneous process crashes take place in the system.

The Last Mile of M-Connected-Healthcare in the Covid Age: Data Sharing at Large Scale

2020 IEEE International Conference on Internet of Things and Intelligence System (IoTaIS) ◽

10.1109/iotais50849.2021.9359706 ◽

2021 ◽

Author(s):

Alberto Faro ◽

Daniela Giordano ◽

Mario Venticinque

Keyword(s):

Data Sharing ◽

Large Scale ◽

Last Mile

A Wait-Free Multi-word Atomic (1,N) Register for Large-Scale Data Sharing on Multi-core Machines

2017 IEEE International Conference on Cluster Computing (CLUSTER) ◽

10.1109/cluster.2017.84 ◽

2017 ◽

Cited By ~ 1

Author(s):

Mauro Ianni ◽

Alessandro Pellegrini ◽

Francesco Quaglia

Keyword(s):

Data Sharing ◽

Large Scale ◽

Large Scale Data ◽

Scale Data

Environmental Development Strategies for an Industrial Area (Kochi Industrial Belt)

Journal of Recent Activities in Architectural Sciences ◽

10.46610/joraas.2021.v06i02.003 ◽

2021 ◽

Vol 6 (2) ◽

Author(s):

Annu Reetha Thomas

Keyword(s):

Natural Environment ◽

Industrial Pollution ◽

Large Scale ◽

Human Life ◽

Industrial Area ◽

Health Issues ◽

Economic Progress ◽

Planning Approach ◽

Negative Impacts ◽

Technical Solutions

Industrial pollution is the discharge of wastes and toxic pollutants produced by industrial activities into the natural environment, which consists of air, water, and land. It has serious consequences for human life and health, along with several negative impacts on the environment and nature. In our nation, most major cities are filled with large-scale industries, which play a crucial role in the financial development of the country. The development of industries cannot be strictly hindered, as it is vital for the socio-economic progress of a country. Yet it is our duty to protect our natural environment by limiting industrial pollution. This study covers the issues that have arisen in the Eloor-Kadungalloor region as a result of industrial pollution, followed by policies for a development plan to enhance the natural and environmental conditions, with a planning approach at the micro-study level. In the Kerala context, the spot most affected by industrial pollution is the ‘Edayar Industrial belt’, the largest industrial belt in Kerala. It has become one of the most noted spots because of the continuous dumping of dangerous chemical pollutants from adjacent industries (pesticide and fertilizer manufacturing), which has also resulted in health issues for the inhabitants of the site. Though many complaints have been filed against the companies, no proper laws or schemes for reducing pollution have come up so far. Hence, this paper deals with the application of technical solutions and strategies to develop an environment improvement plan for an industrial area, as well as with the issues of the site and its inhabitants.

Vitrification in pluripotent stem cell banking: Requirements and technical solutions for large-scale biobanks

Vitrification in Assisted Reproduction ◽

10.1201/b19316-29 ◽

2015 ◽

pp. 220-241

Keyword(s):

Stem Cell ◽

Pluripotent Stem Cell ◽

Large Scale ◽

Cell Banking ◽

Stem Cell Banking ◽

Technical Solutions

Data Communities: Empowering Researcher-Driven Data Sharing in the Sciences

International Journal of Digital Curation ◽

10.2218/ijdc.v15i1.695 ◽

2020 ◽

Vol 15 (1) ◽

pp. 7

Author(s):

Rebecca Springer ◽

Danielle Cooper

Keyword(s):

Data Sharing ◽

Large Scale ◽

Data Repository ◽

Disciplinary Boundaries ◽

Success Stories ◽

Scholarly Communications ◽

Information Technologists ◽

Share Data ◽

Technological Intervention ◽

Informal Groups

There is a growing perception that science can progress more quickly, more innovatively, and more rigorously when researchers share data with each other. However, many scientists are not engaging in data sharing and remain skeptical of its relevance to their work. As organizations and initiatives designed to promote STEM data sharing multiply – within, across, and outside academic institutions – there is a pressing need to decide strategically on the best ways to move forward. In this paper, we propose a new mechanism for conceptualizing and supporting STEM research data sharing. Successful data sharing happens within data communities, formal or informal groups of scholars who share a certain type of data with each other, regardless of disciplinary boundaries. Drawing on the findings of four large-scale qualitative studies of research practices conducted by Ithaka S+R, as well as the scholarly literature, we identify what constitutes a data community and outline its most important features by studying three success stories, investigating the circumstances under which intensive data sharing is already happening. We contend that stakeholders who wish to promote data sharing – librarians, information technologists, scholarly communications professionals, and research funders, to name a few – should work to identify and empower emergent data communities. These are groups of scholars for whom a relatively straightforward technological intervention, usually the establishment of a data repository, could kickstart the growth of a more active data sharing culture. We conclude by offering recommendations for ways forward.

PsychData – Experiences from 12 Years of Research Data Archiving

Septentrio Conference Series ◽

10.7557/5.3666 ◽

2015 ◽

Author(s):

Peter Weiland ◽

Ina Dehnhard

Keyword(s):

Data Sharing ◽

Large Scale ◽

Research Data ◽

Data Reuse ◽

German Research Foundation ◽

Cross Sectional ◽

Data Archiving ◽

Wide Range ◽

Domain Specific Knowledge ◽

Meta Analyses

The benefits of making research data permanently accessible through data archives are widely recognized: costs can be reduced by reusing existing data, research results can be compared and validated with results from archived studies, fraud can be more easily detected, and meta-analyses can be conducted. Apart from that, authors may gain recognition and reputation for producing the datasets. Since 2003, the accredited research data center PsychData (part of the Leibniz Institute for Psychology Information in Trier, Germany) has documented and archived research data from all areas of psychology and related fields. In the beginning, the main focus was on datasets that provide a high potential for reuse, e.g. longitudinal studies, large-scale cross-sectional studies, or studies that were conducted during historically unique conditions. Presently, more and more journal publishers and project funding agencies require researchers to archive their data and make them accessible to the scientific community. Therefore, PsychData also has to serve this need.

In this presentation we report on our experiences in operating a discipline-specific research data archive in a domain where data sharing is met with considerable resistance. We will focus on the challenges for data sharing and data reuse in psychology, e.g.
- large amount of domain-specific knowledge necessary for data curation
- high costs for documenting the data because of a wide range of non-standardized measures
- small teams and little established infrastructure compared with the "big data" disciplines
- studies in psychology not designed for reuse (in contrast to the social sciences)
- data protection
- resistance to sharing data

At the end of the presentation, we will provide a brief outlook on DataWiz, a new project funded by the German Research Foundation (DFG). In this project, tools will be developed to support researchers in documenting their data during the research phase.
