Reproducibility and efficiency in handling complex neurophysiological data

Neuroforum ◽  
2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Michael Denker ◽  
Sonja Grün ◽  
Thomas Wachtler ◽  
Hansjörg Scherberger

Abstract Preparing a neurophysiological data set for sharing and publication is hard. Many of the available tools and services for providing a smooth data publication workflow are still maturing and not well integrated. Best practices and concrete examples of how to create a rigorous and complete package of an electrophysiology experiment are also still lacking. Given the heterogeneity of the field, such unifying guidelines and processes can only be formulated as a community effort. One of the goals of the NFDI-Neuro consortium initiative is to build such a community for systems and behavioral neuroscience. NFDI-Neuro aims to address the community's need for easier data management and to tackle these challenges in collaboration with various international initiatives (e.g., INCF, EBRAINS). This will give scientists the opportunity to spend more time analyzing the wealth of electrophysiological data they leverage, rather than dealing with data formats and data integrity.

2018 ◽  
Vol 72 (3) ◽  
pp. 332-337
Author(s):  
Deb Autor ◽  
Zena Kaufman ◽  
Ron Tetzlaff ◽  
Maryann Gribbin ◽  
Madlene Dole ◽  
...  

2021 ◽  
Author(s):  
Alice Fremand

Open data is not a new concept. Over sixty years ago, in 1959, knowledge sharing was at the heart of the Antarctic Treaty, which states in Article III(1)(c) that “scientific observations and results from Antarctica shall be exchanged and made freely available”. At a similar time, the World Data Centre (WDC) system was created to manage and distribute the data collected during the International Geophysical Year (1957-1958), led by the International Council of Science (ICSU), building the foundations of today’s research data management practices.

What about now? The WDC system still exists through the World Data System (WDS). Open data has been endorsed by a majority of funders and stakeholders. Technology has evolved dramatically. And the profession of data manager/curator has emerged; their role is far wider than the long-term curation and publication of data sets.

Data managers are involved in all stages of the data life cycle: from data management planning and data accessioning to data publication and re-use. They implement open data policies, help write data management plans, and provide advice on how to manage data during, and beyond the life of, a science project. In liaison with software developers as well as scientists, they develop new strategies to publish data, whether via data catalogues, via more sophisticated map-based viewer services, or in machine-readable form via APIs. Often, they bring expertise in the field they work in to better assist scientists in satisfying the Findable, Accessible, Interoperable and Re-usable (FAIR) principles. Recent years have seen the development of a large community of experts who are essential for sharing, discussing and setting new standards and procedures. Data are published to be re-used, and data managers are key to promoting high-quality datasets and participation in large data compilations.

To date, there is no magical formula for FAIR data. The Research Data Alliance is a great platform allowing data managers and researchers to work together to develop and adopt infrastructure that promotes data sharing and data-driven research. However, the challenge of properly describing each data set remains. Today, scientists expect more and more from their data publications or data requests: they want interactive maps, more complex data systems, and the ability to query data, combine data from different sources, and publish them rapidly. By developing new procedures and standards, and looking at new technologies, data managers help set the foundations of data science.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Sidney R. Lehky ◽  
Keiji Tanaka ◽  
Anne B. Sereno

Abstract When measuring sparseness in neural populations as an indicator of efficient coding, an implicit assumption is that each stimulus activates a different random set of neurons. In other words, population responses to different stimuli are, on average, uncorrelated. Here we examine neurophysiological data from four lobes of macaque monkey cortex, including V1, V2, MT, anterior inferotemporal cortex, lateral intraparietal cortex, the frontal eye fields, and perirhinal cortex, to determine how correlated population responses are. We call the mean correlation the pseudosparseness index, because high pseudosparseness can mimic statistical properties of sparseness without being authentically sparse. In every data set we find high levels of pseudosparseness, ranging from 0.59 to 0.98, substantially greater than the value of 0.00 for authentic sparseness. This was true for synthetic and natural stimuli, as well as for single-electrode and multielectrode data. A model indicates that a key variable producing high pseudosparseness is the standard deviation of spontaneous activity across the population. Consistently high values of pseudosparseness in the data demand reconsideration of the sparse coding literature, as well as consideration of the degree to which authentic sparseness provides a useful framework for understanding neural coding in the cortex.
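To make the metric concrete, here is a minimal NumPy sketch of a pseudosparseness-style index, computed as the mean off-diagonal correlation between population response vectors (one vector per stimulus). The function name, data layout, and averaging choices are illustrative assumptions, not the authors' published code; the synthetic example only shows how a shared baseline across stimuli pushes the index toward 1.

```python
import numpy as np

def pseudosparseness_index(responses):
    """Mean pairwise correlation between population response vectors.

    responses: array of shape (n_stimuli, n_neurons), one population
    response vector per stimulus. Returns the mean off-diagonal Pearson
    correlation (illustrative definition; see the paper for the authors'
    exact formulation).
    """
    r = np.corrcoef(responses)                     # (n_stimuli, n_stimuli) correlations
    off_diagonal = r[~np.eye(len(r), dtype=bool)]  # drop the trivial self-correlations
    return off_diagonal.mean()

# Uncorrelated random responses give an index near 0, while a shared
# per-neuron baseline (large spontaneous-activity spread) drives it toward 1.
rng = np.random.default_rng(0)
uncorrelated = rng.normal(size=(50, 200))
shared_baseline = uncorrelated + 5.0 * rng.normal(size=200)
print(pseudosparseness_index(uncorrelated))     # ~0.0
print(pseudosparseness_index(shared_baseline))  # close to 1.0
```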


2018 ◽  
Vol 2 ◽  
pp. e25317
Author(s):  
Stijn Van Hoey ◽  
Peter Desmet

The ability to communicate and assess the quality and fitness for use of data is crucial to ensure maximum utility and re-use. Data consumers have certain requirements for the data they seek and need to be able to check whether a data set conforms to these requirements. Data publishers aim to provide data of the highest possible quality and need to be able to identify potential errors that can be addressed with the information at hand. The development and adoption of data publication guidelines is one approach to defining and meeting those requirements. However, the use of a guideline, the mapping decisions, and the requirements a dataset is expected to meet are generally not communicated with the provided data. Moreover, these guidelines are typically intended for humans only. In this talk, we will present 'whip': a proposed syntax for data specifications. With whip, one can define column-based constraints for tabular (tidy) data using a number of rules, e.g. how data is structured following Darwin Core, how a term uses controlled vocabulary values, or what the expected minimum and maximum values are. These rules are human- and machine-readable, which communicates the specifications and allows them to be validated automatically in pipelines for data publication and quality assessment, such as Kurator. Whip can be formatted as a (YAML) text file that can be provided with the published data, communicating the specifications a dataset is expected to meet. The scope of these specifications can be specific to a dataset, but they can also be used to express the expected data quality and fitness for use of a publisher, consumer or community, allowing both bottom-up and top-down adoption. As such, these specifications are complementary to the core set of data quality tests currently under development by the TDWG Biodiversity Data Quality Task Group 2. Whip rules are currently generic, but more specific ones can be defined to address requirements for biodiversity information.
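Purely as an illustration of the idea of column-based constraints (controlled vocabularies, expected ranges) applied to a Darwin Core-style table, a small Python sketch might look like the following. The file name, columns, and rule keywords are assumptions for the example and are not the actual whip syntax or the Kurator tooling.

```python
import csv

# Column-based constraints in the spirit described above (illustrative only):
# controlled vocabularies and expected ranges for Darwin Core terms.
specifications = {
    "sex": {"allowed": ["male", "female", "undetermined"]},
    "basisOfRecord": {"allowed": ["HumanObservation"]},
    "individualCount": {"min": 1, "max": 100},
}

def validate(rows, specs):
    """Yield (row_number, column, message) for every violated constraint."""
    for i, row in enumerate(rows, start=1):
        for column, rules in specs.items():
            value = row.get(column, "")
            if "allowed" in rules and value not in rules["allowed"]:
                yield i, column, f"'{value}' not in controlled vocabulary"
            if value and ("min" in rules or "max" in rules):
                try:
                    number = float(value)
                except ValueError:
                    yield i, column, f"'{value}' is not a number"
                    continue
                if "min" in rules and number < rules["min"]:
                    yield i, column, f"{number} below expected minimum {rules['min']}"
                if "max" in rules and number > rules["max"]:
                    yield i, column, f"{number} above expected maximum {rules['max']}"

# Hypothetical tab-delimited Darwin Core occurrence file provided with a dataset.
with open("occurrence.txt", newline="") as f:
    for error in validate(csv.DictReader(f, delimiter="\t"), specifications):
        print(error)
```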


foresight ◽  
2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Christian Hugo Hoffmann

Purpose The purpose of this paper is to offer a panoramic view of the credibility issues that exist within social sciences research. Design/methodology/approach The central argument of this paper is that a joint effort between blockchain and other technologies, such as artificial intelligence (AI) and deep learning, can prevent scientific data manipulation or data forgery, making science more decentralized and anti-fragile without losing data integrity or reputation as a trade-off. The authors address this by proposing an online research platform for use in social and behavioral science that guarantees data integrity through a combination of modern institutional economics and blockchain technology. Findings The benefits are mainly twofold: on the one hand, social science scholars get paired with the right target audience for their studies. On the other hand, a snapshot of the gathered data is taken at the time of creation, so that researchers can later prove to peers that they used the original data set while maintaining full control of their data. Originality/value The proposed combination of behavioral economics with new technologies such as blockchain and AI is novel and is translated into a cutting-edge tool to be implemented.
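The platform itself is not specified in code in the abstract; as a minimal sketch of the underlying idea of a verifiable data snapshot, one can hash a data file at the time of creation and record the digest with a timestamp, which could later be anchored on a blockchain. The file name and record fields below are illustrative assumptions, not the authors' system.

```python
import datetime
import hashlib
import json

def snapshot(dataset_path):
    """Return a content digest plus timestamp for a data file.

    Publishing (or anchoring on a blockchain) this small record lets a
    researcher later prove that a given file is the original, unmodified
    data set, without handing over the data itself.
    """
    digest = hashlib.sha256()
    with open(dataset_path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):   # stream the file in blocks
            digest.update(chunk)
    return {
        "sha256": digest.hexdigest(),
        "created": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

record = snapshot("survey_responses.csv")   # hypothetical study data file
print(json.dumps(record, indent=2))
```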


2008 ◽  
pp. 2088-2104
Author(s):  
Qingyu Zhang ◽  
Richard S. Segall

This chapter illustrates the use of data mining as a computational intelligence methodology for forecasting data management needs. Specifically, it discusses the use of data mining with multidimensional databases to determine data management needs for selected biotechnology data: a forest cover data set (63,377 rows and 54 attributes) and a human lung cancer data set (12,600 rows of transcript sequences and 156 columns of gene types). The data mining is performed using four selected software packages: SAS® Enterprise Miner™, Megaputer PolyAnalyst® 5.0, NeuralWare Predict®, and BioDiscovery GeneSight®. The analysis and results will be used to enhance the intelligence capabilities of biotechnology research by improving data visualization and forecasting for organizations. The tools and techniques discussed here can be representative of those applicable in a typical manufacturing and production environment. Screen shots of each of the four selected software packages are presented, as are conclusions and future directions.
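The chapter's workflow relies on the four commercial packages named above; as a rough open-source analogue only (not the chapter's method), a classification run on a forest-cover-type table might look like the following scikit-learn sketch. The file name and column names are assumptions for the example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical export: rows of cartographic attributes plus a cover-type label.
data = pd.read_csv("forest_cover.csv")
X = data.drop(columns=["Cover_Type"])
y = data["Cover_Type"]

# Hold out 20% of rows to estimate how well cover type can be predicted.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```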


Author(s):  
Cynthia M. LeRouge ◽  
Bengisu Tulu ◽  
Suzanne Wood

This study investigates project initiation for telemedicine, a technology innovation in healthcare organizations that involves both intra- and inter-organizational collaboration. Moving from a telemedicine project to a sustainable telemedicine service line can be a challenge for many organizations (LeRouge, Tulu, & Forducey, 2010). Project definition (a.k.a. initiation) sets the strategic vision for a project and has been categorized as the most important stage in a project (C. Gray & Larson, 2008) and a key element for project success (Stah-Le Cardinal & Marle, 2006). Although project management best practices have been applied in many domains, few studies link published best practices to the telemedicine domain. This study first presents a model, resulting from a review of the project management literature, that specifies the recommended components of project definition. Using this model as a foundation, the authors explore how project definition is deployed in the telemedicine domain, using telestroke projects as the instantiation for this study. The authors base their findings on a multi-case qualitative data set, with each case representing a distinct telemedicine business model. Findings from this study explicate how the telestroke project initiation process is collaboratively managed and how this process impacts the overall success of telemedicine programs through the lens of five distinct telemedicine business models. Specifically, this study contributes insights on key elements of project initiation in the telemedicine context as well as the effects of the varying business models (focusing on commonalities and differences).


2020 ◽  
Vol 8 (12) ◽  
pp. 993
Author(s):  
Jonas Pinault ◽  
Denis Morichon ◽  
Volker Roeber

Accurate wave runup estimations are of great interest for coastal risk assessment and engineering design. Phase-resolving, depth-integrated numerical models offer a promising alternative to commonly used empirical formulae at relatively low computational cost. Several operational models are currently freely available and have been used extensively in recent years to compute nearshore wave transformation and runup. However, recommendations for best practices on how to correctly utilize these models in computations of runup processes are still sparse. In this work, the Boussinesq-type model BOSZ is applied to calculate runup from irregular waves on intermediate and reflective beaches. The results are compared to an extensive laboratory data set of LiDAR measurements of wave transformation and shoreline elevation oscillations. The physical processes within the surf and swash zones, such as the transfer from gravity to infragravity energy and dissipation, are accurately accounted for. In addition, time series of the shoreline oscillations are well captured by the model. Comparisons of statistical values such as R2% show relative errors of less than 6%. The sensitivity of the results to various model parameters is investigated to allow for recommendations of best practices for modeling runup with phase-resolving, depth-integrated models. While the breaking index is not found to be a key parameter for the examined cases, the grid size and the threshold depth at which the runup is computed have a significant influence on the results. The use of a time series that includes both amplitude and phase information is required for accurate modeling of swash processes: computations with different sets of random waves display high variability and substantially decrease the agreement between the experiment and the model results. The infragravity swash SIG is found to be sensitive to the initial phase distribution, likely because it is related to the short-wave envelope.
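For reference on the R2% statistic mentioned above, conventionally the runup elevation exceeded by 2% of individual runup maxima, a minimal sketch of its computation from a shoreline elevation time series could look like this. The peak-detection settings and synthetic signal are illustrative assumptions and may differ from the study's exact definition.

```python
import numpy as np
from scipy.signal import find_peaks

def runup_r2(shoreline_elevation, min_separation=1):
    """Estimate the 2% exceedance runup (R2%) from a shoreline elevation series.

    Identify individual runup maxima and return the level exceeded by only
    2% of them, i.e. the 98th percentile of the detected peaks.
    """
    peaks, _ = find_peaks(shoreline_elevation, distance=min_separation)
    return np.percentile(shoreline_elevation[peaks], 98)

# Synthetic example (not the LiDAR data from the study): 1 h of a noisy
# 12 s oscillation sampled at 2 Hz.
t = np.arange(0, 3600, 0.5)
eta = 0.5 * np.sin(2 * np.pi * t / 12) + 0.2 * np.random.randn(t.size)
print("R2% ≈", runup_r2(eta, min_separation=10))
```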


Author(s):  
Garry L. Sommer ◽  
Brad S. Smith

Enbridge Pipelines Inc. operates one of the longest and most complex pipeline systems in the world. A key aspect of the Enbridge Integrity Management Program (IMP) is the trending, analysis, and management of data collected from over 50 years of pipeline operations. This paper/presentation describes Enbridge’s challenges, learnings, processes, and innovations for meeting today’s increased data management and integration demands. While much has been written around the premise of data management and integration, and many software solutions are available on the commercial market, the greatest data management challenge for mature pipeline operators arises from the variability of data (variety of technologies, data capture methods, and data accuracy levels) collected over the operating history of the system. The ability to bring this variable data set together is by far the most difficult aspect of a coordinated data management effort and is critical to the success of any such project. Failure to do so will result in a lack of user confidence and an inability to gain “buy-in” to new data management processes. In 2001, Enbridge began a series of initiatives to enhance data management and analysis. Central to this was a commitment to accurate geospatial alignment of integrity data. This paper describes Enbridge’s experience with the development of custom software (Integrated Spatial Analysis System, ISAS), including critical learnings around (a) data alignment efforts and (b) the significant effort involved in developing an accurate pipe centreline. The paper also describes coincident data management programs that link to ISAS, including enhanced database functionality for excavation data and software to enable electronic transfer of data to this database. These tools were built to enable rapid transfer of field data and “real-time” tool validation through automated unity plots of tool defect data versus that measured in the field.
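To illustrate what an automated unity plot involves (not Enbridge’s ISAS implementation), a minimal sketch comparing in-line inspection (ILI) tool defect sizing against field measurements with a 1:1 line might look like this; the column names, file name, and tolerance band are assumptions for the example.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical export from an excavation database: tool-reported vs.
# field-measured defect depths in percent wall thickness (%WT).
data = pd.read_csv("dig_results.csv")
tool = data["ili_depth_pct_wt"]
field = data["field_depth_pct_wt"]

fig, ax = plt.subplots()
ax.scatter(tool, field, s=12, alpha=0.7)
ax.plot([0, 100], [0, 100], "k-", label="unity (1:1)")        # perfect agreement
ax.plot([0, 100], [10, 110], "k--", label="+/-10 %WT band")   # assumed tolerance band
ax.plot([0, 100], [-10, 90], "k--")
ax.set_xlabel("ILI tool depth (%WT)")
ax.set_ylabel("Field-measured depth (%WT)")
ax.legend()
plt.show()
```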

