mapMECFS: a portal to enhance data discovery across biological disciplines and collaborative sites

2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Ravi Mathur ◽  
Megan U. Carnes ◽  
Alexander Harding ◽  
Amy Moore ◽  
Ian Thomas ◽  
...  

Background: Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a debilitating disease which involves multiple body systems (e.g., immune, nervous, digestive, circulatory) and research domains (e.g., immunology, metabolomics, the gut microbiome, genomics, neurology). Despite several decades of research, there are no established ME/CFS biomarkers available to diagnose and treat ME/CFS. Sharing data and integrating findings across these domains is essential to advance understanding of this complex disease by revealing diagnostic biomarkers and facilitating discovery of novel effective therapies. Methods: The National Institutes of Health funded the development of a data sharing portal to support collaborative efforts among an initial group of three funded research centers. This was subsequently expanded to include the global ME/CFS research community. Using the open-source comprehensive knowledge archive network (CKAN) framework as the base, the ME/CFS Data Management and Coordinating Center developed an online portal with metadata collection, smart search capabilities, and domain-agnostic data integration to support data findability and reusability while reducing the barriers to sustainable data sharing. Results: We designed the mapMECFS data portal to facilitate data sharing and integration by allowing ME/CFS researchers to browse, share, compare, and download molecular datasets from within one data repository. At the time of publication, mapMECFS contains data curated from public data repositories, peer-reviewed publications, and current ME/CFS Research Network members. Conclusions: mapMECFS is a disease-specific data portal to improve data sharing and collaboration among ME/CFS researchers around the world. mapMECFS is accessible to the broader research community with registration. Further development is ongoing to include novel systems biology and data integration methods.
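
As a rough illustration of the kind of search a CKAN-based portal supports, the sketch below builds a request URL for CKAN's Action API `package_search` endpoint and reads dataset titles out of a response. The base URL, the query term, and the sample response are hypothetical placeholders and are not taken from mapMECFS; the abstract does not describe how the portal's own search is implemented.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, rows=10):
    """Build a URL for CKAN's Action API package_search endpoint."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Return the dataset titles found in a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [dataset.get("title", "") for dataset in payload["result"]["results"]]


# Hypothetical portal address and query (assumptions for illustration only).
url = build_search_url("https://portal.example.org/", "metabolomics", rows=5)

# Hand-written sample response, so the sketch runs without network access.
sample_response = json.dumps(
    {
        "success": True,
        "result": {
            "count": 1,
            "results": [{"title": "Example metabolomics dataset"}],
        },
    }
)
titles = dataset_titles(sample_response)
```

In practice the URL would be fetched with an HTTP client, and a portal that requires registration would typically also expect an API token with the request.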

2021 ◽  
Author(s):  
Ravi Mathur ◽  
Megan U. Carnes ◽  
Alexander Harding ◽  
Amy Moore ◽  
Ian Thomas ◽  
...  

Background: Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a debilitating disease which involves multiple body systems (e.g., immune, nervous, digestive, circulatory) and research domains (e.g., immunology, metabolomics, the gut microbiome, genomics, neurology). Despite several decades of research, there are no established ME/CFS biomarkers available to diagnose and treat ME/CFS. Sharing data and integrating findings across these domains is essential to advance understanding of this complex disease by revealing diagnostic biomarkers and facilitating discovery of novel effective therapies. Methods: The National Institutes of Health funded the development of a data sharing portal to support collaborative efforts among an initial group of three funded research centers. This was subsequently expanded to include the global ME/CFS research community. Using the open-source comprehensive knowledge archive network (CKAN) framework as the base, the ME/CFS Data Management and Coordinating Center developed targeted metadata collection, smart search capabilities, and domain-agnostic data integration to support data findability and reusability while reducing the barriers to sustainable data sharing. Results: We designed the mapMECFS data portal to facilitate data sharing and integration by allowing ME/CFS researchers to browse, share, compare, and download molecular datasets from within one data repository. At the time of publication, mapMECFS contains data curated from public data repositories, peer-reviewed publications, and current ME/CFS network researchers. Conclusions: mapMECFS is a disease-specific data portal to improve data sharing and collaboration among ME/CFS researchers around the world. mapMECFS is accessible to the broader research community with registration. Further development is ongoing to include novel systems biology and data integration methods.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Jennifer L. Thoegersen ◽  
Pia Borlund

Purpose: The purpose of this paper is to report a study of how research literature addresses researchers' attitudes toward data repository use. In particular, the authors are interested in how the term data sharing is defined, how data repository use is reported and whether there is need for greater clarity and specificity of terminology. Design/methodology/approach: To study how the literature addresses researcher data repository use, relevant studies were identified by searching Library Information Science and Technology Abstracts, Library and Information Science Source, Thomson Reuters' Web of Science Core Collection and Scopus. A total of 62 studies were identified for inclusion in this meta-evaluation. Findings: The study shows a need for greater clarity and consistency in the use of the term data sharing in future studies to better understand the phenomenon and allow for cross-study comparisons. Furthermore, most studies did not address data repository use specifically. In most analyzed studies, it was not possible to segregate results relating to sharing via public data repositories from other types of sharing. When sharing in public repositories was mentioned, the prevalence of repository use varied significantly. Originality/value: Researchers' data sharing is of great interest to library and information science research and practice to inform academic libraries that are implementing data services to support these researchers. This study explores how the literature approaches this issue, especially the use of data repositories, the use of which is strongly encouraged. This paper identifies the potential for additional study focused on this area.


2018 ◽  
Vol 42 (1) ◽  
pp. 124-142 ◽  
Author(s):  
Youngseek Kim ◽  
Seungahn Nah

Purpose: The purpose of this paper is to examine how data reuse experience, attitudinal beliefs, social norms, and resource factors influence internet researchers to share data with other researchers outside their teams. Design/methodology/approach: An online survey was conducted to examine the extent to which data reuse experience, attitudinal beliefs, social norms, and resource factors predicted internet researchers’ data sharing intentions and behaviors. The theorized model was tested using a structural equation modeling technique to analyze a total of 201 survey responses from the Association of Internet Researchers mailing list. Findings: Results show that data reuse experience significantly influenced participants’ perception of benefit from data sharing and participants’ norm of data sharing. Belief structures regarding data sharing, including perceived career benefit and risk, and perceived effort, had significant associations with attitude toward data sharing, leading internet researchers to have greater data sharing intentions and behavior. The results also reveal that researchers’ norms for data sharing had a direct effect on data sharing intention. Furthermore, the results indicate that, while the perceived availability of data repositories did not yield a positive impact on data sharing intention, it had a significant, direct, positive impact on researchers’ data sharing behaviors. Research limitations/implications: This study validated its novel theorized model based on the theory of planned behavior (TPB). The study showed a holistic picture of how different data sharing factors, including data reuse experience, attitudinal beliefs, social norms, and data repositories, influence internet researchers’ data sharing intentions and behaviors. Practical implications: Data reuse experience, attitude toward and norm of data sharing, and the availability of data repositories had either direct or indirect influence on internet researchers’ data sharing behaviors. Thus, professional associations, funding agencies, and academic institutions alike should promote academic cultures that value data sharing in order to create a virtuous cycle of reciprocity and encourage researchers to have positive attitudes toward/norms of data sharing; these cultures should be strengthened by the strong support of data repositories. Originality/value: In line with prior scholarship concerning scientific data sharing, this study of internet researchers offers a map of scientific data sharing intentions and behaviors by examining the impacts of data reuse experience, attitudinal beliefs, social norms, and data repositories together.


2020 ◽  
Vol 33 ◽  
pp. 01003
Author(s):  
Wouter Haak ◽  
Alberto Zigoni ◽  
Helen Kardinaal-de Mooij ◽  
Elena Zudilova-Seinstra

Institutions, funding bodies, and national research organizations are pushing for more data sharing and FAIR data. Institutions typically implement data policies, frequently supported by an institutional data repository. Funders typically mandate data sharing. So where does this leave the researcher? How can researchers benefit from doing the additional work to share their data? In order to make sure that researchers and institutions get credit for sharing their data, the data needs to be tracked and attributed first. In this paper we investigated where the research data ended up for 11 research institutions, and how this data is currently tracked and attributed. Furthermore, we also analysed the gap between the research data that is currently in institutional repositories, and where their researchers truly share their data. We found that 10 out of 11 institutions have most of their public research data hosted outside of their own institution. Combined, they have 12% of their institutional research data published in the institutional data repositories. According to our data, the typical institution had 5% of their research data (median) published in the institutional repository, but there were 4 universities for which it was 10% or higher. By combining existing data-to-article graphs with existing article-to- researcher and article-to-institution graphs it becomes possible to increase tracking of public research data and therefore the visibility of researchers sharing their data typically by 17x. The tracking algorithm that was used to perform analysis and report on potential improvements has subsequently been implemented as a standard method in the Mendeley Data Monitor product. The improvement is most likely an under-estimate because, while the recall for datasets in institutional repositories is 100%, that is not the case for datasets published outside the institutions, so there are even more datasets still to be discovered.
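
The graph combination described above can be pictured as a join on the shared article identifier: dataset-to-article links are composed with article-to-researcher links so that each dataset is attributed to the researchers behind the linked article. The following is a minimal sketch of that idea only. The function name, the edge-list representation, and the sample identifiers are assumptions for illustration, and this is not the tracking algorithm used in the paper or in Mendeley Data Monitor.

```python
from collections import defaultdict


def attribute_datasets(data_to_article, article_to_researcher):
    """Compose dataset->article and article->researcher edges.

    Both inputs are lists of (source, target) pairs. The result maps each
    researcher to the set of datasets linked to any of their articles.
    """
    # Index researchers by article so the join below is a dictionary lookup.
    researchers_by_article = defaultdict(set)
    for article, researcher in article_to_researcher:
        researchers_by_article[article].add(researcher)

    datasets_by_researcher = defaultdict(set)
    for dataset, article in data_to_article:
        for researcher in researchers_by_article.get(article, ()):
            datasets_by_researcher[researcher].add(dataset)
    return dict(datasets_by_researcher)


# Made-up identifiers, purely for illustration.
data_to_article = [("ds1", "a1"), ("ds2", "a1"), ("ds3", "a2"), ("ds4", "a9")]
article_to_researcher = [("a1", "r1"), ("a2", "r1"), ("a2", "r2")]

attribution = attribute_datasets(data_to_article, article_to_researcher)
```

Here `ds4` stays unattributed because its article `a9` has no researcher link. This mirrors the abstract's point that datasets published outside institutional repositories can still be missed.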


2021 ◽  
Author(s):  
Leslie A Lenert ◽  
Andrey V. Ilatovskiy ◽  
James Agnew ◽  
Patricia Rudsill ◽  
Jeff Jacobs ◽  
...  

Objective: The COVID-19 pandemic has enhanced the need for timely real-world data (RWD) for research. To meet this need, several large clinical consortia have developed networks for access to RWD from electronic health records (EHR), each with its own common data model (CDM) and custom pipeline for extraction, transformation, and load operations for production and incremental updating. However, the demands of COVID-19 research for timely RWD (e.g., 2-week delay) make this less feasible. Methods and Materials: We describe the use of the Fast Healthcare Interoperability Resources (FHIR) data model as a canonical model for representation of clinical data for automated transformation to the Patient-Centered Outcomes Research Network (PCORnet) and Observational Medical Outcomes Partnership (OMOP) CDMs and the near-automated production of linked clinical data repositories (CDRs) for COVID-19 research using the FHIR subscription standard. The approach was applied to healthcare data from a large academic institution and was evaluated using published quality assessment tools. Results: Six years of data (1.07M patients, 10.1M encounters, 137M laboratory results) were loaded into the FHIR CDR, producing 3 linked real-time repositories: FHIR, PCORnet, and OMOP. PCORnet and OMOP databases were refined in subsequent post-processing steps into production releases and met published quality standards. The approach greatly reduced CDM production efforts. Conclusions: FHIR and FHIR CDRs can play an important role in enhancing the availability of RWD from EHR systems. The above approach leverages 21st Century Cures Act mandated standards and could greatly enhance the availability of datasets for research.
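
To make the idea of transforming a canonical FHIR record into a CDM row concrete, the sketch below maps a minimal FHIR `Patient` resource onto a few columns of the OMOP `person` table. It is an illustrative assumption, not the authors' pipeline: the function name and sample patient are hypothetical, only a handful of fields are covered, and the gender lookup uses the standard OMOP concept IDs 8507 (male) and 8532 (female), with 0 for anything unmapped.

```python
# Standard OMOP gender concept IDs; any other value falls back to 0.
GENDER_CONCEPTS = {"male": 8507, "female": 8532}


def patient_to_person(patient, person_id):
    """Map a FHIR Patient resource (as a dict) to a partial OMOP person row."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("expected a FHIR Patient resource")

    # FHIR birthDate may be YYYY, YYYY-MM, or YYYY-MM-DD.
    birth_date = patient.get("birthDate", "")
    parts = birth_date.split("-") if birth_date else []
    year = int(parts[0]) if len(parts) >= 1 else None
    month = int(parts[1]) if len(parts) >= 2 else None
    day = int(parts[2]) if len(parts) >= 3 else None

    gender = patient.get("gender")
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_CONCEPTS.get(gender, 0),
        "year_of_birth": year,
        "month_of_birth": month,
        "day_of_birth": day,
        "person_source_value": patient.get("id"),
        "gender_source_value": gender,
    }


# Hypothetical patient with a partial birth date.
example_patient = {
    "resourceType": "Patient",
    "id": "example-1",
    "gender": "female",
    "birthDate": "1980-04",
}
person_row = patient_to_person(example_patient, person_id=1)
```

A production transformation would also need to handle race and ethnicity concepts, identifiers, and the many other resource types (encounters, laboratory results) mentioned in the abstract.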


Author(s):  
Jesus M. Gonzalez-Barahona ◽  
Daniel Izquierdo-Cortazar ◽  
Megan Squire

Empirical research on software development based on data obtained from project repositories and code forges is increasingly gaining attention in the software engineering research community. The studies in this area typically start by retrieving or monitoring some subset of data found in the repository or forge, and this data is later analyzed to find interesting patterns. However, retrieving information from these locations can be a challenging task. Meta-repositories providing public information about software development are useful tools that can simplify and streamline the research process. Public data repositories that collect and clean the data from other project repositories or code forges can help ensure that research studies are based on good quality data. This paper provides some insight as to how these meta-repositories (sometimes called a “repository of repositories”, RoR) of data about open source projects should be used to help researchers. This paper describes in detail two of the most widely used collections of data about software development: FLOSSmole and FLOSSMetrics.


2017 ◽  
Vol 35 (4) ◽  
pp. 626-649 ◽  
Author(s):  
Wei Jeng ◽  
Daqing He ◽  
Yu Chi

Purpose: Owing to the recent surge of interest in the age of the data deluge, the importance of researching data infrastructures is increasing. The open archival information system (OAIS) model has been widely adopted as a framework for creating and maintaining digital repositories. Considering that OAIS is a reference model that requires customization for actual practice, this paper aims to examine how the current practices in a data repository map to the OAIS environment and functional components. Design/methodology/approach: The authors conducted two focus-group sessions and one individual interview with eight employees at the world’s largest social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). By examining their current actions (activities regarding their work responsibilities) and IT practices, they studied the barriers and challenges of archiving and curating qualitative data at ICPSR. Findings: The authors observed that the OAIS model is robust and reliable in actual service processes for data curation and data archives. In addition, a data repository’s workflow resembles that of digital archives or even digital libraries. On the other hand, they found that the cost of preventing disclosure risk and a lack of agreement on the standards of text data files are the most apparent obstacles for data curation professionals handling qualitative data; the maturation of data metrics seems to be a promising solution to several challenges in social science data sharing. Originality/value: The authors evaluated the gap between a research data repository’s current practices and the adoption of the OAIS model. They also identified answers to questions such as how current technological infrastructure in a leading data repository such as ICPSR supports their daily operations, what the ideal technologies in those data repositories would be and the associated challenges that accompany these ideal technologies. Most importantly, they helped to prioritize challenges and barriers from the data curator’s perspective and to contribute implications of data sharing and reuse in social sciences.


2015 ◽  
Vol 24 (02) ◽  
pp. 1540008 ◽  
Author(s):  
Albert Weichselbraun ◽  
Daniel Streiff ◽  
Arno Scharl

Linking named entities to structured knowledge sources paves the way for state-of-the-art Web intelligence applications which assign sentiment to the correct entities, identify trends, and reveal relations between organizations, persons and products. For this purpose this paper introduces Recognyze, a named entity linking component that uses background knowledge obtained from linked data repositories, and outlines the process of transforming heterogeneous data silos within an organization into a linked enterprise data repository which draws upon popular linked open data vocabularies to foster interoperability with public data sets. The presented examples use comprehensive real-world data sets from Orell Füssli Business Information, Switzerland's largest business information provider. The linked data repository created from these data sets comprises more than nine million triples on companies, the companies' contact information, key people, products and brands. We identify the major challenges of tapping into such sources for named entity linking, and describe required data pre-processing techniques to use and integrate such data sets, with a special focus on disambiguation and ranking algorithms. Finally, we conduct a comprehensive evaluation based on business news from the New Journal of Zurich and AWP Financial News to illustrate how these techniques improve the performance of the Recognyze named entity linking component.


Biomolecules ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 961
Author(s):  
Paula Fernandez-Guerra ◽  
Ana C. Gonzalez-Ebsen ◽  
Susanne E. Boonen ◽  
Julie Courraud ◽  
Niels Gregersen ◽  
...  

Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a heterogeneous, debilitating, and complex disease. Along with disabling fatigue, ME/CFS presents an array of other core symptoms, including autonomic nervous system (ANS) dysfunction, sustained inflammation, altered energy metabolism, and mitochondrial dysfunction. Here, we evaluated patients' symptomatology and the mitochondrial metabolic parameters in peripheral blood mononuclear cells (PBMCs) and plasma from a clinically well-characterised cohort of six ME/CFS patients compared to age- and gender-matched controls. We performed a comprehensive cellular assessment using bioenergetics (extracellular flux analysis) and protein profiles (quantitative mass spectrometry-based proteomics) together with self-reported symptom measures of fatigue, ANS dysfunction, and overall physical and mental well-being. This ME/CFS cohort presented with severe fatigue, which correlated with the severity of ANS dysfunction and overall physical well-being. PBMCs from ME/CFS patients showed significantly lower mitochondrial coupling efficiency. They exhibited proteome alterations, including altered mitochondrial metabolism, centred on pyruvate dehydrogenase and coenzyme A metabolism, leading to a decreased capacity to provide adequate intracellular ATP levels. Overall, these results indicate that PBMCs from ME/CFS patients have a decreased ability to fulfill their cellular energy demands.

