data discovery
Recently Published Documents


TOTAL DOCUMENTS: 326 (FIVE YEARS 118)

H-INDEX: 17 (FIVE YEARS 3)

2021 ◽  
pp. e1232

Data Soup is a collaboration between the Journal of eScience Librarianship (JeSLIB) and the Data Curation Network to host a series of community-focused webinars and discussions in which data curators exchange practices for curating research data of different formats or subject areas. The lineup of the inaugural webinar includes the following speakers and topics from the recent JeSLIB Special Issue, Data Curation in Practice: Creating Guidance for Canadian Dataverse Curators: Portage Network’s Dataverse Curation Guide, by Alexandra Cooper, Michael Steeleworthy, Ève Paquette-Bigras, Erin Clary, Erin MacPherson, Louise Gillis, and Jason Brodeur, https://escholarship.umassmed.edu/jeslib/vol10/iss3/2; Active Curation of Large Longitudinal Surveys: A Case Study, by Inna Kouper, Karen L. Tucker, Kevin Tharp, Mary Ellen van Booven, and Ashley Clark, https://doi.org/10.7191/jeslib.2021.1210; Data Curation through Catalogs: A Repository-Independent Model for Data Discovery, by Helenmary Sheridan, Anthony J. Dellureficio, Melissa A. Ratajeski, Sara Mannheimer, and Terrie R. Wheeler, https://doi.org/10.7191/jeslib.2021.1203.


2021 ◽  
Author(s):  
Andrii Salnikov ◽  
Balázs Kónya

Abstract: Distributed e-Infrastructure is a key component of modern BIG Science. Service discovery in e-Science environments, such as the Worldwide LHC Computing Grid (WLCG), is a crucial functionality that relies on a service registry. In this paper we re-formulate the requirements for the service endpoint registry based on our more than 10 years of experience with many systems designed or used within the WLCG e-Infrastructure. To satisfy those requirements, the paper proposes a novel idea: to use the existing, well-established Domain Name System (DNS) infrastructure together with a suitable data model as a service endpoint registry. The presented ARC Hierarchical Endpoints Registry (ARCHERY) system consists of a minimalistic data model representing services and their endpoints within e-Infrastructures, a rendering of the data model embedded into DNS records, and a lightweight software layer for DNS-record management and client-side data discovery. Our approach for the ARCHERY registry required minimal software development and inherits all the benefits of one of the most reliable distributed information discovery sources of the internet, the DNS infrastructure. In particular, deployment, management and operation of ARCHERY rely fully on DNS. Results of ARCHERY deployment use-cases are provided together with performance analysis.
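
The idea of a hierarchical endpoint registry rendered into DNS records can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `u=`/`t=` key=value layout of the TXT strings, the `dns://` prefix used to point at a child registry node, the domain names, and the in-memory `FAKE_DNS` table that stands in for real DNS lookups.

```python
from dataclasses import dataclass

# Hypothetical stand-in for DNS TXT lookups: record name -> list of TXT strings.
# The "u=<url> t=<type>" layout is an illustrative assumption, not ARCHERY's format.
FAKE_DNS = {
    "_registry.example.org": [
        "u=dns://_site-a.example.org t=registry.group",
        "u=dns://_site-b.example.org t=registry.group",
    ],
    "_site-a.example.org": [
        "u=https://ce.site-a.example.org/submit t=compute.submission",
    ],
    "_site-b.example.org": [
        "u=https://se.site-b.example.org/data t=storage.access",
        "u=https://ce.site-b.example.org/submit t=compute.submission",
    ],
}


@dataclass(frozen=True)
class Endpoint:
    url: str
    type: str


def parse_txt(record: str) -> dict:
    """Split a 'k=v k=v' TXT string into a dict of fields."""
    return dict(field.split("=", 1) for field in record.split())


def discover(name, lookup=FAKE_DNS.get, seen=None):
    """Walk the registry tree from `name` and collect service endpoints.

    Records whose URL starts with 'dns://' are treated as references to
    child registry nodes; all other records are treated as endpoints.
    """
    seen = set() if seen is None else seen
    if name in seen:  # guard against reference loops
        return []
    seen.add(name)
    endpoints = []
    for record in lookup(name) or []:
        fields = parse_txt(record)
        url, kind = fields["u"], fields["t"]
        if url.startswith("dns://"):
            endpoints.extend(discover(url[len("dns://"):], lookup, seen))
        else:
            endpoints.append(Endpoint(url, kind))
    return endpoints


if __name__ == "__main__":
    for endpoint in discover("_registry.example.org"):
        print(endpoint.type, endpoint.url)
```

In a real deployment the `lookup` argument would be replaced by a function that queries a DNS resolver for TXT records; the client-side traversal logic would stay the same.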


2021 ◽  
Author(s):  
Sarah Bauermeister ◽  
Joshua R Bauermeister ◽  
R Bridgman ◽  
C Felici ◽  
M Newbury ◽  
...  

Abstract: Research-ready data (data curated to a defined standard) increases scientific opportunity and rigour by integrating the data environment. The development of research platforms has highlighted the value of research-ready data, particularly for multi-cohort analyses. Following user consultation, a standard data model (C-Surv), optimised for data discovery, was developed using data from 12 Dementias Platform UK (DPUK) population and clinical cohort studies. The model uses a four-tier nested structure based on 18 data themes selected according to user behaviour or technology. Standard variable naming conventions are applied to uniquely identify variables within the context of longitudinal studies. The data model was used to develop a harmonised dataset for 11 cohorts. This dataset populated the Cohort Explorer data discovery tool for assessing the feasibility of an analysis prior to making a data access request. It was concluded that developing and applying a standard data model (C-Surv) for research cohort data is feasible and useful.


Author(s):  
Arvind Singh

Health care is one of the fastest-growing areas. Health care systems hold large amounts of medical data, which can be mined from data warehouses to find important information. The sheer volume of data in health care databases calls for tools that can access and analyze the data, discover knowledge, and make informed use of the stored knowledge. A health care system holds a great deal of data on patient details, medications, etc. In this paper we study different data mining and warehousing techniques used in health care.


Cell Genomics ◽  
2021 ◽  
Vol 1 (2) ◽  
pp. 100033
Author(s):  
L. Jonathan Dursi ◽  
Zoltan Bozoky ◽  
Richard de Borja ◽  
Haoyuan Li ◽  
David Bujold ◽  
...  

2021 ◽  
Author(s):  
Kevin B Read ◽  
Heather Ganshorn ◽  
Sarah Rutley ◽  
David R. Scott

Background: As Canada increases requirements for research data management (RDM) and sharing, there is value in identifying how research data are shared, and what has been done to make them findable and reusable. This study aims to understand Canada’s data sharing landscape by reviewing how Canadian Institutes of Health Research (CIHR) funded data are shared, and comparing researchers’ data sharing practices to RDM and sharing best practices. Methods: We performed a descriptive analysis of CIHR-funded publications from PubMed and PubMed Central that were published between 1946 and Dec 31, 2019 and that indicated the research data underlying the results of the publication were shared. Each publication was analyzed to identify how and where data were shared, who shared data, and what documentation was included to support data reuse. Results: Of 4,144 CIHR-funded publications, 45.2% (n=1,876) included accessible data, 21.9% (n=909) stated data were available by request, 7.3% (n=304) stated data sharing was not applicable/possible, and we found no evidence of data sharing in 37.6% (n=1,558) of publications. Frequent data sharing methods included via a repository (n=1,549, 37.3%), within supplementary files (n=1,048, 25.2%), and by request (n=919, 22.1%). 13.1% (n=554) of publications included documentation that would facilitate data reuse. Interpretation: Our findings reveal that CIHR-funded publications largely lack the metadata, access instructions, and documentation to facilitate data discovery and reuse. Without measures to address these concerns, and enhanced support for researchers seeking to implement RDM and sharing best practices, most CIHR-funded research data will remain hidden, inaccessible, and unusable.


2021 ◽  
Author(s):  
Sarah Bauermeister ◽  
Joshua R Bauermeister ◽  
Ruth Bridgman ◽  
Caterina Felici ◽  
Mark Newbury ◽  
...  

Abstract: Research-ready data (data curated to a defined standard) increases scientific opportunity and rigour by integrating the data environment. The development of research platforms has highlighted the value of research-ready data, particularly for multi-cohort analyses. Following user consultation, a standard data model (C-Surv), optimised for data discovery, was developed using data from 12 Dementias Platform UK (DPUK) population and clinical cohort studies. The model uses a four-tier nested structure based on 18 data themes selected according to user behaviour or technology. Standard variable naming conventions are applied to uniquely identify variables within the context of longitudinal studies. The data model was used to develop a harmonised dataset for 11 cohorts. This dataset populated the Cohort Explorer data discovery tool for assessing the feasibility of an analysis prior to making a data access request. It was concluded that developing and applying a standard data model (C-Surv) for research cohort data is feasible and useful.


2021 ◽  
Vol 58 (1) ◽  
pp. 610-612
Author(s):  
Ying‐Hsang Liu ◽  
Hsin‐Liang (Oliver) Chen ◽  
Makoto P. Kato ◽  
Mingfang Wu ◽  
Kathleen Gregory
