data repositories Latest Research Papers

Key Factors Affecting User Adoption of Open-Access Data Repositories in Intelligence and Security Informatics: An Affordance Perspective

ACM Transactions on Management Information Systems ◽ 10.1145/3460823 ◽ 2022 ◽ Vol 13 (1) ◽ pp. 1-24

Author(s): Bo Wen ◽ Paul Jen-Hwa Hu ◽ Mohammadreza Ebrahimi ◽ Hsinchun Chen

Keyword(s): Open Access ◽ Factor Model ◽ Ease Of Use ◽ Perceived Usefulness ◽ Survey Study ◽ Data Repositories ◽ Factors Affecting ◽ Open Access Data ◽ Security Informatics ◽ Access Data

Rich, diverse cybersecurity data are critical for efforts by the intelligence and security informatics (ISI) community. Although open-access data repositories (OADRs) provide tremendous benefits for ISI researchers and practitioners, determinants of their adoption remain understudied. Drawing on affordance theory and extant ISI literature, this study proposes a factor model to explain how the essential and unique affordances of an OADR (i.e., relevance, accessibility, and integration) affect individual professionals' intentions to use and collaborate with AZSecure, a major OADR. A survey study designed to test the model and hypotheses reveals that the effects of affordances on ISI professionals' intentions to use and collaborate are mediated by perceived usefulness and ease of use, which then jointly determine their perceived value. This study advances ISI research by specifying three important affordances of OADRs; it also contributes to extant technology adoption literature by scrutinizing and affirming the interplay of essential user acceptance and value perceptions to explain ISI professionals' adoptions of OADRs.

SSM: A Semantic Metasearch Platform for Scientific Data retrieval

Revista de Informática Teórica e Aplicada ◽ 10.22456/2175-2745.119164 ◽ 2022 ◽ Vol 29 (1) ◽ pp. 91-101

Author(s): Gustavo Caetano Borges ◽ Julio Cesar Dos Reis ◽ Claudia Bauzer Medeiros

Keyword(s): Scientific Research ◽ Data Retrieval ◽ Scientific Data ◽ Research Data ◽ Use Case ◽ Data Repositories ◽ Metadata Standards ◽ Public Repositories

Scientific research in all fields has advanced in complexity and in the amount of data generated. The heterogeneity of data repositories, data meaning and their metadata standards makes this problem even more significant. In spite of several proposals to find and retrieve research data from public repositories, there is still a need for more comprehensive retrieval solutions. In this article, we specify and develop a mechanism to search for scientific data that takes advantage of metadata records and semantic methods. We present the conception of our architecture and how we have implemented it in a use case in the agriculture domain.
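The idea of searching heterogeneous metadata records with semantic help can be illustrated with a minimal sketch. This is not the SSM architecture: the two repositories, their records, the field mapping, and the synonym table below are all hypothetical stand-ins chosen only to show the general pattern of mapping differing metadata fields to a common schema and then expanding a query term with related terms.

```python
# Hypothetical metadata records from two repositories that use different field names.
REPO_A = [{"title": "Soil moisture in maize fields", "subject": "agriculture"}]
REPO_B = [{"dc:title": "Corn yield measurements 2019", "dc:subject": "crop science"}]

# Assumed mapping from each repository's field names to one common schema.
FIELD_MAP = {
    "title": "title",
    "dc:title": "title",
    "subject": "subject",
    "dc:subject": "subject",
}

# Toy synonym table standing in for a real semantic resource such as an ontology.
SYNONYMS = {"maize": {"maize", "corn"}, "corn": {"corn", "maize"}}


def normalise(record):
    """Rename known fields to the common schema and drop unknown ones."""
    return {FIELD_MAP[key]: value for key, value in record.items() if key in FIELD_MAP}


def expand(term):
    """Return the query term plus any synonyms listed in the toy table."""
    return SYNONYMS.get(term.lower(), {term.lower()})


def search(term, repositories):
    """Return titles of records whose metadata contains the term or a synonym."""
    terms = expand(term)
    hits = []
    for records in repositories:
        for record in records:
            common = normalise(record)
            words = " ".join(common.values()).lower().split()
            if any(t in words for t in terms):
                hits.append(common["title"])
    return hits


results = search("maize", [REPO_A, REPO_B])
```

With these toy inputs, a query for "maize" also retrieves the record that only mentions "corn", which a plain keyword match over the raw records would miss.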

SRA and Genbank BioSample-Linked Submission with Mercury_Prep and Mercury_Batch v2

10.17504/protocols.io.b3jaqkie ◽ 2022

Author(s): Francis J Ambrosio

Keyword(s): Public Health ◽ Sequencing Data ◽ Public Health Community ◽ Data Repositories ◽ The Public ◽ Detailed Procedure ◽ Health Community ◽ Public Data

Submitting sequencing data to public data repositories is a meaningful yet tedious procedure. Linking submissions between SRA and Genbank will enhance the value of both submissions to the public health community. The Mercury protocols offered by Theiagen Genomics allow users to efficiently and accurately produce all required inputs for SRA and Genbank submissions (the Mercury workflows also allow for GISAID submission, but that will not be covered in this protocol). This protocol provides a detailed procedure for submitting BioSample-linked sequencing data to SRA and Genbank.

Precision Medicine Landscape of Genomic Testing for Patients With Cancer in the National Institutes of Health All of Us Database Using Informatics Approaches

JCO Clinical Cancer Informatics ◽ 10.1200/cci.21.00152 ◽ 2022

Author(s): Jay G. Ronquillo ◽ William T. Lester

Keyword(s): Precision Medicine ◽ Race And Ethnicity ◽ Gene Mutations ◽ National Institutes Of Health ◽ Genomic Testing ◽ Biomedical Data ◽ Data Repositories ◽ Cancer Data ◽ Patients With Cancer ◽ Significant Difference

PURPOSE The rapid growth of biomedical data ecosystems has catalyzed research for oncology and precision medicine. We leverage federal cloud-based precision medicine databases and tools to better understand the current landscape of precision medicine and genomic testing for patients with cancer. METHODS Retrospective observational study of genomic testing for patients with cancer in the National Institutes of Health All of Us Research Program, with the cancer cohort defined as having at least two documented or reported cancer diagnoses. RESULTS There were 5,678 (1.8%) All of Us participants in the cancer cohort, with a significant difference between cancer status by age category, sex, race, and ethnicity ( P < .001 for all). There were 295 (5.2%) patients with cancer who received genomic testing compared with 6,734 (2.2%) of noncancer patients, with 752 genomic tests commonly focused on gene mutations (primarily pharmacogenomics), molecular pathology, or clinical cytogenetic reports. CONCLUSION Although not yet ubiquitous, diverse clinical genomic analyses in oncology can set the stage to grow the practice of precision medicine by integrating research patient data repositories, cancer data ecosystems, and biomedical informatics.
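The reported testing rate for the cancer cohort can be checked directly from the counts in the abstract. The short sketch below recomputes it and also forms a crude ratio of the two reported percentages; that ratio is not stated in the abstract and is only an unadjusted, descriptive illustration. The variable names are hypothetical.

```python
# Counts taken from the abstract above.
cancer_cohort = 5678   # participants in the cancer cohort
cancer_tested = 295    # cancer-cohort participants who received genomic testing

# Recompute the testing rate for the cancer cohort, in percent.
cancer_rate = round(100 * cancer_tested / cancer_cohort, 1)

# The noncancer rate is used as reported (2.2%), because the abstract does not
# give the exact noncancer denominator.
reported_noncancer_rate = 2.2

# Crude ratio of the two reported percentages (an illustration, not a study result).
rate_ratio = round(cancer_rate / reported_noncancer_rate, 2)

print(f"Cancer cohort testing rate: {cancer_rate}%")
print(f"Crude ratio versus noncancer participants: {rate_ratio}")
```

The recomputed rate matches the 5.2% given in the abstract, and the crude ratio suggests genomic testing was recorded a little over twice as often in the cancer cohort.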

Model, Integrate, Search... Repeat: A Sound Approach to Building Integrated Repositories of Genomic Data

Special Topics in Information Technology - SpringerBriefs in Applied Sciences and Technology ◽ 10.1007/978-3-030-85918-3_8 ◽ 2022 ◽ pp. 89-99

Author(s): Anna Bernasconi

Keyword(s): Genomic Data ◽ Transformation Process ◽ Data Sources ◽ Data Repositories ◽ Genomic Data Integration ◽ Data Formats ◽ Public Data ◽ The Common ◽ Human Genomic ◽ User Friendly

A wealth of public data repositories is available to drive genomics and clinical research. However, there is no agreement among the various data formats and models; in the common practice, data sources are accessed one by one, learning their specific descriptions with tedious efforts. In this context, the integration of genomic data and of their describing metadata becomes, at the same time, an important, difficult, and well-recognized challenge. In this chapter, after overviewing the most important human genomic data players, we propose a conceptual model of metadata and an extended architecture for integrating datasets, retrieved from a variety of data sources, based upon a structured transformation process; we then describe a user-friendly search system providing access to the resulting consolidated repository, enriched by a multi-ontology knowledge base. Inspired by our work on genomic data integration, during the COVID-19 pandemic outbreak we successfully re-applied the previously proposed model-build-search paradigm, building on the analogies among the human and viral genomics domains. The availability of conceptual models, related databases, and search systems for both humans and viruses will provide important opportunities for research, especially if virus data will be connected to its host, provider of genomic and phenotype information.

Future Trends in Data Science

International Journal of Advanced Research in Science, Communication and Technology ◽ 10.48175/ijarsct-2199 ◽ 2021 ◽ pp. 364-372

Author(s): Yaasmin Attarwala ◽ Sakshi Baid

Keyword(s): Data Mining ◽ Big Data ◽ Business Intelligence ◽ Data Science ◽ Future Trends ◽ Hidden Information ◽ Data Repositories ◽ Huge Data ◽ The Future

As technology has progressed, businesses and organizations have collected an enormous volume of information from digital users, forming the huge data repositories commonly known as big data. Data mining is a tool for extracting hidden information from these vast databases to identify unique patterns and rules. This paper describes the importance of big data today and its characteristics, the role data mining plays in big data and why it is a necessity, the data mining process and the functionalities it performs, and data mining techniques such as classification and clustering that help find the patterns used to decide on future business trends, along with their applications in various fields. The paper also discusses the important role of data mining in Business Intelligence (BI) and various industries in identifying unique patterns and obtaining results from the data. The second half of the paper explores the challenges faced in big data and the tools used, the applications and upcoming trends in data science, and, lastly, the scope and importance of data science in the future.

Research data management and services: Resources for different data practitioners

IASSIST Quarterly ◽ 10.29173/iq995 ◽ 2021 ◽ Vol 45 (3-4)

Author(s): Gilbert Mushi

Keyword(s): Data Management ◽ Developed Countries ◽ Management Plan ◽ Research Data ◽ Data Repository ◽ Data Repositories ◽ Research Libraries ◽ Research Data Management ◽ Metadata Standards ◽ Training Resources

The emergence of data-driven research and demands for the establishment of Research Data Management (RDM) has created interest in academic institutions and research organizations globally. Some libraries, especially in developed countries, have started offering RDM services to their communities. Although lagging behind, some academic libraries in developing countries are at the stage of planning or implementing the service. However, the level of RDM awareness is very low among researchers, librarians and other data practitioners. The objective of this paper is to present available open resources for different data practitioners, particularly researchers and librarians. It includes training resources for both researchers and librarians, a Data Management Plan (DMP) tool for researchers, and data repositories available for researchers to freely archive and share their research data with the local and international communities. A case study with a survey was conducted at the University of Dodoma to identify relevant RDM services so that librarians could assist researchers in making their data accessible to the local and international community. The study findings revealed a low level of RDM awareness among researchers and librarians. Over 50% of the respondents rated their perceived knowledge as poor in the following RDM knowledge areas: DMP, data repository, long-term digital preservation, funders' RDM mandates, metadata standards describing data, and general awareness of RDM. Therefore, this paper presents available open resources for different data practitioners to improve RDM knowledge and boost the confidence of academic and research libraries in establishing the service.

Potential opportunities and risks of sharing agricultural research data in Tanzania

IASSIST Quarterly ◽ 10.29173/iq997 ◽ 2021 ◽ Vol 45 (3-4)

Author(s): Anajoyce Samuel Katabalwa ◽ Jo Bates ◽ Pamela Abbott

Keyword(s): Data Management ◽ Data Sharing ◽ Agricultural Research ◽ Research Output ◽ Open Data ◽ Research Literature ◽ Research Data ◽ Data Repositories ◽ The Government ◽ Support Research

Purpose: The purpose of this paper was to examine the potential opportunities and risks of sharing agricultural research data in Tanzania identified in the existing research literature. Design/methodology/approach: The study involved a review of the literature on research data sharing practices. Findings: The findings indicate that research data sharing has significant benefits for researchers, such as increasing research impact; enhancing international collaboration among researchers with the same interests; improving scientific transparency and accuracy of data (Rappert and Bezuidenhout, 2016); increasing research output, whereby a single dataset can be used to generate more than one article by different authors; and many more. The risks hampering data sharing practices include researchers’ fears that data will be scooped, poached or misused (Onyancha, 2016); unreliable electric power; lack of funds to support research data sharing activities; absence of institutional governmental support for data management; perceived lack of evidence of benefits (Leonelli, Rappert and Bezuidenhout, 2018); and others. However, research data sharing is relatively new in Tanzania, and no governmental agency mandates or encourages it; consequently, there is no research data management, no open research data repository and no research data sharing policy at any agricultural institution in Tanzania. The study recommends that agricultural researchers be sensitized to share their data, and that research data policies and data repositories be established to support data sharing practices in Tanzania. Originality and usefulness: From the available literature, this is the first effort to examine the potential opportunities and risks of sharing agricultural research data in Tanzania. The study could be used by agricultural institutions and other institutions to assess researchers’ needs in supporting research data sharing. It can also be used by the government and institutions to see the need for establishing open data repositories and open data policies to support research data sharing.

Crosswalk among Prominent Open Research Data Repositories

Webology ◽ 10.14704/web/v18i2/web18307 ◽ 2021 ◽ Vol 18 (2) ◽ pp. 60-67

Author(s): Dr. M. Krishnamurthy ◽ Dr. Bhalachandra S. Deshpande ◽ Dr. C. Sajana

Keyword(s): Open Access ◽ Open Data ◽ Research Data ◽ Data Repositories ◽ Digital Repositories ◽ Open Research ◽ Global Movement ◽ Access To Knowledge ◽ Metadata Harvesting ◽ Data Upload

Open Access is a synergised global movement using the Internet to provide equal access to knowledge that was once hidden behind subscription paywalls. Many new models for scholarly communication have emerged in the recent past. One among them is institutional or digital repositories, which archive the scholarly content of an organization. The concept of Open Access opened a new arena for institutional or digital repositories in the form of Open repositories. Likewise, the Open repositories for Research Data Management (RDM) are initiatives to organize, store, cite, preserve, and share the collected data derived from research. There are many multidisciplinary and subject-specific open repositories for RDM offering exquisite features for perpetual management of research data. The objective of the present study is to evaluate features of popular Open Data Repositories: Zenodo, FigShare, Harvard Dataverse and Mendeley Data. The evaluation provided insights into the key features of the selected Open Data Repositories, which enable us to select the best among them. Zenodo provides the maximum data upload limit, while the major features required by a researcher, such as DOI, file types, citation support, licenses, and search (metadata harvesting), are provided by all three repositories.

Systems Biology–Derived Genetic Signatures of Mastitis in Dairy Cattle: A New Avenue for Drug Repurposing

Animals ◽ 10.3390/ani12010029 ◽ 2021 ◽ Vol 12 (1) ◽ pp. 29

Author(s): Somayeh Sharifi ◽ Maryam Lotfi Shahreza ◽ Abbas Pakdel ◽ James M. Reecy ◽ Nasser Ghadiri ◽ ...

Keyword(s): Dairy Cattle ◽ Clinical Signs ◽ Drug Repurposing ◽ Severe Infection ◽ Label Propagation ◽ Global Network ◽ Biological Information ◽ Online Data ◽ Data Repositories ◽ E Coli

Mastitis, a disease with high incidence worldwide, is the most prevalent and costly disease in the dairy industry. Gram-negative bacteria such as Escherichia coli (E. coli) are assumed to be among the leading agents causing acute severe infection with clinical signs. E. coli, environmental mastitis pathogens, are the primary etiological agents of bovine mastitis in well-managed dairy farms. Response to E. coli infection has a complex pattern affected by genetic and environmental parameters. On the other hand, the efficacy of antibiotics and/or anti-inflammatory treatment in E. coli mastitis is still a topic of scientific debate, and studies on the treatment of clinical cases show conflicting results. Unraveling the bio-signature of mastitis in dairy cattle can open new avenues for drug repurposing. In the current research, a novel, semi-supervised heterogeneous label propagation algorithm named Heter-LP, which applies both local and global network features for data integration, was used to potentially identify novel therapeutic avenues for the treatment of E. coli mastitis. Online data repositories relevant to known diseases, drugs, and gene targets, along with other specialized biological information for E. coli mastitis, including critical genes with robust bio-signatures, drugs, and related disorders, were used as input data for analysis with the Heter-LP algorithm. Our research identified novel drugs such as Glibenclamide, Ipratropium, Salbutamol, and Carbidopa as possible therapeutics that could be used against E. coli mastitis. Predicted relationships can be used by pharmaceutical scientists or veterinarians to find commercially efficacious medicines or a combination of two or more active compounds to treat this infectious disease.
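For readers unfamiliar with label propagation, the general idea can be sketched on a toy network. The code below is not Heter-LP: it is a plain, homogeneous, single-network propagation scheme, and the four-node graph, the seed vector, and the parameters `alpha` and `iterations` are assumptions made only for illustration. Each node repeatedly takes a weighted average of its neighbours' scores and its own initial seed score, so nodes close to a known seed end up with higher scores.

```python
def propagate_labels(adjacency, seed_scores, alpha=0.5, iterations=50):
    """Generic label propagation: mix neighbour scores with the initial seeds."""
    n = len(adjacency)

    # Row-normalise the adjacency matrix so each node averages over its neighbours.
    normalised = []
    for row in adjacency:
        total = sum(row)
        normalised.append([w / total if total else 0.0 for w in row])

    scores = list(seed_scores)
    for _ in range(iterations):
        scores = [
            alpha * sum(normalised[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * seed_scores[i]
            for i in range(n)
        ]
    return scores


# Hypothetical path-shaped network: node 0 - node 1 - node 2 - node 3.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Only node 0 carries a known label (for example, a known association).
seed_scores = [1.0, 0.0, 0.0, 0.0]

scores = propagate_labels(adjacency, seed_scores)
```

In this toy run the scores fall off with distance from the seed node, which is the behaviour that lets such methods rank unlabelled candidates by their proximity to known ones in a network.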
