data repositories
Recently Published Documents





2022 ◽  
Vol 13 (1) ◽  
pp. 1-24
Bo Wen ◽  
Paul Jen-Hwa Hu ◽  
Mohammadreza Ebrahimi ◽  
Hsinchun Chen

Rich, diverse cybersecurity data are critical for efforts by the intelligence and security informatics (ISI) community. Although open-access data repositories (OADRs) provide tremendous benefits for ISI researchers and practitioners, determinants of their adoption remain understudied. Drawing on affordance theory and extant ISI literature, this study proposes a factor model to explain how the essential and unique affordances of an OADR (i.e., relevance, accessibility, and integration) affect individual professionals' intentions to use and collaborate with AZSecure, a major OADR. A survey study designed to test the model and hypotheses reveals that the effects of affordances on ISI professionals' intentions to use and collaborate are mediated by perceived usefulness and ease of use, which then jointly determine their perceived value. This study advances ISI research by specifying three important affordances of OADRs; it also contributes to extant technology adoption literature by scrutinizing and affirming the interplay of essential user acceptance and value perceptions to explain ISI professionals' adoption of OADRs.

2022 ◽  
Vol 29 (1) ◽  
pp. 91-101
Gustavo Caetano Borges ◽  
Julio Cesar Dos Reis ◽  
Claudia Bauzer Medeiros

Scientific research in all fields has advanced in complexity and in the amount of data generated. The heterogeneity of data repositories, data meaning and their metadata standards makes this problem even more significant. In spite of several proposals to find and retrieve research data from public repositories, there is still a need for more comprehensive retrieval solutions. In this article, we specify and develop a mechanism to search for scientific data that takes advantage of metadata records and semantic methods. We present the conception of our architecture and how we have implemented it in a use case in the agriculture domain.

2022 ◽  
Francis J Ambrosio

Submitting sequencing data to public data repositories is a meaningful yet tedious procedure. Linking submissions between SRA and GenBank will enhance the value of both submissions to the public health community. The Mercury protocols offered by Theiagen Genomics allow users to efficiently and accurately produce all required inputs for SRA and GenBank submissions (the Mercury workflows also allow for GISAID submission, but that will not be covered in this protocol). This protocol provides a detailed procedure for submitting BioSample-linked sequencing data to SRA and GenBank.

Anna Bernasconi

A wealth of public data repositories is available to drive genomics and clinical research. However, there is no agreement among the various data formats and models; in common practice, data sources are accessed one by one, and their specific descriptions are learned with tedious effort. In this context, the integration of genomic data and of their describing metadata becomes, at the same time, an important, difficult, and well-recognized challenge. In this chapter, after overviewing the most important human genomic data players, we propose a conceptual model of metadata and an extended architecture for integrating datasets, retrieved from a variety of data sources, based upon a structured transformation process; we then describe a user-friendly search system providing access to the resulting consolidated repository, enriched by a multi-ontology knowledge base. Inspired by our work on genomic data integration, during the COVID-19 pandemic outbreak we successfully re-applied the previously proposed model-build-search paradigm, building on the analogies between the human and viral genomics domains. The availability of conceptual models, related databases, and search systems for both humans and viruses will provide important opportunities for research, especially if virus data are connected to their host, the provider of genomic and phenotype information.

Jay G. Ronquillo ◽  
William T. Lester

PURPOSE The rapid growth of biomedical data ecosystems has catalyzed research for oncology and precision medicine. We leverage federal cloud-based precision medicine databases and tools to better understand the current landscape of precision medicine and genomic testing for patients with cancer. METHODS Retrospective observational study of genomic testing for patients with cancer in the National Institutes of Health All of Us Research Program, with the cancer cohort defined as having at least two documented or reported cancer diagnoses. RESULTS There were 5,678 (1.8%) All of Us participants in the cancer cohort, with a significant difference in cancer status by age category, sex, race, and ethnicity (P < .001 for all). There were 295 (5.2%) patients with cancer who received genomic testing compared with 6,734 (2.2%) of noncancer patients, with 752 genomic tests commonly focused on gene mutations (primarily pharmacogenomics), molecular pathology, or clinical cytogenetic reports. CONCLUSION Although not yet ubiquitous, diverse clinical genomic analyses in oncology can set the stage to grow the practice of precision medicine by integrating research patient data repositories, cancer data ecosystems, and biomedical informatics.

Yaasmin Attarwala ◽  
Sakshi Baid

As technology has progressed, the enormous volume of information that businesses and organizations collect from digital users has led to the formation of huge data repositories, commonly known as Big Data. Data mining is a tool used for extracting hidden information from these vast databases to identify unique patterns and rules. The present paper describes in detail the importance of big data today; its characteristics; the role data mining plays in big data and why it is a necessity; the process of data mining and the functionalities it performs; and data mining techniques, such as classification and clustering, that help in finding patterns to decide upon future trends in businesses, along with their applications in various fields. The paper also discusses the important role of data mining in Business Intelligence (BI) and various industries in identifying unique patterns and obtaining results from the data. The second half of the paper explores the challenges faced in big data and the tools used, the applications and upcoming trends in data science, and lastly, the scope and importance of data science in the future.

2021 ◽  
Vol 45 (3-4) ◽  
Gilbert Mushi

The emergence of data-driven research and demands for the establishment of Research Data Management (RDM) have created interest in academic institutions and research organizations globally. Some libraries, especially in developed countries, have started offering RDM services to their communities. Although lagging behind, some academic libraries in developing countries are at the stage of planning or implementing the service. However, the level of RDM awareness is very low among researchers, librarians and other data practitioners. The objective of this paper is to present available open resources for different data practitioners, particularly researchers and librarians. These include training resources for both researchers and librarians; a Data Management Plan (DMP) tool for researchers; and data repositories in which researchers can freely archive and share their research data with the local and international communities. A case study with a survey was conducted at the University of Dodoma to identify relevant RDM services so that librarians could assist researchers in making their data accessible to the local and international community. The study findings revealed a low level of RDM awareness among researchers and librarians. Over 50% of the respondents rated their perceived knowledge as poor in the following RDM knowledge areas: DMPs, data repositories, long-term digital preservation, funders' RDM mandates, metadata standards for describing data, and general awareness of RDM. Therefore, this paper presents available open resources for different data practitioners to improve RDM knowledge and boost the confidence of academic and research libraries in establishing the service.

2021 ◽  
Vol 45 (3-4) ◽  
Anajoyce Samuel Katabalwa ◽  
Jo Bates ◽  
Pamela Abbott

Purpose: The purpose of this paper was to examine the potential opportunities and risks of sharing agricultural research data in Tanzania, as identified in the existing research literature. Design/methodology/approach: The study involved a review of the literature on research data sharing practices. Findings: The findings indicate that research data sharing has significant benefits for researchers, such as increasing research impact; enhancing international collaboration among researchers with the same interests; improving scientific transparency and the accuracy of data (Rappert and Bezuidenhout, 2016); increasing research output, whereby a single dataset can be used to generate more than one article by different authors; and many more. The risks hampering data sharing practices include researchers’ fears that data will be scooped, poached or misused (Onyancha, 2016); unreliable electric power; lack of funds to support research data sharing activities; absence of institutional and governmental support for data management; a perceived lack of evidence of benefits (Leonelli, Rappert and Bezuidenhout, 2018); and others. However, research data sharing is relatively new in Tanzania; thus, no governmental agency mandates or encourages research data sharing, and there is no research data management, no open research data repository and no research data sharing policy at any agricultural institution in Tanzania. The study recommends that agricultural researchers be sensitized to share their data, and that research data policies and data repositories be established to support data sharing practices in Tanzania. Originality and usefulness: From the available literature, this is the first time that an effort has been made to examine the potential opportunities and risks of sharing agricultural research data in Tanzania. The study could be used by agricultural institutions and other institutions to assess researchers’ needs in supporting research data sharing. It can also be used by the government and institutions to see the need for establishing open data repositories and open data policies to support research data sharing.

Webology ◽  
2021 ◽  
Vol 18 (2) ◽  
pp. 60-67
Dr. M. Krishnamurthy ◽  
Dr. Bhalachandra S. Deshpande ◽  
Dr. C. Sajana

Open Access is a synergised global movement that uses the Internet to provide equal access to knowledge that was once hidden behind subscription paywalls. Many new models for scholarly communication have emerged in the recent past. One of them is the institutional or digital repository, which archives the scholarly content of an organization. The concept of Open Access opened a new arena for institutional or digital repositories in the form of Open repositories. Likewise, Open repositories for Research Data Management (RDM) are an initiative to organize, store, cite, preserve, and share the data collected in the course of research. There are many multidisciplinary and subject-specific open repositories for RDM offering exquisite features for perpetual management of research data. The objective of the present study is to evaluate the features of popular Open Data Repositories: Zenodo, FigShare, Harvard Dataverse and Mendeley Data. The evaluation provided insights into the key features of the selected Open Data Repositories, which enable us to select the best among them. Zenodo provides the maximum data upload limit, while the major features required by a researcher, such as DOI, file types, citation support, licenses, and search (metadata harvesting), are provided by all three repositories.

Animals ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 29
Somayeh Sharifi ◽  
Maryam Lotfi Shahreza ◽  
Abbas Pakdel ◽  
James M. Reecy ◽  
Nasser Ghadiri ◽  

Mastitis, a disease with high incidence worldwide, is the most prevalent and costly disease in the dairy industry. Gram-negative bacteria such as Escherichia coli (E. coli) are assumed to be among the leading agents causing acute severe infection with clinical signs. E. coli, an environmental mastitis pathogen, is the primary etiological agent of bovine mastitis in well-managed dairy farms. Response to E. coli infection has a complex pattern affected by genetic and environmental parameters. On the other hand, the efficacy of antibiotics and/or anti-inflammatory treatment in E. coli mastitis is still a topic of scientific debate, and studies on the treatment of clinical cases show conflicting results. Unraveling the bio-signature of mastitis in dairy cattle can open new avenues for drug repurposing. In the current research, a novel, semi-supervised heterogeneous label propagation algorithm named Heter-LP, which applies both local and global network features for data integration, was used to potentially identify novel therapeutic avenues for the treatment of E. coli mastitis. Online data repositories relevant to known diseases, drugs, and gene targets, along with other specialized biological information for E. coli mastitis, including critical genes with robust bio-signatures, drugs, and related disorders, were used as input data for analysis with the Heter-LP algorithm. Our research identified novel drugs such as Glibenclamide, Ipratropium, Salbutamol, and Carbidopa as possible therapeutics that could be used against E. coli mastitis. Predicted relationships can be used by pharmaceutical scientists or veterinarians to find commercially efficacious medicines or a combination of two or more active compounds to treat this infectious disease.
