The systematic assessment of completeness of public metadata accompanying omics studies

2021 ◽  
Author(s):  
Yu-Ning Huang ◽  
Anushka Rajesh ◽  
Ram Ayyala ◽  
Aditya Sarkar ◽  
Ruiwei Guo ◽  
...  

The scientific community has accumulated enormous amounts of genomic data stored in specialized public repositories. These data are easily accessible from public genomic repositories, allowing the biomedical community to share omics datasets effectively. However, improperly annotated or incomplete metadata accompanying the raw omics data can negatively impact the utility of shared data for secondary analysis. In this study, we perform a comprehensive analysis of 137 studies comprising 18,559 samples across seven therapeutic fields to assess the completeness of metadata accompanying omics studies in both publications and their related online repositories, and make observations about how the process of data sharing could be made reliable. This analysis involved an initial literature survey to find studies in the seven therapeutic fields: Alzheimer's disease, acute myeloid leukemia, cystic fibrosis, cardiovascular diseases, inflammatory bowel disease, sepsis, and tuberculosis. We carefully examined the availability of metadata for nine clinical variables: disease condition, age, organism, sex, tissue type, ethnicity, country, mortality, and clinical severity. By comparing metadata availability in original publications and online repositories, we observed discrepancies in how metadata are shared. We determine that the overall availability of metadata is 72.8%; the most completely reported phenotypes are disease condition and organism, and the least completely reported is mortality. Additionally, we examined the completeness of metadata reported separately in original publications and online repositories. The completeness of metadata from the original publications across the nine clinical phenotypes is 71.1%. In contrast, the overall completeness of metadata from the public repositories is 48.6%.
Our study is the first one to assess the completeness of metadata accompanying raw data across a large number of studies and phenotypes and opens a crucial discussion about solutions to improve completeness and accessibility of metadata accompanying omics studies.
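
The completeness figures above can be illustrated with a minimal Python sketch. The abstract does not state the exact formula, so this assumes completeness is the fraction of samples with a non-missing value for each variable, that the overall figure is the unweighted mean across the nine variables, and that a value counts as available when either source reports it. All function names are hypothetical.

```python
from typing import Dict, List, Optional

# The nine clinical variables named in the abstract.
VARIABLES = [
    "disease condition", "age", "organism", "sex", "tissue type",
    "ethnicity", "country", "mortality", "clinical severity",
]

Sample = Dict[str, Optional[str]]


def is_present(value: Optional[str]) -> bool:
    """Treat None and blank strings as missing (an assumption)."""
    return value is not None and value.strip() != ""


def completeness(samples: List[Sample]) -> Dict[str, float]:
    """Fraction of samples with a non-missing value, per variable."""
    if not samples:
        return {var: 0.0 for var in VARIABLES}
    return {
        var: sum(is_present(s.get(var)) for s in samples) / len(samples)
        for var in VARIABLES
    }


def overall(per_variable: Dict[str, float]) -> float:
    """Unweighted mean of the per-variable fractions (an assumption)."""
    return sum(per_variable.values()) / len(per_variable)


def merge_sources(publication: List[Sample],
                  repository: List[Sample]) -> List[Sample]:
    """Combine two aligned sample lists; prefer the publication value
    and fall back to the repository value when it is missing."""
    merged = []
    for pub, repo in zip(publication, repository):
        merged.append({
            var: pub.get(var) if is_present(pub.get(var)) else repo.get(var)
            for var in VARIABLES
        })
    return merged
```

Calling `completeness` on the publication records, the repository records, and the output of `merge_sources` would give three per-variable tables analogous to the publication, repository, and overall figures reported above, under the stated assumptions.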

Gut ◽  
2012 ◽  
Vol 61 (Suppl 2) ◽  
pp. A225.1-A225
Author(s):  
L Alrubaiy ◽  
J Williams ◽  
H Hutchings

2021 ◽  
pp. 1-12
Author(s):  
Holly Etchegary ◽  
Daryl Pullman ◽  
Charlene Simmonds ◽  
Zoha Rabie ◽  
Proton Rahman

Introduction: The growth of global sequencing initiatives and commercial genomic test offerings suggests the public will increasingly be confronted with decisions about sequencing. Understanding public attitudes can assist efforts to integrate sequencing into care and inform the development of public education and outreach strategies. Methods: A 48-item online survey was advertised on Facebook in Eastern Canada and hosted on SurveyMonkey in late 2018. The survey measured public interest in whole genome sequencing and attitudes toward various aspects of sequencing using vignettes, scaled items, and open-ended items. Results: While interest in sequencing was high, critical attitudes were observed. In particular, items measuring features of patient control and choice regarding genomic data were strongly endorsed by respondents. The majority wanted to specify upfront how their data could be used, retain the ability to withdraw their sample at a later date, sign a written consent form, and speak to a genetic counselor prior to sequencing. Concerns about privacy and unauthorized access to data were frequently observed. Education level was the sociodemographic variable most often related to attitude statements, such that those with higher levels of education generally displayed more critical attitudes. Conclusions: Attitudes identified here could be used to inform the development of implementation strategies for genomic medicine. Findings suggest health systems must address patient concerns about privacy, consent practices, and the strong desire to control what happens to their genomic data through public outreach and education. Specific oversight procedures and policies that are clearly communicated to the public will be required.


2019 ◽  
Author(s):  
Andrea Sanchini ◽  
Christine Jandrasits ◽  
Julius Tembrockhaus ◽  
Thomas Andreas Kohl ◽  
Christian Utpatel ◽  
...  

Introduction: Improving the surveillance of tuberculosis (TB) is especially important for multidrug-resistant (MDR) and extensively drug-resistant (XDR)-TB. The large amount of publicly available whole-genome sequencing (WGS) data for TB gives us the chance to re-use data and to perform additional analysis at a large scale. Aim: We assessed the usefulness of raw WGS data of global MDR/XDR-TB isolates available from public repositories to improve TB surveillance. Methods: We extracted raw WGS data and the related metadata of Mycobacterium tuberculosis isolates available from the Sequence Read Archive. We compared this public dataset with WGS data and metadata of 131 MDR- and XDR-TB isolates from Germany in 2012-2013. Results: We aggregated a dataset that includes 1,081 MDR and 250 XDR isolates among which we identified 133 molecular clusters. In 16 clusters, the isolates were from at least two different countries. For example, cluster2 included 56 MDR/XDR isolates from Moldova, Georgia, and Germany. By comparing the WGS data from Germany and the public dataset, we found that 11 clusters contained at least one isolate from Germany and at least one isolate from another country. We could, therefore, connect TB cases despite missing epidemiological information. Conclusion: We demonstrated the added value of using WGS raw data from public repositories to contribute to TB surveillance. By comparing the German and the public dataset, we identified potential international transmission events. Thus, using this approach might support the interpretation of national surveillance results in an international context.
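
The cross-country cluster counts described above could be derived along the following lines. This is a hedged Python sketch, not the authors' pipeline: it assumes molecular clusters have already been assigned by some upstream method (the abstract does not say which), and the tuple layout and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

# (isolate_id, cluster_id, country); country is None when metadata is missing.
Isolate = Tuple[str, str, Optional[str]]


def countries_by_cluster(isolates: Iterable[Isolate]) -> Dict[str, Set[str]]:
    """Collect the set of known countries for every cluster."""
    countries: Dict[str, Set[str]] = defaultdict(set)
    for _isolate_id, cluster_id, country in isolates:
        countries[cluster_id]  # register the cluster even if country is missing
        if country:
            countries[cluster_id].add(country)
    return dict(countries)


def multi_country_clusters(isolates: List[Isolate]) -> List[str]:
    """Clusters whose isolates come from at least two different countries."""
    grouped = countries_by_cluster(isolates)
    return sorted(c for c, cs in grouped.items() if len(cs) >= 2)


def clusters_linking(isolates: List[Isolate], home: str = "Germany") -> List[str]:
    """Clusters with at least one isolate from `home` and one from elsewhere."""
    grouped = countries_by_cluster(isolates)
    return sorted(
        c for c, cs in grouped.items() if home in cs and len(cs - {home}) >= 1
    )
```

Isolates with missing country metadata still belong to their cluster but do not contribute to the country count, which mirrors the point that cases can be connected despite incomplete epidemiological information.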


2015 ◽  
Vol 43 (3) ◽  
pp. 523-528
Author(s):  
Gloria M. Petersen ◽  
Brian Van Ness

Given the nature of scientific inquiry, biomedical and genomic researchers have forged innumerable ways to advance our understanding of human disease. In many cases, research requires the involvement of human subjects, and in a subset of these studies, the researcher may collect data and biospecimens from many participants, and even serially collect additional materials over time and across a number of geographically dispersed centers. The organized data and biospecimens are collectively known as research biobanks. Researchers have an obligation to disseminate findings from their research through publications and presentations to other professionals, and when possible, to the public. Sharing genomic data is increasingly being mandated; access to data can be obtained through collaborative or state-funded entities. For example, the database of Genotypes and Phenotypes (dbGaP) and the International Cancer Genome Consortium will grant approved research applicants access to de-identified individual-level genomic data with accompanying demographic/clinical information.


F1000Research ◽  
2018 ◽  
Vol 6 ◽  
pp. 2120
Author(s):  
Adva Yeheskel ◽  
Adam Reiter ◽  
Metsada Pasmanik-Chor ◽  
Amir Rubinstein

Motivation: Many biologists are discouraged from using network simulation tools because these require manual, often tedious network construction. This situation calls for building new tools or extending existing ones with the ability to import biological pathways previously deposited in databases and analyze them, in order to produce novel biological insights at the pathway level. Results: We have extended a network simulation tool (BioNSi), which now allows merging of multiple pathways from the KEGG pathway database into a single, coherent network, and visualizing its properties. Furthermore, the enhanced tool enables loading experimental expression data into the network and simulating its dynamics under various biological conditions or perturbations. As a proof of concept, we tested two sets of published experimental data, one related to inflammatory bowel disease condition and the other to breast cancer treatment. We predict some of the major observations obtained following these laboratory experiments, and provide new insights that may shed additional light on these results. Tool requirements: Cytoscape 3.x, Java 8. Availability: The tool is freely available at http://bionsi.wix.com/bionsi, where a complete user guide and a step-by-step manual can also be found.


Author(s):  
Christine Verdon ◽  
Jason Reinglas ◽  
Janie Coulombe ◽  
Lorant Gonczi ◽  
Talat Bessissow ◽  
...  

Background: Crohn disease (CD) and ulcerative colitis (UC) have high health care expenditures because of medications, hospitalizations, and surgeries. We evaluated disease outcomes and treatment algorithms of patients with inflammatory bowel disease (IBD) in Québec, comparing periods before and after 2010. Methods: The province of Québec’s public health administrative database was used to identify newly diagnosed patients with IBD between 1996 and 2015. The primary and secondary outcomes included time to and probability of first and second IBD-related hospitalizations, first and second major surgery, and medication exposures. Medication prescriptions were collected from the public prescription database. Results: We identified 34,644 newly diagnosed patients with IBD (CD = 59.5%). The probability of the first major surgery increased after 2010 in patients with CD (5 years postdiagnosis before and after 2010: 8% [SD = 0.2%] vs 15% [SD = 0.6%]; P < 0.0001) and patients with UC (6% [SD = 0.2%] vs 10% [SD = 0.6%]; P < 0.0001). The probability of the second major surgery was unchanged in patients with CD. Hospitalization rates remained unchanged. Patients on anti-tumor necrosis factor (anti-TNF) medications had the lowest probability of hospitalizations (overall 5-year probability in patients with IBD stratified by maximal therapeutic step: 5-aminosalicylic acids 37% [SD = 0.6%]; anti-TNFs 31% [SD = 1.8%]; P < 0.0001). Anti-TNFs were more commonly prescribed for patients with CD after 2010 (4% [SD = 0.2%] vs 16% [SD = 0.6%]; P < 0.0001) in the public health insurance plan, especially younger patients. Corticosteroid exposure was unchanged before and after 2010. Immunosuppressant use was low but increased after 2010. The use of 5-ASAs was stable in patients with UC but decreased in patients with CD.
Conclusions: The probability of first and second hospitalizations remained unchanged in Québec and the probability of major surgery was low overall but did increase despite the higher and earlier use of anti-TNFs.


2019 ◽  
Vol 28 (4) ◽  
pp. 424-434 ◽  
Author(s):  
Anna Middleton ◽  
Richard Milne ◽  
Heidi Howard ◽  
Emilia Niemiec ◽  
...  

Public acceptance is critical for sharing of genomic data at scale. This paper examines how acceptance of data sharing pertains to the perceived similarities and differences between DNA and other forms of personal data. It explores the perceptions of representative publics from the USA, Canada, the UK and Australia (n = 8967) towards the donation of DNA and health data. Fifty-two percent of this public held ‘exceptionalist’ views about genetics (i.e., believed DNA is different or ‘special’ compared to other types of medical information). This group was more likely to be familiar with or have had personal experience with genomics and to perceive DNA information as having personal as well as clinical and scientific value. Those with personal experience with genetics and genetic exceptionalist views were nearly six times more likely to be willing to donate their anonymous DNA and medical information for research than other respondents. Perceived harms from re-identification did not appear to dissuade publics from being willing to participate in research. The interplay between exceptionalist views about genetics and the personal, scientific and clinical value attributed to data would be a valuable focus for future research.

