WebSeq: A Genomic Data Analytics Platform for Monogenic Disease Discovery

2021 ◽  
Author(s):  
Milind Agarwal ◽  
Kshitiz Ghimire ◽  
Joy D. Cogan ◽  
Janet Markle ◽  

Whole exome sequencing (WES) is commonly used to study monogenic diseases. The application of this sequencing technology has gained popularity among clinicians and researchers as WES pricing has declined. The accumulation of WES data creates a need for a robust, flexible, scalable, and easy-to-use analytics platform that allows researchers to gain biological insight from these genomic data. We present WebSeq, a self-contained server and web interface to facilitate intuitive analysis of WES data. WebSeq provides access to sophisticated tools and pipelines through a user-friendly and modern web interface. WebSeq has modules that support (i) FASTQ to VCF conversion, (ii) VCF to ANNOVAR CSV conversion, (iii) family-based analyses for Mendelian disease gene discovery, (iv) cohort-wide gene enrichment analyses, (v) an automated IGV browser, and (vi) a 'virtual gene panel' analysis module. WebSeq Pro, our expanded pipeline, also supports SNP genotype analyses such as ancestry inference and kinship testing. WebSeq Lite, our minimal pipeline, supports family-based analyses, cohort-wide gene enrichment analyses, and a virtual gene panel along with the IGV browser module. We anticipate that rigorous use of our web application will allow researchers to expedite discoveries from human genomic data. WebSeq Lite, WebSeq, and WebSeq Pro are fully containerized using Docker, run on all major operating systems, and are freely available for personal, academic, and non-profit use at http://bitly.ws/g6cn .

2021 ◽  
Vol 12 ◽  
Author(s):  
Paweł Sztromwasser ◽  
Damian Skrzypczak ◽  
Arkadiusz Michalak ◽  
Wojciech Fendler

Background: Analysis of variants in distant regulatory elements could improve the current 25–50% yield of genetic testing for monogenic diseases. However, the vast size of the regulome, the great number of variants, and the difficulty in predicting their phenotypic impact make searching for pathogenic variants in the regulatory genome challenging. New tools for the identification of regulatory variants based on their relevance to the phenotype are needed. Methods: We used tissue-specific regulatory loci mapped by ENCODE and FANTOM, together with miRNA–gene interactions from miRTarBase and miRWalk, to develop Remus, a web application for the identification of tissue-specific regulatory regions. Remus searches for regulatory features linked to the known disease-associated genes and filters them using activity status in the target tissues relevant for the studied disorder. For user convenience, Remus provides a web interface and facilitates in-browser filtering of variant files suitable for sensitive patient data. Results: To evaluate our approach, we used a set of 146 regulatory mutations reported causative for 68 distinct monogenic disorders and a manually curated list of tissues affected by these disorders. In 89.7% of cases, Remus identified the regulator containing the pathogenic mutation. The tissue-specific search limited the number of considered variants by 82.5% as compared to a tissue-agnostic search. Conclusion: Remus facilitates the identification of regulatory regions potentially associated with a monogenic disease and can supplement classical analysis of coding variations with the aim of improving the diagnostic yield in whole-genome sequencing experiments.
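
The two-step search described in the Methods (select regulatory regions linked to disease genes, then keep only those active in the relevant tissues) can be illustrated with a minimal sketch. This is not Remus's implementation: the `Region` record, the function names, the gene and tissue labels, and the toy coordinates are all hypothetical, and real region sets would come from resources such as ENCODE or FANTOM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical regulatory region record (0-based, half-open interval)."""
    chrom: str
    start: int
    end: int
    gene: str            # gene the region is linked to
    tissues: frozenset   # tissues in which the region is active


def select_regions(regions, genes, tissues=None):
    """Keep regions linked to `genes`; if `tissues` is given, also
    require activity in at least one of those tissues."""
    selected = []
    for region in regions:
        if region.gene not in genes:
            continue
        if tissues is not None and not (region.tissues & tissues):
            continue
        selected.append(region)
    return selected


def filter_variants(variants, regions):
    """Keep (chrom, pos) variants that fall inside any selected region."""
    return [
        (chrom, pos)
        for chrom, pos in variants
        if any(r.chrom == chrom and r.start <= pos < r.end for r in regions)
    ]


# Toy data, invented for illustration only.
REGIONS = [
    Region("chr1", 100, 200, "GENE_A", frozenset({"pancreas"})),
    Region("chr1", 500, 600, "GENE_A", frozenset({"liver"})),
    Region("chr2", 50, 80, "GENE_B", frozenset({"pancreas"})),
]
VARIANTS = [("chr1", 150), ("chr1", 550), ("chr2", 60), ("chr3", 10)]

# Tissue-agnostic search: every region linked to the disease gene.
agnostic = filter_variants(VARIANTS, select_regions(REGIONS, {"GENE_A"}))
# Tissue-specific search: only regions active in the target tissue.
specific = filter_variants(
    VARIANTS, select_regions(REGIONS, {"GENE_A"}, frozenset({"pancreas"}))
)
```

In this toy case the tissue-specific search keeps one of the two variants returned by the tissue-agnostic search, which is the kind of reduction the Results quantify.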


2020 ◽  
Vol 19 (10) ◽  
pp. 1602-1618 ◽  
Author(s):  
Thibault Robin ◽  
Julien Mariethoz ◽  
Frédérique Lisacek

A key point in achieving accurate intact glycopeptide identification is the definition of the glycan composition file that is used to match experimental with theoretical masses by a glycoproteomics search engine. At present, these files are mainly built from searching the literature and/or querying data sources focused on posttranslational modifications. Most glycoproteomics search engines include a default composition file that is readily used when processing MS data. We introduce here a glycan composition visualizing and comparative tool associated with the GlyConnect database and called GlyConnect Compozitor. It offers a web interface through which the database can be queried to bring out contextual information relative to a set of glycan compositions. The tool takes advantage of compositions being related to one another through shared monosaccharide counts and outputs interactive graphs summarizing information searched in the database. These results provide a guide for selecting or deselecting compositions in a file in order to reflect the context of a study as closely as possible. They also confirm the consistency of a set of compositions based on the content of the GlyConnect database. As part of the tool collection of the Glycomics@ExPASy initiative, Compozitor is hosted at https://glyconnect.expasy.org/compozitor/ where it can be run as a web application. It is also directly accessible from the GlyConnect database.
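
The idea that compositions are related through their monosaccharide counts can be sketched as a small graph construction. The linking rule used here (two compositions are connected when they differ by exactly one monosaccharide residue) is an assumption made for illustration; the abstract does not spell out Compozitor's rule, and the function names and toy compositions are hypothetical.

```python
from itertools import combinations


def differs_by_one(a, b):
    """True if two compositions (dicts of monosaccharide -> count)
    differ by exactly one residue in total."""
    keys = set(a) | set(b)
    return sum(abs(a.get(k, 0) - b.get(k, 0)) for k in keys) == 1


def build_edges(compositions):
    """Return sorted name pairs for compositions one residue apart."""
    names = sorted(compositions)
    return [
        (n1, n2)
        for n1, n2 in combinations(names, 2)
        if differs_by_one(compositions[n1], compositions[n2])
    ]


# Toy composition file, invented for illustration only.
COMPOSITIONS = {
    "H5N2": {"Hex": 5, "HexNAc": 2},
    "H5N3": {"Hex": 5, "HexNAc": 3},
    "H5N4": {"Hex": 5, "HexNAc": 4},
    "H5N4F1": {"Hex": 5, "HexNAc": 4, "Fuc": 1},
}

edges = build_edges(COMPOSITIONS)
```

A composition that ends up with no edges in such a graph would be a natural candidate for a closer look when curating a composition file.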


2021 ◽  
Author(s):  
Niklas Hohmann ◽  
Emilia Jarochowska

Fossil accumulations can be generated by (1) high input of organism remains or (2) low sedimentation rates, which reduce the volume of sediment between individual fossils. This creates a paradox in which shell beds may form in environments with low biomass production. This effect of sedimentary condensation on fossil abundance is easy to understand; its implications, however, are hard to grasp and visualize. We present the shellbed condensator ( https://stratigraphicpaleobiology.shinyapps.io/shellbed_condensator/ ), a web application that allows users to interactively visualize and animate the effects of sedimentary condensation and erosion on fossil abundance and proxies recorded by the sedimentary record. It is an adaptation of the seminal computer simulation by Kidwell (1985). The application is written in R and uses the shiny package for the construction of the web interface and the DAIME package for the sedimentological model (Hohmann, 2021). It allows creating stratigraphic expressions and age models for combinations of fossil input and sedimentation rates defined by the user. To assess the utility of shiny apps for teaching purposes, we examine student understanding of sedimentary condensation after unsupervised studying and after unsupervised usage of the app. Due to their strong visual and interactive components, shiny apps are a powerful and versatile tool for science communication, teaching, self-study, the visualization of large datasets, and the promotion of scientific findings.
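
The condensation paradox can be made concrete with a deliberately simplified calculation: if fossils arrive at a constant rate and sediment accumulates at a constant rate, the number of fossils per unit thickness is the ratio of the two. This sketch is not the DAIME model or the app's code (which is written in R); it ignores erosion and time-varying rates, and the function name, units, and numbers are assumptions chosen for illustration.

```python
def fossil_abundance(input_rate, sedimentation_rate):
    """Fossils per metre of sediment.

    input_rate: fossils deposited per kyr (assumed constant)
    sedimentation_rate: metres of sediment per kyr (assumed constant, > 0)
    """
    if sedimentation_rate <= 0:
        raise ValueError("sedimentation_rate must be positive (no erosion here)")
    return input_rate / sedimentation_rate


# Same fossil input, progressively slower sedimentation:
# the record becomes more condensed and abundance rises.
abundances = [fossil_abundance(10.0, rate) for rate in (2.0, 1.0, 0.25)]

# The paradox: a low-input setting with slow sedimentation can yield a
# denser fossil bed than a high-input setting with fast sedimentation.
low_input_condensed = fossil_abundance(2.0, 0.25)
high_input_diluted = fossil_abundance(10.0, 2.0)
```

Here the low-input, slowly accumulating setting records 8 fossils per metre against 5 for the high-input setting, even though it received a fifth of the fossil input.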


2019 ◽  
Author(s):  
R N Ramirez ◽  
K Bedirian ◽  
S M Gray ◽  
A Diallo

Motivation: Visualization of multiple genomic data types generally requires the use of public or commercially hosted browsers. Flexible visualization of chromatin interaction data as genomic features and network components offers informative insights into gene expression. An open source application for visualizing HiC and chromatin conformation-based data as 2D arcs accompanied by interactive network analyses is valuable. Results: DNA Rchitect is a new tool created to visualize HiC and chromatin conformation-based contacts at high (Kb) and low (Mb) genomic resolutions. The user can upload their pre-filtered HiC experiment in bedpe format to the DNA Rchitect web app that we have hosted or to a version they themselves have deployed. Using DNA Rchitect, the user can visualize different interactions in their sample and perform simple network analyses, while also visualizing other genomic data types. The user can then download their results for additional network functionality offered in network-based programs such as Cytoscape. Availability and implementation: DNA Rchitect is freely available both as a web application written primarily in R, available at http://shiny.immgen.org/DNARchitect/, and as open source software released under an MIT license at https://github.com/alosdiallo/DNA_Rchitect.
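
To show how bedpe contacts map onto a network, the sketch below parses the first six tab-separated bedpe columns (chrom1, start1, end1, chrom2, start2, end2) into anchor pairs and counts how many contacts each anchor takes part in. It is written in Python for illustration and is not DNA Rchitect's code, which is primarily R; the function names and the toy contacts are hypothetical.

```python
from collections import Counter


def parse_bedpe(text):
    """Parse bedpe text into a list of (anchor_a, anchor_b) contacts,
    where each anchor is a (chrom, start, end) tuple."""
    contacts = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 columns, got {len(fields)}")
        anchor_a = (fields[0], int(fields[1]), int(fields[2]))
        anchor_b = (fields[3], int(fields[4]), int(fields[5]))
        contacts.append((anchor_a, anchor_b))
    return contacts


def anchor_degree(contacts):
    """Number of contacts each anchor participates in (node degree)."""
    degree = Counter()
    for anchor_a, anchor_b in contacts:
        degree[anchor_a] += 1
        degree[anchor_b] += 1
    return degree


# Toy contacts, invented for illustration only.
BEDPE = (
    "chr1\t1000\t2000\tchr1\t50000\t51000\tloop1\t12\n"
    "chr1\t1000\t2000\tchr1\t90000\t91000\tloop2\t7\n"
    "chr2\t300\t800\tchr2\t7000\t7500\tloop3\t3\n"
)

contacts = parse_bedpe(BEDPE)
degrees = anchor_degree(contacts)
```

Anchors with a high degree are the hubs a network view would highlight, and the same edge list is the kind of table that can be exported for tools such as Cytoscape.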


Author(s):  
Federica Cariati ◽  
Maria Savarese ◽  
Valeria D’Argenio ◽  
Francesco Salvatore ◽  
Rossella Tomaiuolo

Background: The development of technologies that detect monogenic diseases in embryonic and fetal samples is opening novel diagnostic possibilities for preimplantation genetic diagnosis (PGD) and prenatal diagnosis (PND), thereby changing laboratory practice. Molecular diagnostic laboratories use different workflows for PND depending on the disease, the type of biological sample, the presence of one or more known mutations, and the availability of the proband. Paternity verification and contamination analysis are also performed. The aim of this study was to test the efficacy of a single workflow designed to optimize the molecular diagnosis of monogenic disease in families at risk of transmitting a genetic alteration. Methods: We used this strategy, which we designated “SEeMORE strategy” ( Results: The results obtained with the SEeMORE strategy concurred with those obtained with traditional PND. In addition, this strategy has several advantages: (i) use of one or a few cells; (ii) reduction of the procedure to 1 day; and (iii) a reduction of at least 2–3-fold of the analytic cost. Conclusions: The SEeMORE strategy is effective for the molecular diagnosis of monogenic diseases, irrespective of the amount of starting material and of the disease mutation, and can be used for PND and PGD.


2017 ◽  
Author(s):  
Marci L. B. Schwartz ◽  
Cara Zayac McCormick ◽  
Amanda L. Lazzeri ◽  
D’Andra M. Lindbuchler ◽  
Miranda L. G. Hallquist ◽  
...  

Background: Research cohorts with linked genomic data exist, or are being developed, at many research centers. Within any such “sequenced cohort” of more than 100 participants, it is likely that there are participants with previously undisclosed risk for life-threatening monogenic diseases that could be identified with targeted analysis of their existing data. Identification of such disease-associated findings is not usually primary to the enrollment research goals. At Geisinger Health System, MyCode® Community Health Initiative (MyCode) participants represent one such large sequenced cohort. Since 2013, MyCode participants in discovery research have been consented for secondary analysis of their existing research genomic sequences to allow delivery of medically actionable findings to them and their healthcare providers. This return of genomic results program was developed to manage an anticipated 3.5% of MyCode participants who will receive clinically confirmed genomic variants from an approved gene list out of more than 150,000 total participants. Risk-associated DNA sequences alone, without any clinical parameter, prompt “genome-first” follow-up encounters. Methods: This article describes our process for generating clinical grade results from research-based genomic sequencing data, delivering results to patients and their providers, facilitating targeted clinical evaluations of patients, and promoting cascade testing of at-risk relatives. We also summarize our early data about the results generated during this process and our ability to contact patients and their providers to disclose the information. Results: This process has been used to generate 343 results on 339 patients. 93% of patients with a result have been successfully contacted about their results, as evidenced by direct interaction about their result with the research team or a healthcare provider. 222 healthcare providers have been notified of a result on one or more patients through this result delivery process. Conclusions: Here we describe the existing GHS model to deliver genomic data into the electronic medical record and the clinical interactions that are prompted and supported. Elements of this genome-first care model can be applied in other healthcare settings and in national efforts, such as “All of Us”, that wish to establish programs for returning genomic results to research participants.


2021 ◽  
Author(s):  
Alejandro Cisterna García ◽  
Aurora González-Vidal ◽  
Daniel Ruiz Villa ◽  
Jordi Ortiz Murillo ◽  
Alicia Gómez-Pascual ◽  
...  

Gene set-based phenotype enrichment analysis (detecting phenotypic terms that emerge as significant in a set of genes) can improve the rate of genetic diagnoses, among other research purposes. To facilitate diverse phenotype analysis, we developed PhenoExam, a freely available R package for tool developers and a web interface for users, which (1) performs phenotype and disease enrichment analysis on a gene set; (2) measures statistically significant phenotype similarities between gene sets; and (3) detects significant differential phenotypes or disease terms across different databases. PhenoExam achieves these tasks by integrating databases or resources such as the HPO, MGD, CRISPRbrain, CTD, ClinGen, CGI, OrphaNET, UniProt, PsyGeNET, and Genomics England Panel App. PhenoExam accepts both human and mouse genes as input. We developed PhenoExam to assist a variety of users, including clinicians, computational biologists and geneticists. It can be used to support the validation of new gene-to-disease discoveries, and in the detection of differential phenotypes between two gene sets (a phenotype linked to one of the gene sets but not to the other) that are useful for differential diagnosis and to improve genetic panels. We validated PhenoExam performance through simulations and its application to real cases. We demonstrate that PhenoExam is effective in distinguishing gene sets or Mendelian diseases with very similar phenotypes through projecting the disease-causing genes into their annotation-based phenotypic spaces. We also tested the tool with early onset Parkinson's disease and dystonia genes, to show phenotype-level similarities but also potentially interesting differences. More specifically, we used PhenoExam to validate computationally predicted new genes potentially associated with epilepsy. Therefore, PhenoExam effectively discovers links between phenotypic terms across annotation databases through effective integration. The R package is available at https://github.com/alexcis95/PhenoExam and the Web tool is accessible at https://snca.atica.um.es/PhenoExamWeb/.
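
Phenotype enrichment on a gene set is commonly assessed with a one-sided hypergeometric test, which asks how surprising the overlap between the gene set and a term's annotated genes is. The abstract does not state which test PhenoExam uses, so the sketch below is an assumption chosen for illustration, written in Python rather than R, with hypothetical function and parameter names.

```python
from math import comb


def enrichment_p_value(overlap, background, annotated, set_size):
    """One-sided hypergeometric p-value P(X >= overlap).

    background: number of genes in the annotation background
    annotated:  background genes annotated with the phenotype term
    set_size:   number of genes in the query gene set
    overlap:    query genes annotated with the term
    """
    if not 0 <= overlap <= min(annotated, set_size):
        raise ValueError("overlap is outside the possible range")
    total = comb(background, set_size)
    upper = min(annotated, set_size)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, set_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Toy numbers, invented for illustration: a background of 10 genes,
# 5 annotated with the term, and a query set of 5 genes that are all
# annotated with it.
p_all_overlap = enrichment_p_value(5, 10, 5, 5)
# With no overlap required, the tail covers every outcome.
p_no_overlap_required = enrichment_p_value(0, 10, 5, 5)
```

In practice a p-value like this is computed for every term and then corrected for multiple testing before a term is reported as enriched.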


2020 ◽  
Vol 5 (2) ◽  
pp. 185 ◽  
Author(s):  
Anggi Elanda ◽  
Robby Lintang Buana

OWASP (Open Web Application Security Project) version 4 is issued by owasp.org, a non-profit organization dedicated to the security of web-based applications. This systematic review examines whether the OWASP method is widely used to detect security vulnerabilities in website-based information systems. We review three publications from several publisher sources and compare their OWASP version 4 results and the web server security levels they report. Keywords: OWASP, Website Vulnerability, Website Security Detection


2017 ◽  
Author(s):  
Richard J Challis ◽  
Sujai Kumar ◽  
Lewis Stevens ◽  
Mark Blaxter

As the generation and use of genomic datasets is becoming increasingly common in all areas of biology, the need for resources to collate, analyse and present data from one or more genome projects is becoming more pressing. The Ensembl platform is a powerful tool to make genome data and cross-species analyses easily accessible through a web interface and a comprehensive API. Here we introduce GenomeHubs, which provide a containerised environment to facilitate the setup and hosting of custom Ensembl genome browsers. This simplifies mirroring of existing content and import of new genomic data into the Ensembl database schema. GenomeHubs also provide a set of analysis containers to decorate imported genomes with results of standard analyses and functional annotations and support export to flat files, including EMBL format for submission of assemblies and annotations to INSDC. Database URL: http://GenomeHubs.org


2020 ◽  
Author(s):  
Hayley R. Stoneman ◽  
Russell L. Wrobel ◽  
Michael Place ◽  
Michael Graham ◽  
David J. Krause ◽  
...  

CRISPR/Cas9 is a powerful tool for editing genomes, but design decisions are generally made with respect to a single reference genome. With population genomic data becoming available for an increasing number of model organisms, researchers are interested in manipulating multiple strains and lines. CRISpy-pop is a web application that generates and filters guide RNA sequences for CRISPR/Cas9 genome editing for diverse yeast and bacterial strains. The current implementation designs and predicts the activity of guide RNAs against more than 1000 Saccharomyces cerevisiae genomes, including 167 strains frequently used in bioenergy research. Zymomonas mobilis, an increasingly popular bacterial bioenergy research model, is also supported. CRISpy-pop is available as a web application (https://CRISpy-pop.glbrc.org/) with an intuitive graphical user interface. CRISpy-pop also cross-references the human genome to allow users to avoid the selection of sgRNAs with potential biosafety concerns. Additionally, CRISpy-pop predicts the strain coverage of each guide RNA within the supported strain sets, which aids in functional population genetic studies. Finally, we validate how CRISpy-pop can accurately predict the activity of guide RNAs across strains using population genomic data.
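
The notions of guide generation and strain coverage can be sketched in a few lines. The example assumes the common SpCas9 design of a 20-nt protospacer followed by an NGG PAM, searches the forward strand only, and counts a strain as covered when it contains an exact copy of the guide plus a PAM. These choices, the function names, and the toy sequences are illustrative assumptions, not CRISpy-pop's implementation, and no activity prediction is attempted.

```python
import re


def find_guides(sequence, length=20):
    """Return protospacers of `length` nt that are followed by an NGG PAM
    on the forward strand. A lookahead is used so overlapping sites are
    all reported."""
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % length)
    return [match.group(1) for match in pattern.finditer(sequence.upper())]


def strain_coverage(guide, strains):
    """Fraction of strains containing an exact guide + NGG match."""
    if not strains:
        raise ValueError("at least one strain sequence is required")
    site = re.compile(re.escape(guide.upper()) + "[ACGT]GG")
    hits = sum(1 for seq in strains.values() if site.search(seq.upper()))
    return hits / len(strains)


# Toy sequences, invented for illustration only.
REFERENCE = "ACGT" * 5 + "AGG"          # 20-nt protospacer + AGG PAM
STRAINS = {
    "strain_ref": REFERENCE,
    "strain_snp": "ACGTACGTACTTACGTACGTAGG",   # SNP inside the target site
    "strain_shifted": "TT" + REFERENCE,        # same site, different offset
}

guides = find_guides(REFERENCE)
coverage = strain_coverage(guides[0], STRAINS)
```

In this toy set the guide covers two of the three strains, because the single-nucleotide difference in `strain_snp` removes the exact match.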

