The GA4GH Phenopacket schema: A computable representation of clinical data for precision medicine

Author(s):  
Julius OB Jacobsen ◽  
Michael Baudis ◽  
Gareth S Baynam ◽  
Jacques S Beckmann ◽  
Sergi Beltran ◽  
...  

Despite great strides in the development and wide acceptance of standards for exchanging structured information about genomic variants, there is no corresponding standard for exchanging phenotypic data, and this has impeded the sharing of phenotypic information for computational analysis. Here, we introduce the Global Alliance for Genomics and Health (GA4GH) Phenopacket schema, which supports exchange of computable longitudinal case-level phenotypic information for diagnosis and research of all types of disease including Mendelian and complex genetic diseases, cancer, and infectious diseases. To support translational research, diagnostics, and personalized healthcare, phenopackets are designed to be used across a comprehensive landscape of applications including biobanks, databases and registries, clinical information systems such as Electronic Health Records, genomic matchmaking, diagnostic laboratories, and computational tools. The Phenopacket schema is a freely available, community-driven standard that streamlines exchange and systematic use of phenotypic data and will facilitate sophisticated computational analysis of both clinical and genomic information to help improve our understanding of diseases and our ability to manage them.
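To make the idea of a computable, case-level phenotypic record concrete, the following is a minimal sketch of how such a record might be assembled and serialized as JSON. The field names (`id`, `subject`, `phenotypicFeatures`, `metaData`, `phenopacketSchemaVersion`) reflect one reading of version 2 of the schema and should be checked against the official GA4GH documentation; the helper `build_phenopacket`, the identifiers, and the curator name are hypothetical and not taken from the abstract.

```python
import json
from typing import Dict, List, Tuple


def build_phenopacket(
    packet_id: str,
    subject_id: str,
    sex: str,
    features: List[Tuple[str, str]],
    created: str,
) -> Dict:
    """Assemble a phenopacket-like dictionary (illustrative, not validated).

    `features` is a list of (ontology term id, label) pairs, for example
    Human Phenotype Ontology terms. Field names are assumptions based on
    version 2 of the schema.
    """
    return {
        "id": packet_id,
        "subject": {"id": subject_id, "sex": sex},
        "phenotypicFeatures": [
            {"type": {"id": term_id, "label": label}}
            for term_id, label in features
        ],
        "metaData": {
            "created": created,                # ISO 8601 timestamp
            "createdBy": "example-curator",    # hypothetical value
            "phenopacketSchemaVersion": "2.0",
        },
    }


if __name__ == "__main__":
    packet = build_phenopacket(
        packet_id="example-packet-1",          # hypothetical identifier
        subject_id="example-patient-1",        # hypothetical identifier
        sex="FEMALE",
        features=[("HP:0001250", "Seizure")],
        created="2022-01-01T00:00:00Z",
    )
    print(json.dumps(packet, indent=2))
```

A real record would also list the ontologies it uses under `metaData` and would normally be checked with the schema's own tooling rather than built by hand.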

2010 ◽  
Vol 26 (9) ◽  
pp. 1219-1224 ◽  
Author(s):  
Yongjin Li ◽  
Jagdish C. Patra

Abstract Motivation: Clinical diseases are characterized by distinct phenotypes. To identify disease genes is to elucidate the gene–phenotype relationships. Mutations in functionally related genes may result in similar phenotypes. It is reasonable to predict disease-causing genes by integrating phenotypic data and genomic data. Some genetic diseases are genetically or phenotypically similar and may share common pathogenetic mechanisms. Identifying the relationships between diseases will facilitate a better understanding of their pathogenetic mechanisms. Results: In this article, we constructed a heterogeneous network by connecting the gene network and phenotype network using the phenotype–gene relationship information from the OMIM database. We extended the random walk with restart algorithm to the heterogeneous network. The algorithm prioritizes the genes and phenotypes simultaneously. We used leave-one-out cross-validation to evaluate its ability to find gene–phenotype relationships. The results showed improved performance over previous work. We also used the algorithm to disclose hidden disease associations that cannot be found by the gene network or phenotype network alone. We identified 18 hidden disease associations, most of which were supported by literature evidence. Availability: The MATLAB code of the program is available at http://www3.ntu.edu.sg/home/aspatra/research/Yongjin_BI2010.zip Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
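The random walk with restart on a combined gene and phenotype network can be sketched as follows. This is a simplified illustration, not the authors' MATLAB implementation: it stacks the two networks and their cross-links into one adjacency matrix and column-normalizes it, whereas the published method may weight within-network and between-network transitions differently. The function names, the restart probability of 0.7, and the toy network are assumptions made for the example.

```python
import numpy as np


def column_normalize(adj: np.ndarray) -> np.ndarray:
    """Scale each column to sum to 1, leaving all-zero columns as zeros."""
    sums = adj.sum(axis=0, keepdims=True)
    sums[sums == 0] = 1.0
    return adj / sums


def rwr_heterogeneous(gene_adj, pheno_adj, gene_pheno, seeds,
                      restart=0.7, tol=1e-10, max_iter=1000):
    """Random walk with restart on a gene-phenotype network.

    gene_adj:   (G, G) symmetric gene-gene adjacency
    pheno_adj:  (P, P) symmetric phenotype-phenotype adjacency
    gene_pheno: (G, P) gene-phenotype links
    seeds:      node indices in the combined network (genes are 0..G-1,
                phenotypes are G..G+P-1)
    Returns (gene_scores, phenotype_scores).
    """
    n_genes = gene_adj.shape[0]
    combined = np.block([[gene_adj, gene_pheno],
                         [gene_pheno.T, pheno_adj]]).astype(float)
    transition = column_normalize(combined)

    # Restart distribution: uniform over the seed nodes.
    p0 = np.zeros(combined.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)

    p = p0.copy()
    for _ in range(max_iter):
        nxt = (1.0 - restart) * (transition @ p) + restart * p0
        converged = np.abs(nxt - p).sum() < tol
        p = nxt
        if converged:
            break
    return p[:n_genes], p[n_genes:]


if __name__ == "__main__":
    # Toy network: 3 genes in a chain, 2 linked phenotypes.
    gene_adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
    pheno_adj = np.array([[0, 1], [1, 0]])
    gene_pheno = np.array([[1, 0], [0, 0], [0, 1]])
    genes, phenos = rwr_heterogeneous(gene_adj, pheno_adj, gene_pheno, seeds=[3])
    print("gene scores:", genes)
    print("phenotype scores:", phenos)
```

Seeding the walk at a query phenotype and ranking genes by their steady-state score gives a candidate gene list; seeding with genes as well ranks genes and phenotypes in a single pass.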


2006 ◽  
Vol 12 ◽  
pp. S2
Author(s):  
Aravinda Chakravarti

Author(s):  
Michael Snyder

What is a complex genetic disease? Although great strides have been made to identify single gene variants that have a strong causative effect for a particular disease (e.g., CFTR mutations for cystic fibrosis and HEXA mutations for Tay-Sachs disease), the...


2014 ◽  
Vol 64 (Pt_2) ◽  
pp. 357-365 ◽  
Author(s):  
Gilda Rose S. Amaral ◽  
Graciela M. Dias ◽  
Michiyo Wellington-Oguri ◽  
Luciane Chimetto ◽  
Mariana E. Campeão ◽  
...  

Vibrios are ubiquitous in the aquatic environment and can be found in association with animal or plant hosts. The range of ecological relationships includes pathogenic and mutualistic associations. To gain a better understanding of the ecology of these microbes, it is important to determine their phenotypic features. However, the traditional phenotypic characterization of vibrios has been expensive, time-consuming and restricted in scope to a limited number of features. In addition, most of the commercial systems applied for phenotypic characterization cannot characterize the broad spectrum of environmental strains. A reliable alternative is to obtain phenotypic information directly from whole genome sequences. The aim of the present study was to evaluate the usefulness of whole genome sequences as a source of phenotypic information. We performed a comparison of the vibrio phenotypes obtained from the literature with the phenotypes obtained from whole genome sequences. We observed a significant correlation between the previously published phenotypic data and the phenotypic data retrieved from whole genome sequences of vibrios. Analysis of 26 vibrio genomes revealed that all genes coding for the specific proteins involved in the metabolic pathways responsible for positive phenotypes of the 14 diagnostic features (Voges–Proskauer reaction, indole production, arginine dihydrolase, ornithine decarboxylase, utilization of myo-inositol, sucrose and l-leucine, and fermentation of d-mannitol, d-sorbitol, l-arabinose, trehalose, cellobiose, d-mannose and d-galactose) were found in the majority of the vibrio genomes. Vibrio species that were negative for a given phenotype revealed the absence of all or several genes involved in the respective biochemical pathways, indicating the utility of this approach to characterize the phenotypes of vibrios. The absence of the global regulation and regulatory proteins in the Vibrio parahaemolyticus genome indicated a non-vibrio phenotype. Whole genome sequences represent an important source for the phenotypic identification of vibrios.
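The inference described above, in which a phenotype is called positive when every gene of the responsible pathway is present in the genome, can be sketched in a few lines. This is an illustration of the general idea only, not the study's pipeline; the function names are hypothetical, and the gene names (`geneA`, `geneB`, `geneC`) are placeholders rather than real pathway members.

```python
from typing import Dict, Set


def predict_phenotypes(genome_genes: Set[str],
                       pathway_genes: Dict[str, Set[str]]) -> Dict[str, bool]:
    """Call a phenotype positive only if all of its pathway genes are present."""
    return {
        phenotype: required <= genome_genes   # subset test
        for phenotype, required in pathway_genes.items()
    }


def missing_genes(genome_genes: Set[str],
                  pathway_genes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """List, per phenotype, the pathway genes absent from the genome."""
    return {
        phenotype: required - genome_genes
        for phenotype, required in pathway_genes.items()
    }


if __name__ == "__main__":
    # Placeholder gene names; a real analysis would use annotated pathway genes.
    pathways = {
        "indole production": {"geneA"},
        "sucrose utilization": {"geneB", "geneC"},
    }
    genome = {"geneA", "geneB"}
    print(predict_phenotypes(genome, pathways))
    print(missing_genes(genome, pathways))
```

Comparing such predictions with phenotypes reported in the literature, feature by feature, is one way to measure how well the genome-derived calls agree with laboratory tests.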


2008 ◽  
Vol 30 (3) ◽  
pp. 4-6
Author(s):  
Leonard W. Seymour

Everybody understands gene therapy. It's where healthy genes are used to supplement the loss of function of mutated genes, providing simple cures for complex genetic diseases. Because it clearly works (otherwise why is it called ‘gene therapy’?) there is a roller coaster of public perception, varying between enthusiastic optimism in times of dramatic progress and scathing criticism when things go badly. A priority for those of us working in the field is to regulate expectation by providing access to more balanced and intelligible information, and that is one goal of the recently formed British Society for Gene Therapy (www.bsgt.org).


Author(s):  
Amy Brower ◽  
Kee Chan ◽  
Michael Hartnett ◽  
Jennifer Taylor

The goal of newborn screening (NBS) is to improve health outcomes by identifying and treating affected newborns. This manuscript provides an overview of a data tool to facilitate the longitudinal collection of health information on newborns diagnosed with a condition through NBS. The Newborn Screening Translational Research Network (NBSTRN) developed the Longitudinal Pediatric Data Resource (LPDR) to capture, store, analyze, visualize, and share genomic and phenotypic data over the lifespan of NBS-identified newborns to facilitate understanding of genetic disease, and to assess the impact of early identification and treatment. NBSTRN developed a consensus-based process using clinical care experts to create, maintain, and evolve question and answer sets organized into common data elements (CDEs). The LPDR contains 24,172 core and disease-specific CDEs for 118 rare genetic diseases, and the CDEs are being made available through the NIH CDE Repository. The number of CDEs for each condition averages 2,200, with a range from 69 to 7,944. The LPDR is used by state NBS programs, clinical researchers, and community-based organizations. Case-level, de-identified data sets are available for secondary research and data mining. The development of the LPDR for longitudinal data gathering, sharing, and analysis supports research and facilitates the translation of new discoveries into clinical practice.


Author(s):  
Lilit Nersisyan ◽  
Maria Ropat ◽  
Vicent Pelechano

ABSTRACT In eukaryotes, 5’-3’ co-translation degradation machinery follows the last translating ribosome, providing an in vivo footprint of its position. Thus 5’P degradome sequencing, in addition to informing about RNA decay, also provides valuable information regarding ribosome dynamics. Multiple experimental methods have been developed to investigate the mRNA degradome; however, computational tools for their reproducible analysis are lacking. Here we present fivepseq: an easy-to-use application for analysis and interactive visualization of 5’P degradome data. This tool performs both metagene and gene-specific analysis, and allows easy investigation of codon-specific ribosome pauses. To demonstrate its ability to provide new biological information, we investigate gene-specific ribosome pauses in S. cerevisiae after eIF5A depletion. In addition to identifying pauses at expected codon motifs, we identify multiple genes with strain-specific frameshifts. To show its wide applicability, we investigate the more complex 5’P degradome from A. thaliana and discover both motif-specific ribosome protection associated with particular developmental stages and generally increased ribosome protection at termination level associated with age. Our work shows how the use of improved analysis tools for the study of the 5’P degradome can significantly increase the biological information that can be derived from such datasets and facilitate its reproducible analysis.

KEY POINTS
Analysis of 5’P degradome data with fivepseq informs about global and gene-specific translational features.
Frameshifts in translation-related genes in S. cerevisiae may be linked to ribosome stalling.
Ribosome protection at termination and codon motifs are linked to development in A. thaliana.

