Profiling and leveraging relatedness in a precision medicine cohort of 92,455 exomes

2017 ◽  
Author(s):  
Jeffrey Staples ◽  
Evan K. Maxwell ◽  
Nehal Gosalia ◽  
Claudia Gonzaga-Jauregui ◽  
Christopher Snyder ◽  
...  

Abstract: Large-scale human genetics studies are ascertaining increasing proportions of populations as they continue growing in both number and scale. As a result, the amount of cryptic relatedness within these study cohorts is growing rapidly and has significant implications for downstream analyses. We demonstrate this growth empirically among the first 92,455 exomes from the DiscovEHR cohort and, via a custom simulation framework we developed called SimProgeny, show that these measures are in line with expectations given the underlying population and ascertainment approach. For example, we identified ∼66,000 close (first- and second-degree) relationships within DiscovEHR involving 55.6% of study participants. Our simulation results project that >70% of the cohort will be involved in these close relationships as DiscovEHR scales to 250,000 recruited individuals. We reconstructed 12,574 pedigrees using these relationships (including 2,192 nuclear families) and leveraged them for multiple applications. The pedigrees substantially improved the phasing accuracy of 20,947 rare, deleterious compound heterozygous mutations. Reconstructed nuclear families were critical for identifying 3,415 de novo mutations in ∼1,783 genes. Finally, we demonstrate the segregation of known and suspected disease-causing mutations through reconstructed pedigrees, including a tandem duplication in LDLR causing familial hypercholesterolemia. In summary, this work highlights the prevalence of cryptic relatedness expected among large healthcare population genomic studies and demonstrates several analyses that are uniquely enabled by large amounts of cryptic relatedness.
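To illustrate why reconstructed families help with compound heterozygous phasing, here is a minimal Python sketch of parent-based phasing for two heterozygous variants in one child. It is not the authors' pipeline: the function names, the genotype encoding (count of alternate alleles), and the decision to return "unknown" for ambiguous cases are all assumptions made for illustration.

```python
from typing import Optional, Tuple

# Hypothetical encoding: a genotype is the count of alternate alleles (0, 1, or 2).
ParentGenotypes = Tuple[int, int]  # (mother_gt, father_gt) at one variant site


def transmitting_parent(mother_gt: int, father_gt: int) -> Optional[str]:
    """Infer which parent transmitted a variant the child carries heterozygously.

    Returns None when both parents carry the variant (ambiguous) or when
    neither does (possible de novo event or genotyping error).
    """
    mother_carries = mother_gt > 0
    father_carries = father_gt > 0
    if mother_carries and not father_carries:
        return "mother"
    if father_carries and not mother_carries:
        return "father"
    return None


def phase_pair(var_a: ParentGenotypes, var_b: ParentGenotypes) -> str:
    """Classify two heterozygous variants in the same gene of one child.

    "trans"   -> inherited from different parents (a compound heterozygote)
    "cis"     -> inherited from the same parent (one gene copy is intact)
    "unknown" -> parental genotypes do not resolve the phase
    """
    origin_a = transmitting_parent(*var_a)
    origin_b = transmitting_parent(*var_b)
    if origin_a is None or origin_b is None:
        return "unknown"
    return "cis" if origin_a == origin_b else "trans"
```

For example, `phase_pair((1, 0), (0, 1))` describes one variant seen only in the mother and one seen only in the father, so the pair is classified as "trans". Without relatives, the same pair would have to be phased statistically or from sequencing reads.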


2020 ◽  
Vol 46 (1) ◽  
pp. 55-69 ◽  
Author(s):  
Veronica B. Searles Quick ◽  
Belinda Wang ◽  
Matthew W. State

Abstract: “Big data” approaches in the form of large-scale human genomic studies have led to striking advances in autism spectrum disorder (ASD) genetics. As in many other psychiatric syndromes, advances in genotyping technology, allowing for inexpensive genome-wide assays, have confirmed the contribution of polygenic inheritance involving common alleles of small effect, a handful of which have now been definitively identified. However, the past decade of gene discovery in ASD has been most notable for the application, in large family-based cohorts, of high-density microarray studies of submicroscopic chromosomal structure as well as high-throughput DNA sequencing, leading to the identification of an increasingly long list of risk regions and genes disrupted by rare, de novo germline mutations of large effect. This genomic architecture offers particular advantages for the illumination of biological mechanisms but also presents distinctive challenges. While the tremendous locus heterogeneity and functional pleiotropy associated with the more than 100 identified ASD-risk genes and regions are daunting, a growing armamentarium of comprehensive, large, foundational -omics databases, across species and capturing developmental trajectories, is increasingly contributing to a deeper understanding of ASD pathology.



2020 ◽  
Author(s):  
Salvador Guardiola ◽  
Monica Varese ◽  
Xavier Roig ◽  
Jesús Garcia ◽  
Ernest Giralt

NOTE: This preprint has been retracted by consensus from all authors. See the retraction notice in place above; the original text can be found under "Version 1", accessible from the version selector above.

Peptides, together with antibodies, are among the most potent biochemical tools to modulate challenging protein-protein interactions. However, current structure-based methods are largely limited to natural peptides and are not suitable for designing target-specific binders with improved pharmaceutical properties, such as macrocyclic peptides. Here we report a general framework that leverages the computational power of Rosetta for large-scale backbone sampling and energy scoring, followed by side-chain composition, to design heterochiral cyclic peptides that bind to a protein surface of interest. To showcase the applicability of our approach, we identified two peptides (PD-i3 and PD-i6) that target PD-1, a key immune checkpoint, and work as protein ligand decoys. A comprehensive biophysical evaluation confirmed their binding mechanism to PD-1 and their inhibitory effect on the PD-1/PD-L1 interaction. Finally, elucidation of their solution structures by NMR served as validation of our de novo design approach. We anticipate that our results will provide a general framework for designing target-specific drug-like peptides.



2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Brent S. Pedersen ◽  
Joe M. Brown ◽  
Harriet Dashnow ◽  
Amelia D. Wallace ◽  
Matt Velinder ◽  
...  

Abstract: In studies of families with rare disease, it is common to screen for de novo mutations, as well as recessive or dominant variants that explain the phenotype. However, the filtering strategies and software used to prioritize high-confidence variants vary from study to study. In an effort to establish recommendations for rare disease research, we explore effective guidelines for variant (SNP and INDEL) filtering and report the expected number of candidates for de novo dominant, recessive, and autosomal dominant modes of inheritance. We derived these guidelines using two large family-based cohorts that underwent whole-genome sequencing, as well as two family cohorts with whole-exome sequencing. The filters are applied to common attributes, including genotype quality, sequencing depth, allele balance, and population allele frequency. The resulting guidelines yield ~10 candidate SNP and INDEL variants per exome, and 18 per genome for recessive and de novo dominant modes of inheritance, with substantially more candidates for autosomal dominant inheritance. For family-based, whole-genome sequencing studies, this number includes an average of three de novo, ten compound heterozygous, one autosomal recessive, four X-linked variants, and roughly 100 candidate variants following autosomal dominant inheritance. The slivar software we developed to establish and rapidly apply these filters to VCF files is available at https://github.com/brentp/slivar under an MIT license, and includes documentation and recommendations for best practices for rare disease analysis.
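The kind of filter described above can be sketched in a few lines of Python. This is not slivar and does not use its expression syntax. The `Sample` class, the function names, and every threshold below are hypothetical placeholders chosen for illustration, not the values recommended in the paper.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    alts: int   # alternate alleles in the genotype call (0, 1, or 2)
    gq: int     # genotype quality
    dp: int     # sequencing depth at the site
    ab: float   # allele balance: alt-supporting reads / total reads


# Illustrative thresholds only (assumptions, not the paper's recommendations).
MIN_GQ = 20
MIN_DP = 10
HET_AB_RANGE = (0.2, 0.8)   # plausible allele balance for a true heterozygote
MAX_REF_AB = 0.02           # parents should show almost no alt-supporting reads
MAX_POP_AF = 0.001          # variant must be rare in the reference population


def passes_quality(sample: Sample) -> bool:
    """Require minimum genotype quality and depth for a trustworthy call."""
    return sample.gq >= MIN_GQ and sample.dp >= MIN_DP


def is_de_novo_candidate(kid: Sample, mom: Sample, dad: Sample, pop_af: float) -> bool:
    """Flag a site where the child is heterozygous and both parents are reference."""
    if pop_af > MAX_POP_AF:
        return False
    if not all(passes_quality(s) for s in (kid, mom, dad)):
        return False
    low, high = HET_AB_RANGE
    kid_is_het = kid.alts == 1 and low <= kid.ab <= high
    parents_are_ref = all(p.alts == 0 and p.ab <= MAX_REF_AB for p in (mom, dad))
    return kid_is_het and parents_are_ref
```

Checking allele balance in the parents as well as the child is the notable design choice here: a parent called homozygous reference but showing a modest fraction of alternate reads suggests an inherited variant that was miscalled, not a true de novo event.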



2020 ◽  
Vol 12 (1) ◽  
Author(s):  
Ahmad N. Abou Tayoun ◽  
Heidi L. Rehm

Abstract: We highlight the current lack of representation of the Middle East in large genomic studies and emphasize the expected high impact of cataloging its variation. We discuss the limiting factors and possible solutions to generating and accessing research and clinical sequencing data from this part of the world.



2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Daniel E. Runcie ◽  
Jiayi Qu ◽  
Hao Cheng ◽  
Lorin Crawford

Abstract: Large-scale phenotype data can enhance the power of genomic prediction in plant and animal breeding, as well as human genetics. However, the statistical foundation of multi-trait genomic prediction is based on the multivariate linear mixed effect model, a tool notorious for its fragility when applied to more than a handful of traits. We present , a statistical framework and associated software package for mixed model analyses of a virtually unlimited number of traits. Using three examples with real plant data, we show that can leverage thousands of traits at once to significantly improve genetic value prediction accuracy.
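For context, one common textbook way to write a multivariate linear mixed model for $n$ individuals and $t$ traits is given below. This is a generic form and not necessarily the parameterization the authors use. The symbols $\mathbf{K}$ (relatedness among the $q$ random-effect levels), $\mathbf{G}$ and $\mathbf{R}$ (the $t \times t$ genetic and residual trait covariances) are notation introduced here.

```latex
\mathbf{Y} = \mathbf{X}\mathbf{B} + \mathbf{Z}\mathbf{U} + \mathbf{E},
\qquad
\mathbf{U} \sim \mathcal{MN}_{q \times t}\!\left(\mathbf{0},\, \mathbf{K},\, \mathbf{G}\right),
\qquad
\mathbf{E} \sim \mathcal{MN}_{n \times t}\!\left(\mathbf{0},\, \mathbf{I}_n,\, \mathbf{R}\right)
% Free covariance parameters in an unstructured model:
% G and R each contribute t(t+1)/2, so the total is t(t+1).
```

In an unstructured model, $\mathbf{G}$ and $\mathbf{R}$ each carry $t(t+1)/2$ free parameters. With $t = 1000$ traits that is 500,500 parameters per matrix, which gives a sense of why the unstructured model becomes hard to fit beyond a handful of traits.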



2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Daniel Stribling ◽  
Peter L. Chang ◽  
Justin E. Dalton ◽  
Christopher A. Conow ◽  
Malcolm Rosenthal ◽  
...  

Abstract
Objectives: Arachnids have fascinating and unique biology, particularly for questions on sex differences and behavior, creating the potential for development of powerful emerging models in this group. Recent advances in genomic techniques have paved the way for a significant increase in the breadth of genomic studies in non-model organisms. One growing area of research is comparative transcriptomics. When phylogenetic relationships to model organisms are known, comparative genomic studies provide context for analysis of homologous genes and pathways. The goal of this study was to lay the groundwork for comparative transcriptomics of sex differences in the brain of wolf spiders, a non-model organism of the phylum Euarthropoda, by generating transcriptomes and analyzing gene expression.
Data description: To examine sex-differential gene expression, short read transcript sequencing and de novo transcriptome assembly were performed. Messenger RNA was isolated from brain tissue of male and female subadult and mature wolf spiders (Schizocosa ocreata). The raw data consist of sequences for the two different life stages in each sex. Computational analyses on these data include de novo transcriptome assembly and differential expression analyses. Sample-specific and combined transcriptomes, gene annotations, and differential expression results are described in this data note and are available from publicly available databases.



PEDIATRICS ◽  
1963 ◽  
Vol 32 (3) ◽  
pp. 344-346

Recommendations were made in view of the following facts: (1) the need for further information on the mechanisms involved in the phenotypic expressions of phenylketonuria; (2) the present lack of adequate data on the effectiveness of the Guthrie Inhibition Assay, in terms of number of cases which may be missed, factors making for positive determinations and providing other information on which to evaluate the appropriateness of the large-scale screening program proposed; (3) the undesirability of deploying inordinate resources in the evaluation of the Guthrie Inhibition Assay to the detriment of the needs of other areas of child health including phenylketonuria; (4) the indications that a multi-faceted approach to phenylketonuria would be productive, not only in resolving the problems involving this disorder but also as a model for the investigation of and application to the treatment of other genetic diseases; (5) the possibility that the Guthrie Inhibition Assay could be a useful tool in the early detection, treatment and investigation of phenylketonuria; and (6) the fact that other state health departments are participating in the Guthrie Field Trials, indicating that the California State Department of Public Health should apply its resources to a more intensive study of PKU and detection methods. The consultants made the following recommendations, through resolution, to the California State Department of Public Health. It was resolved that: 1. The State of California not be responsible at this time for initiating or recommending that the Guthrie procedure be accomplished on a state-wide basis in all newborn nurseries (one dissent). 2. The State of California initiate and coordinate the development of pilot studies in selected hospitals and medical centers throughout the State in the investigation of phenylketonuria, utilizing the Guthrie Inhibition Assay or other tests. 3. A scientific committee be appointed immediately as an advisory committee to the State Department of Public Health to develop recommendations for carrying out the suggested investigations. 4. A registry for phenylketonuria and other diseases (as listed in the recommendations by the Subcommittee on Human Genetics) be established within the framework of the State organization.


