Reference exome data for a Northern Brazilian population

2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Alexia L. Weeks ◽  
Richard W. Francis ◽  
Joao I. C. F. Neri ◽  
Nathaly M. C. Costa ◽  
Nivea M. R. Arrais ◽  
...  

Abstract Exome sequencing is widely used in the diagnosis of rare genetic diseases and provides useful variant data for analysis of complex diseases. There is not always adequate population-specific reference data to assist in assigning a diagnostic variant to a specific clinical condition. Here we provide a catalogue of variants called after sequencing the exomes of 45 babies from Rio Grande do Norte in Brazil. Sequence data were processed using an ‘intersect-then-combine’ (ITC) approach, using GATK and SAMtools to call variants. A total of 612,761 variants were identified in at least one individual in this Brazilian cohort, including 559,448 single nucleotide variants (SNVs) and 53,313 insertion/deletions. Of these, 58,111 overlapped with nonsynonymous (nsSNVs) or splice site (ssSNVs) SNVs in dbNSFP. As an aid to clinical diagnosis of rare diseases, we used the American College of Medical Genetics and Genomics (ACMG) guidelines to assign pathogenic/likely pathogenic status to 185 (0.32%) of the 58,111 nsSNVs and ssSNVs. Our data set provides a useful reference point for diagnosis of rare diseases in Brazil.
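The 'intersect-then-combine' step can be pictured with a small sketch. This is one plausible reading, not the authors' pipeline: for each sample, keep only the variants reported by both callers, then pool the agreed variants across samples. The function name, the variant tuple layout, and the toy calls are all hypothetical. The last lines simply check that the counts quoted in the abstract are internally consistent.

```python
from typing import Dict, Set, Tuple

# A variant is identified by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def intersect_then_combine(
    calls_a: Dict[str, Set[Variant]],
    calls_b: Dict[str, Set[Variant]],
) -> Set[Variant]:
    """Keep, per sample, only variants both callers agree on, then pool them.

    Assumption: this is one reading of 'intersect-then-combine'; the
    published pipeline may differ in detail.
    """
    combined: Set[Variant] = set()
    for sample in calls_a.keys() & calls_b.keys():
        combined |= calls_a[sample] & calls_b[sample]
    return combined


# Hypothetical toy calls for two samples from two callers.
gatk = {
    "s1": {("1", 100, "A", "G"), ("2", 200, "C", "T")},
    "s2": {("1", 100, "A", "G"), ("3", 300, "G", "GA")},
}
samtools = {
    "s1": {("1", 100, "A", "G")},
    "s2": {("1", 100, "A", "G"), ("3", 300, "G", "GA"), ("4", 400, "T", "C")},
}
catalogue = intersect_then_combine(gatk, samtools)

# Consistency check on the totals reported in the abstract.
snvs, indels, total = 559_448, 53_313, 612_761
assert snvs + indels == total
pathogenic_fraction = 185 / 58_111  # about 0.32%
```

Variants seen by only one caller, such as the chromosome 2 and chromosome 4 calls in the toy data, drop out of the catalogue under this reading.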

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Yavor K. Bozhilov ◽  
Damien J. Downes ◽  
Jelena Telenius ◽  
A. Marieke Oudelaar ◽  
Emmanuel N. Olivier ◽  
...  

Abstract Many single nucleotide variants (SNVs) associated with human traits and genetic diseases are thought to alter the activity of existing regulatory elements. Some SNVs may also create entirely new regulatory elements which change gene expression, but the mechanism by which they do so is largely unknown. Here we show that a single base change in an otherwise unremarkable region of the human α-globin cluster creates an entirely new promoter and an associated unidirectional transcript. This SNV downregulates α-globin expression causing α-thalassaemia. Of note, the new promoter lying between the α-globin genes and their associated super-enhancer disrupts their interaction in an orientation-dependent manner. Together these observations show how both the order and orientation of the fundamental elements of the genome determine patterns of gene expression and support the concept that active genes may act to disrupt enhancer-promoter interactions in mammals as in Drosophila. Finally, these findings should prompt others to fully evaluate SNVs lying outside of known regulatory elements as causing changes in gene expression by creating new regulatory elements.


2021 ◽  
Author(s):  
Maria Koromina ◽  
Vasileios Fanaras ◽  
Gareth Baynam ◽  
Christina Mitropoulou ◽  
George P Patrinos

Rapid advances in next-generation sequencing technology, particularly whole exome sequencing and whole genome sequencing, have greatly affected our understanding of genetic variation underlying rare genetic diseases. Herein, we describe ethical principles of guiding consent and sharing of genomics research data. We also discuss ethical dilemmas in rare diseases research and patient recruitment policies and address bioethical and societal aspects influencing the ethical framework for genetic testing. Moreover, we focus on addressing ethical issues surrounding research in low- and middle-income countries. Overall, this perspective aims to address key aspects and issues for building proper ethical frameworks, when conducting research involving genomics data with a particular emphasis on rare diseases and genetics testing.


Blood ◽  
2015 ◽  
Vol 126 (23) ◽  
pp. 4207-4207
Author(s):  
Brian S White ◽  
Irena Lanc ◽  
Daniel Auclair ◽  
Robert Fulton ◽  
Mark A Fiala ◽  
...  

Abstract Background: Multiple myeloma (MM) is a hematologic cancer characterized by a diversity of genetic lesions: translocations, copy number alterations (CNAs), and single nucleotide variants (SNVs). The prognostic value of translocations and of CNAs has been well established. Determining the clinical significance of SNVs, which are recurrently mutated at much lower frequencies, and how this significance is impacted by translocations and CNAs requires additional, large-scale correlative studies. Such studies can be facilitated by cost-effective targeted sequencing approaches. Hence, we designed a single-platform targeted sequencing approach capable of detecting all three variant types. Methods: We designed oligonucleotide probes complementary to the coding regions of 467 genes and to the IgH and MYC loci, allowing a probe to closely match at most 5 regions within the genome. Genes were selected if they were expressed in an independent RNA-seq MM data set and harbored germline SNP-filtered variants that: (1) occurred with frequency >3%, (2) were clustered in hotspots, (3) occurred in recurrently mutated "cancer genes" (as annotated in COSMIC or MutSig), or (4) occurred in genes involved in DNA repair and/or B-cell biology. IgH and MYC tiling was unbiased (with respect to annotated features within the loci) and spanned from 50 kilobasepairs (kbps) upstream of both regions to 50 kbps downstream of IgH and 100 kbps downstream of MYC. Results: We performed targeted sequencing of 96 CD138-enriched samples derived from MM patients, as well as matched peripheral blood leukocyte normal controls. Sequencing depth (mean 107X) was commensurate with that of available exome sequencing data from these samples (mean 71X). Samples harbored a mean of 25 non-silent variants, including those in known MM-associated genes: NRAS (24%), KRAS (22%), FAM46C (17%), TP53 (10%), DIS3 (8%), and BRAF (3%). Variants detected by both platforms showed a strong correlation (r^2 = 0.8).
The capture array detected activating, oncogenic variants in NRAS Q61K (n=3 patients) and KRAS G12C/D/R/V (n=5) that were not detected in exome data. Additionally, we found non-silent, capture-specific variants in MTOR (3%) and in two transcription-related genes that have been previously implicated in cancer: ZFHX4 (5%) and CHD3 (5%). To assess the potential role of deep subclonal variants and our ability to detect them, we performed additional sequencing (mean 565X) on six of the tumor/normal pairs. This revealed 14 manually-reviewed, non-silent variants that were not detected by the initial targeted sequencing. These had a mean variant allele frequency of 2.8% and included mutations in DNMT3A and FAM46C. At least one of these 14 variants occurred in five of the six re-sequenced samples. This highlights the importance of this additional depth, which will be used in future studies. Our approach successfully detected CNAs near expected frequencies, including hyperdiploidy (52%), del(13) (43%), and gain of 1q (35%). Similarly, it inferred IgH translocations at expected frequencies: t(4;14) (14%), t(6;14) (3%), t(11;14) (15%), and t(14;20) (1%). As expected, translocations occur predominantly within the IgH constant region, but also frequently 5' (i.e., telomeric) of the IGHM switch region, and occasionally within the V and D regions. We detected MYC-associated translocations, whose frequencies have been the subject of debate, at 10% (n=9 patients), with five involving IgH, three having both partners in or near MYC, and one having both types. Finally, our platform detected novel IgH translocations with partners near DERL3 (n=2), MYCN (n=1), and FLT3 (n=1). Additional evidence suggests that DERL3 and MYCN may be targets of IgH-induced overexpression: of 84 RNA-seq patient samples, six exhibited outlying expression of DERL3, including one sample in which we detected the translocation in corresponding DNA, and one exhibited outlying expression of MYCN.
Conclusion: Our MM-specific targeted sequencing strategy is capable of detecting deeply subclonal SNVs, in addition to CNAs and IgH and MYC translocations. Though additional validation is required, particularly with respect to translocation detection, we anticipate that such technology will soon enable clinical testing on a single sequencing platform. Disclosures Vij: Celgene, Onyx, Takeda, Novartis, BMS, Sanofi, Janssen, Merck: Consultancy; Takeda, Onyx: Research Funding.
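A short calculation helps show why the extra depth matters for the subclonal variants described above. The sketch below assumes the usual definition of variant allele frequency (variant-supporting reads divided by total reads at the site) and treats the reported mean depths as if they applied at a single site, which is a simplification. The function names are hypothetical.

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def expected_alt_reads(vaf: float, depth: float) -> float:
    """Expected number of variant-supporting reads at a given depth."""
    return vaf * depth


# Depths and the 2.8% mean VAF come from the abstract; treating mean depth
# as the depth at a single site is a simplifying assumption.
at_initial_depth = expected_alt_reads(0.028, 107)  # roughly 3 reads
at_deep_depth = expected_alt_reads(0.028, 565)     # roughly 16 reads
```

At about 3 supporting reads a variant is hard to separate from sequencing error, whereas about 16 supporting reads gives a caller considerably more evidence to work with.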


Author(s):  
S. G. Vorsanova ◽  
Yu. B. Yurov ◽  
V. Yu. Voinova ◽  
I. Yu. Yurov

This review presents the theoretical, practical and geographical aspects of Rett syndrome and other rare diseases, according to the data of the last VIII International Congress in Russia, and the main publications on Rett syndrome. The issues highlighted by the participants remain relevant and determine the direction of modern studies. The presentations made at the symposium helped to form a global concept of the molecular and cellular mechanisms of Rett syndrome and a number of rare genetic/genomic diseases. The article presents a number of domestic findings in the field of Rett syndrome and other rare diseases. The authors also present information on rare diseases associated with the Rett-like phenotype or with mutations/variations of the MECP2 gene sequence copies. The authors consider the identified chromosomal (genomic) disorders/diseases in the context of rare diseases. This approach to analysing Rett syndrome studies is quite new in research practice worldwide. We hope this review will be valuable not only for specialists in the field of rare genetic diseases, but also for the scientists and clinicians studying Rett syndrome and for physicians (pediatricians, geneticists, neurologists, psychiatrists) who meet these patients in their practice.


2021 ◽  
Author(s):  
Kishwar Shafin ◽  
Trevor Pesout ◽  
Pi-Chuan Chang ◽  
Maria Nattestad ◽  
Alexey Kolesnikov ◽  
...  

Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data has demonstrated a long read length, but current interpretation methods for its novel pore-based signal have unique error profiles, making accurate analysis challenging. Here, we introduce a haplotype-aware variant calling pipeline, PEPPER-Margin-DeepVariant, that produces state-of-the-art variant calling results with nanopore data. We show that our nanopore-based method outperforms the short-read-based single nucleotide variant identification method at the whole-genome scale and produces high-quality single nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails. We show that our pipeline can provide highly contiguous phase blocks across the genome with nanopore reads, contiguously spanning between 85% and 92% of annotated genes across six samples. We also extend PEPPER-Margin-DeepVariant to PacBio HiFi data, providing an efficient solution with better performance than the current WhatsHap-DeepVariant standard. Finally, we demonstrate de novo assembly polishing methods that use nanopore and PacBio HiFi reads to produce diploid assemblies with high accuracy (Q35+ nanopore-polished and Q40+ PacBio-HiFi-polished).
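The "genes contiguously spanned by a phase block" figure can be illustrated with a minimal sketch. It assumes a gene counts as spanned when it lies entirely inside a single phase block; the abstract does not spell out the exact criterion, so this is an assumption, and the coordinates and function name are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) on one chromosome


def fraction_genes_spanned(genes: List[Interval], blocks: List[Interval]) -> float:
    """Share of genes that lie entirely inside a single phase block."""
    if not genes:
        return 0.0
    spanned = sum(
        1
        for g_start, g_end in genes
        if any(b_start <= g_start and g_end <= b_end for b_start, b_end in blocks)
    )
    return spanned / len(genes)


# Hypothetical coordinates, for illustration only.
genes = [(100, 500), (900, 1500), (2000, 2600), (3000, 3200)]
blocks = [(0, 1000), (1000, 2700), (2900, 3500)]
fraction = fraction_genes_spanned(genes, blocks)
```

In the toy data the second gene straddles a phase block boundary, so it is not counted even though both neighbouring blocks cover part of it.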


2021 ◽  
Author(s):  
Jian Yang ◽  
Cong Dong ◽  
Huilong Duan ◽  
Qiang Shu ◽  
Haomin Li

Abstract Background: The complexity of the phenotypic characteristics and molecular bases of many rare human genetic diseases makes the diagnosis of such diseases a challenge for clinicians. A map for visualizing, locating and navigating rare diseases based on similarity will help clinicians and researchers understand and easily explore these diseases. Methods: A distance matrix of rare diseases included in Orphanet was measured by calculating the quantitative distance among phenotypes and pathogenic genes based on Human Phenotype Ontology (HPO) and Gene Ontology (GO), and each disease was mapped into Euclidean space. A rare disease map, enhanced by clustering classes and disease information, was developed based on ECharts. Results: A rare disease map called RDmap was published at http://rdmap.nbscn.org. A total of 3,287 rare diseases are included in the phenotype-based map, and 3,789 rare genetic diseases are included in the gene-based map; 1,718 overlapping diseases are connected between the two maps. RDmap works similarly to the widely used Google Maps service and supports zooming and panning. The phenotype-similarity-based disease location function performed better than traditional keyword searches in an in silico evaluation, and 20 published cases of rare diseases also demonstrated that RDmap can assist clinicians in seeking a rare disease diagnosis. Conclusion: RDmap is the first user-interactive map-style rare disease knowledgebase. It will help clinicians and researchers explore the increasingly complicated realm of rare genetic diseases.
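As a rough illustration of the distance-matrix step, the sketch below computes pairwise distances between diseases from their sets of phenotype terms. It swaps in plain Jaccard distance for simplicity; the abstract describes an ontology-based quantitative distance, which this does not reproduce. The disease names and term labels are placeholders, not real Orphanet or HPO identifiers.

```python
from typing import Dict, List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """One minus (shared terms / all terms); two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(profiles: Dict[str, Set[str]]) -> List[List[float]]:
    """Square matrix of pairwise distances, rows and columns in sorted name order."""
    names = sorted(profiles)
    return [[jaccard_distance(profiles[x], profiles[y]) for y in names] for x in names]


# Placeholder phenotype term sets, for illustration only.
profiles = {
    "disease_1": {"T1", "T2", "T3"},
    "disease_2": {"T2", "T3", "T4"},
    "disease_3": {"T5"},
}
matrix = distance_matrix(profiles)
```

A matrix like this is the usual input to an embedding method such as multidimensional scaling, which would place each disease at a point in Euclidean space so that similar diseases sit close together.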


2019 ◽  
Vol 36 (7) ◽  
pp. 2295-2297
Author(s):  
Christina Nieuwoudt ◽  
Angela Brooks-Wilson ◽  
Jinko Graham

Abstract Summary We present the R package SimRVSequences to simulate sequence data for pedigrees. SimRVSequences allows for simulations of large numbers of single-nucleotide variants (SNVs) and scales well with increasing numbers of pedigrees. Users provide a sample of pedigrees and SNV data from a sample of unrelated individuals. Availability and implementation SimRVSequences is publicly available on CRAN at https://cran.r-project.org/web/packages/SimRVSequences/. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 13 (1) ◽  
Author(s):  
Sridhar Sivasubbu ◽  
Vinod Scaria

Abstract Home to a culturally heterogeneous population, India is also a melting pot of genetic diversity. The population architecture, characterized by multiple endogamous groups with specific marriage patterns, including the widely prevalent practice of consanguinity, not only makes the Indian population distinct from the rest of the world but also provides a unique advantage and niche to understand genetic diseases. Centuries of genetic isolation of population groups have amplified the founder effects, contributing to a high prevalence of recessive alleles, which translates into genetic diseases, including rare genetic diseases in India. Rare genetic diseases are becoming a public health concern in India because a large population size of close to a billion people would essentially translate to a huge disease burden for even the rarest of the rare diseases. Genomics-based approaches have been demonstrated to accelerate the diagnosis of rare genetic diseases and reduce the socio-economic burden. The Genomics for Understanding Rare Diseases: India Alliance Network (GUaRDIAN) stands for providing genomic solutions for rare diseases in India. The consortium aims to establish a unique collaborative framework in health care planning, implementation, and delivery in the specific area of rare genetic diseases. It is a nation-wide collaborative research initiative catering to rare diseases across multiple cohorts, with over 240 clinician/scientist collaborators across 70 major medical/research centers. Within the GUaRDIAN framework, clinicians refer rare disease patients and generate whole genome or exome datasets, followed by computational analysis of the data to identify the causal pathogenic variations. The outcomes of GUaRDIAN are being translated as community services through a suitable platform providing low-cost diagnostic assays in India.
In addition to GUaRDIAN, several genomic investigations of diseased and healthy populations are being undertaken in the country to solve the rare disease dilemma. In summary, rare diseases contribute to a significant disease burden in India. Genomics-based solutions can enable accelerated diagnosis and management of rare diseases. We discuss how a collaborative research initiative such as GUaRDIAN can provide a nation-wide framework to cater to the rare disease community of India.


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Jian Yang ◽  
Cong Dong ◽  
Huilong Duan ◽  
Qiang Shu ◽  
Haomin Li

Abstract Background The complexity of the phenotypic characteristics and molecular bases of many rare human genetic diseases makes the diagnosis of such diseases a challenge for clinicians. A map for visualizing, locating and navigating rare diseases based on similarity will help clinicians and researchers understand and easily explore these diseases. Methods A distance matrix of rare diseases included in Orphanet was measured by calculating the quantitative distance among phenotypes and pathogenic genes based on Human Phenotype Ontology (HPO) and Gene Ontology (GO), and each disease was mapped into Euclidean space. A rare disease map, enhanced by clustering classes and disease information, was developed based on ECharts. Results A rare disease map called RDmap was published at http://rdmap.nbscn.org. A total of 3287 rare diseases are included in the phenotype-based map, and 3789 rare genetic diseases are included in the gene-based map; 1718 overlapping diseases are connected between the two maps. RDmap works similarly to the widely used Google Maps service and supports zooming and panning. The phenotype-similarity-based disease location function performed better than traditional keyword searches in an in silico evaluation, and 20 published cases of rare diseases also demonstrated that RDmap can assist clinicians in seeking a rare disease diagnosis. Conclusion RDmap is the first user-interactive map-style rare disease knowledgebase. It will help clinicians and researchers explore the increasingly complicated realm of rare genetic diseases.


2021 ◽  
Vol 49 (5) ◽  
pp. 2835-2847
Author(s):  
Antto J Norppa ◽  
Mikko J Frilander

Abstract Disruption of minor spliceosome functions underlies several genetic diseases with mutations in the minor spliceosome-specific small nuclear RNAs (snRNAs) and proteins. Here, we define the molecular outcome of the U12 snRNA mutation (84C>U) resulting in an early-onset form of cerebellar ataxia. To understand the molecular consequences of the U12 snRNA mutation, we created cell lines harboring the 84C>T mutation in the U12 snRNA gene (RNU12). We show that the 84C>U mutation leads to accelerated decay of the snRNA, resulting in significantly reduced steady-state U12 snRNA levels. Additionally, the mutation leads to accumulation of 3′-truncated forms of U12 snRNA, which have undergone the cytoplasmic steps of snRNP biogenesis. Our data suggests that the 84C>U-mutant snRNA is targeted for decay following reimport into the nucleus, and that the U12 snRNA fragments are decay intermediates that result from the stalling of a 3′-to-5′ exonuclease. Finally, we show that several other single-nucleotide variants in the 3′ stem-loop of U12 snRNA that are segregating in the human population are also highly destabilizing. This suggests that the 3′ stem-loop is important for the overall stability of the U12 snRNA and that additional disease-causing mutations are likely to exist in this region.

