Analysis of Brachypodium genomes with genome-wide optical maps

Genome ◽  
2018 ◽  
Vol 61 (8) ◽  
pp. 559-565 ◽  
Author(s):  
Tingting Zhu ◽  
Zhaorong Hu ◽  
Juan C. Rodriguez ◽  
Karin R. Deal ◽  
Jan Dvorak ◽  
...  

Brachypodium distachyon (n = 5) is a diploid and has been widely used as a genetic model. Brachypodium stacei (n = 10) and B. hybridum (n = 15) are species that are related to B. distachyon, leading to a hypothesis that they are part of a polyploid series based on x = 5. Several lines of evidence suggest that this hypothesis is incorrect and that the genomes of the three taxa may have evolved by a more complex process. We constructed an optical whole-genome BioNano genome (BNG) map for each species and performed pairwise alignments of the BNG maps. The maps showed that B. distachyon and B. stacei are both diploid, despite B. stacei having twice as many chromosomes as B. distachyon, and that B. hybridum is an allopolyploid formed from hybridization between B. distachyon and B. stacei. This study also demonstrated the use of BNG maps in the detection and quantification of structural variants among the genomes.

2021 ◽  
Author(s):  
Marsha M. Wheeler ◽  
Adrienne M Stilp ◽  
Shuquan Rao ◽  
Bjarni V Halldorsson ◽  
Doruk V Beyter ◽  
...  

Genome-wide association studies (GWAS) have identified thousands of single nucleotide variants and small indels that contribute to the genetic architecture of hematologic traits. While structural variants (SVs) are known to cause rare blood or hematopoietic disorders, the genome-wide contribution of SVs to quantitative blood cell trait variation is unknown. Here we utilized SVs detected from whole genome sequencing (WGS) in ancestrally diverse participants of the NHLBI TOPMed program (N=50,675). Using single variant tests, we assessed the association of common and rare SVs with red cell-, white cell-, and platelet-related quantitative traits. The results show 33 independent SVs (23 common and 10 rare) reaching genome-wide significance. The majority of significant association signals (N=27) replicated in independent datasets from deCODE genetics and the UK Biobank. Moreover, most trait-associated SVs (N=24) are within 1 Mb of previously reported GWAS loci. SV analyses additionally discovered an association between a complex structural variant on 17p11.2 and white blood cell-related phenotypes. Based on functional annotation, the majority of significant SVs are located in non-coding regions (N=26) and are predicted to impact regulatory elements and/or local chromatin domain boundaries in blood cells. We predict that several trait-associated SVs represent the causal variant. This is supported by genome-editing experiments, which provide evidence that a deletion associated with lower monocyte counts leads to disruption of an S1PR3 monocyte enhancer and decreased S1PR3 expression.


2018 ◽  
Vol 13 (5) ◽  
pp. 536-552 ◽  
Author(s):  
Ankush Ashok Saddhe ◽  
Shweta ◽  
Kareem A. Mosa ◽  
Kundan Kumar ◽  
Manoj Prasad ◽  
...  

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Kingshuk Mukherjee ◽  
Massimiliano Rossi ◽  
Leena Salmela ◽  
Christina Boucher

Genome-wide optical maps are high-resolution restriction maps that give a unique numeric representation to a genome. They are produced by assembling hundreds of thousands of single-molecule optical maps, which are called Rmaps. Unfortunately, there are very few choices for assembling Rmap data. There exists only one publicly available non-proprietary method for assembly and one proprietary software that is available via an executable. Furthermore, the publicly available method, by Valouev et al. (Proc Natl Acad Sci USA 103(43):15770–15775, 2006), follows the overlap-layout-consensus (OLC) paradigm and therefore is unable to scale to relatively large genomes. The algorithm behind the proprietary method, Bionano Genomics’ Solve, is largely unknown. In this paper, we extend the definition of bi-labels in the paired de Bruijn graph to the context of optical mapping data, and present the first de Bruijn graph-based method for Rmap assembly. We implement our approach, which we refer to as rmapper, and compare its performance against the assembler of Valouev et al. (Proc Natl Acad Sci USA 103(43):15770–15775, 2006) and Solve by Bionano Genomics on data from three genomes: E. coli, human, and climbing perch fish (Anabas testudineus). Our method was able to successfully run on all three genomes. The method of Valouev et al. (Proc Natl Acad Sci USA 103(43):15770–15775, 2006) only successfully ran on E. coli. Moreover, on the human genome rmapper was at least 130 times faster than Bionano Solve, used five times less memory, and produced the highest genome fraction with zero mis-assemblies. Our software, rmapper, is written in C++ and is publicly available under the GNU General Public License at https://github.com/kingufl/Rmapper.
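As a rough illustration of what "numeric representation" means here, the sketch below turns a DNA string into an ordered list of fragment lengths by cutting at every occurrence of a recognition site. It is a toy in-silico digest, not the rmapper algorithm and not how Bionano data are produced: the function name `digest`, the example sequence, and the choice to cut at the first base of the site are all assumptions made for illustration, and real Rmaps carry sizing errors and missing or spurious cut sites that this ignores.

```python
def digest(sequence: str, site: str) -> list[int]:
    """Return ordered fragment lengths from cutting `sequence` at each `site`.

    Assumption for illustration: the cut falls at the first base of the
    recognition site; real enzymes cut at enzyme-specific offsets.
    """
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx)
        start = idx + 1  # advance by one so overlapping sites are also found

    # Fragment boundaries: sequence start, each cut position, sequence end.
    bounds = [0] + cuts + [len(sequence)]
    # Drop zero-length fragments (e.g. a site at position 0).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    # Hypothetical 20-base sequence with two GAATTC sites.
    toy = "AAGAATTCTTTTGAATTCCC"
    print(digest(toy, "GAATTC"))
```

An assembler for such data works on these ordered length lists rather than on nucleotide strings, which is why sequence-oriented tools cannot be applied directly.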


Genes ◽  
2020 ◽  
Vol 11 (10) ◽  
pp. 1154
Author(s):  
Min Jeong Hong ◽  
Jin-Baek Kim ◽  
Yong Weon Seo ◽  
Dae Yeon Kim

Genes of the F-box family play specific roles in protein degradation by post-translational modification in several biological processes, including flowering, the regulation of circadian rhythms, photomorphogenesis, seed development, leaf senescence, and hormone signaling. F-box genes have not been previously investigated on a genome-wide scale; however, the establishment of the wheat (Triticum aestivum L.) reference genome sequence enabled a genome-based examination of the F-box genes to be conducted in the present study. In total, 1796 F-box genes were detected in the wheat genome and classified into various subgroups based on their functional C-terminal domain. The F-box genes were distributed among 21 chromosomes and most showed high sequence homology with F-box genes located on the homoeologous chromosomes because of allohexaploidy in the wheat genome. Additionally, a synteny analysis of wheat F-box genes was conducted in rice and Brachypodium distachyon. Transcriptome analysis during various wheat developmental stages and expression analysis by quantitative real-time PCR revealed that some F-box genes were specifically expressed in the vegetative and/or seed developmental stages. A genome-based examination and classification of F-box genes provide an opportunity to elucidate the biological functions of F-box genes in wheat.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Pierpaolo Maisano Delser ◽  
Eppie R. Jones ◽  
Anahit Hovhannisyan ◽  
Lara Cassidy ◽  
Ron Pinhasi ◽  
...  

Over the last few years, genome-wide data for a large number of ancient human samples have been collected. Whilst datasets of captured SNPs have been collated, high coverage shotgun genomes (which are relatively few but allow certain types of analyses not possible with ascertained captured SNPs) have to be reprocessed by individual groups from raw reads. This task is computationally intensive. Here, we release a dataset including 35 whole-genome sequenced samples, previously published and distributed worldwide, together with the genetic pipeline used to process them. The dataset contains 72,041,355 sites called across 19 ancient and 16 modern individuals and includes sequence data from four previously published ancient samples which we sequenced to higher coverage (10–18x). Such a resource will allow researchers to analyse their new samples with the same genetic pipeline and directly compare them to the reference dataset without re-processing published samples. Moreover, this dataset can be easily expanded to increase the sample distribution both across time and space.


2015 ◽  
Vol 117 (suppl_1) ◽  
Author(s):  
Matthew Wheeler ◽  
Daryl Waggott ◽  
Megan Grove ◽  
Frederick Dewey ◽  
Cuiping Pan ◽  
...  

Background: Technological advances have greatly reduced the cost of whole genome sequencing. For single individuals, clinical application is apparent, while exome sequencing in tens of thousands of people has allowed a more global view of genetic variation that can inform interpretation of specific variants in individuals. We hypothesized that genome sequencing of patients with monogenic cardiomyopathy would facilitate discovery of genetic modifiers of phenotype. Methods and Results: We identified 48 individuals diagnosed with cardiomyopathy and with putative mutations in MYH7, the gene encoding beta myosin heavy chain. We carried out whole genome sequencing and applied a newly developed analytical pipeline optimized for discovery of genes modifying severity of clinical presentation and outcomes. Using a combination of external priors and rare variant burden tests, we scored genes as potential modifiers. There were 96 genes that reached a modifier score of 6 out of 12 or better (9=2, 8=8, 7=17, 6=69). We identified NCKAP1, a gene that regulates actin filament dynamics, and CAMSAP1, a calmodulin-regulated gene that regulates microtubule dynamics, as top scoring modifiers of hypertrophic cardiomyopathy phenotypes (score=9), while LDB2, RYR2, FBN1 and ATP1A2 had modifier scores of 8. Of the top scoring genes, 21 out of 96 were identified as candidates a priori. Our candidate prioritization scheme identified the previously described modifiers of cardiomyopathy phenotype, FHOD3 and MYBPC3, as top scoring genes. We identified structural variants in 21 clinically sequenced cardiomyopathy-associated genes, 13 of which were at less than 10% frequency. Copy number variants in ILK and CSRP3 were nominally associated with ejection fraction (p=0.03), while 8 genes showed copy gains (GLA, FKTN, SGCD, TTN, SOS1, ANKRD1, VCL and NEBL). Structural variants were found in CSRP3, MYL3 and TNNC1, all of which have been implicated as causative for HCM.
Conclusion: Evaluation of the whole genome sequence, even in the case of putatively monogenic disease, leads to important diagnostic and scientific insights not revealed by panel-based sequencing.


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Gabriel Costa Monteiro Moreira ◽  
Clarissa Boschiero ◽  
Aline Silva Mello Cesar ◽  
James M. Reecy ◽  
Thaís Fernanda Godoy ◽  
...  

2020 ◽  
Author(s):  
Benjamin I Laufer ◽  
Hyeyeon Hwang ◽  
Julia M Jianu ◽  
Charles E Mordaunt ◽  
Ian F Korf ◽  
...  

Neonatal dried blood spots (NDBS) are a widely banked sample source that enables retrospective investigation into early life molecular events. Here, we performed low-pass whole genome bisulfite sequencing (WGBS) of 86 NDBS DNA samples to examine early life Down syndrome (DS) DNA methylation profiles. DS represents an example of genetics shaping epigenetics, as multiple array-based studies have demonstrated that trisomy 21 is characterized by genome-wide alterations to DNA methylation. By assaying over 24 million CpG sites, thousands of genome-wide significant (q < 0.05) differentially methylated regions (DMRs) that distinguished DS from typical development and idiopathic developmental delay were identified. Machine learning feature selection refined these DMRs to 22 loci. The DS DMRs mapped to genes involved in neurodevelopment, metabolism, and transcriptional regulation. Based on comparisons with previous DS methylation studies and reference epigenomes, the hypermethylated DS DMRs were significantly (q < 0.05) enriched across tissues while the hypomethylated DS DMRs were significantly (q < 0.05) enriched for blood-specific chromatin states. A ~28 kb block of hypermethylation was observed on chromosome 21 in the RUNX1 locus, which encodes a hematopoietic transcription factor whose binding motif was the most significantly enriched (q < 0.05) overall and specifically within the hypomethylated DMRs. Finally, we also identified DMRs that distinguished DS NDBS based on the presence or absence of congenital heart disease (CHD). Together, these results not only demonstrate the utility of low-pass WGBS on NDBS samples for epigenome-wide association studies, but also provide new insights into the early life mechanisms of epigenomic dysregulation resulting from trisomy 21.

