Norgal: extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data

2017 ◽  
Vol 18 (1) ◽  
Kosai Al-Nakeeb ◽  
Thomas Nordahl Petersen ◽  
Thomas Sicheritz-Pontén
2017 ◽  
Adriana Munoz ◽  
Boris Yamrom ◽  
Yoon-ha Lee ◽  
Peter Andrews ◽  
Steven Marks ◽  

Abstract Copy number profiling and whole-exome sequencing have allowed us to make remarkable progress in our understanding of the genetics of autism over the past ten years, but major aspects of the genetics remain unresolved. Through whole-genome sequencing, additional types of genetic variants can be observed. These variants are abundant, and determining which are functional is challenging. We have analyzed whole-genome sequencing data from 510 of the Simons Simplex Collection's quad families and focused our attention on intronic variants. Within the introns of 546 high-quality autism target genes, we identified 63 de novo indels in the affected and only 37 in the unaffected siblings. The difference of 26 events is significantly larger than expected (p-value = 0.01), and a reasonable extrapolation shows that de novo intronic indels can contribute to at least 10% of simplex autism. The significance increases if we restrict to the half of the autism targets that are intolerant to damaging variants in the normal human population, a half we expect to be even more enriched for autism genes. For these 273 targets we observe 43 and 20 events in affected and unaffected siblings, respectively (p-value = 0.005). There was no significant signal in the number of de novo intronic indels in any of the control sets of genes analyzed. We see no signal from de novo substitutions in the introns of target genes.
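
As a rough sanity check on the counts quoted above (63 vs. 37 events, and 43 vs. 20 in the restricted set), the sketch below runs a two-sided exact binomial test under the null hypothesis that a de novo intronic indel is equally likely to fall in the affected or the unaffected sibling. This is an assumption made for illustration: the abstract does not say which test the authors used, and the function name `binom_two_sided_p` is hypothetical.

```python
from math import comb


def binom_two_sided_p(k: int, n: int) -> float:
    """Two-sided exact binomial p-value for k successes in n trials, p = 0.5.

    Sums the probability of every outcome that is no more likely than the
    observed one. With p = 0.5 each outcome i has probability C(n, i) / 2**n,
    so the comparison can be done exactly on the integer binomial coefficients.
    """
    observed = comb(n, k)
    as_extreme = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return as_extreme / 2**n


if __name__ == "__main__":
    # Counts taken from the abstract: (events in affected, events in unaffected).
    for label, affected, unaffected in [
        ("546 target genes", 63, 37),
        ("273 intolerant targets", 43, 20),
    ]:
        total = affected + unaffected
        p = binom_two_sided_p(affected, total)
        print(f"{label}: {affected}/{total} in affected siblings, p = {p:.4f}")
```

The test treats each event as an independent coin flip, which ignores family structure and per-gene mutation rates, so it is best read as an order-of-magnitude check rather than a reproduction of the published analysis.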

Data in Brief ◽  
2019 ◽  
Vol 27 ◽  
pp. 104680 ◽  
Kit Yinn Teh ◽  
C.L.Wan Afifudeen ◽  
Ahmad Aziz ◽  
Li Lian Wong ◽  
Saw Hong Loh ◽  

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Marina Braun ◽  
Annika Lehmbecker ◽  
Deborah Eikelberg ◽  
Maren Hellige ◽  
Andreas Beineke ◽  

Abstract Background: Bovine frontonasal dysplasias like arhinencephaly, synophthalmia, cyclopia and anophthalmia are sporadic congenital facial malformations. In this study, computed tomography, necropsy, histopathological examinations and whole genome sequencing on an Illumina NextSeq500 were performed to characterize a stillborn Limousin calf with frontonasal dysplasia. In order to identify private genetic and structural variants, we screened whole genome sequencing data of the affected calf and unaffected relatives, including the parents and a maternal and a paternal half-sibling. Results: The stillborn calf exhibited severe craniofacial malformations. Nose and maxilla were absent, mandibles were upwardly curved and a median cleft palate was evident. Eyes, optic nerve and orbital cavities were not developed and the rudimentary orbita showed hypotelorism. A defect centrally in the front skull covered with a membrane extended into the intracranial cavity. Aprosencephaly affected telencephalic and diencephalic structures and cerebellum. In addition, a shortened tail was seen. Filtering whole genome sequencing data revealed a private frameshift variant within the candidate gene ZIC2 in the affected calf. This variant was heterozygous mutant in this case and homozygous wild type in parents, half-siblings and controls. Conclusions: We found a novel ZIC2 frameshift mutation in an aprosencephalic Limousin calf. The origin of this variant is most likely a de novo mutation in the germline of one parent or during very early embryonic development. To the authors’ best knowledge, this is the first identified mutation in cattle associated with bovine frontonasal dysplasia.

2020 ◽  
Vol 21 (1) ◽  
Jian-Jun Jin ◽  
Wen-Bin Yu ◽  
Jun-Bo Yang ◽  
Yu Song ◽  
Claude W. dePamphilis ◽  

Abstract GetOrganelle is a state-of-the-art toolkit to accurately assemble organelle genomes from whole genome sequencing data. It recruits organelle-associated reads using a modified “baiting and iterative mapping” approach, conducts de novo assembly, filters and disentangles the assembly graph, and produces all possible configurations of circular organelle genomes. For 50 published plant datasets, we are able to reassemble the circular plastomes from 47 datasets using GetOrganelle. GetOrganelle assemblies are more accurate than published and/or NOVOPlasty-reassembled plastomes as assessed by mapping. We also assemble complete mitochondrial genomes using GetOrganelle. GetOrganelle is freely released under a GPL-3 license (
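
The general idea behind "baiting and iterative mapping" can be illustrated with a deliberately simplified sketch: start from a seed sequence, recruit every read that shares a k-mer with the current bait pool, add the recruited reads' k-mers to the pool, and repeat until no new reads are found. This is not GetOrganelle's implementation. The function names, the exact-match k-mer criterion, and the toy reads are all assumptions for illustration, and reverse complements are ignored for brevity.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit_reads(reads: list[str], seed: str, k: int = 5,
                  max_rounds: int = 10) -> list[str]:
    """Iteratively recruit reads that share at least one k-mer with the bait pool.

    The bait pool starts as the seed's k-mers and grows with every recruited
    read, so reads that only overlap earlier recruits are picked up in later
    rounds. Recruitment stops when a round adds nothing or max_rounds is hit.
    """
    bait = kmers(seed, k)
    recruited: set[int] = set()
    for _ in range(max_rounds):
        newly = [i for i, read in enumerate(reads)
                 if i not in recruited and kmers(read, k) & bait]
        if not newly:
            break
        for i in newly:
            recruited.add(i)
            bait |= kmers(reads[i], k)
    # Return recruited reads in their original input order.
    return [reads[i] for i in sorted(recruited)]


if __name__ == "__main__":
    seed = "ACGTACGTGG"            # hypothetical organelle seed fragment
    toy_reads = [
        "CGTGGTTAAC",              # overlaps the seed directly
        "TTAACCCGGA",              # overlaps only the first read
        "GGGGGGGGGG",              # unrelated read
    ]
    print(recruit_reads(toy_reads, seed, k=5))
```

In the toy data the second read shares no k-mer with the seed and is reached only through the first read, which is the reason for iterating rather than baiting once.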

Genes ◽  
2018 ◽  
Vol 9 (10) ◽  
pp. 486 ◽  
Adam Ameur ◽  
Huiwen Che ◽  
Marcel Martin ◽  
Ignas Bunikis ◽  
Johan Dahlberg ◽  

The current human reference sequence (GRCh38) is a foundation for large-scale sequencing projects. However, recent studies have suggested that GRCh38 may be incomplete and give a suboptimal representation of specific population groups. Here, we performed a de novo assembly of two Swedish genomes that revealed over 10 Mb of sequences absent from the human GRCh38 reference in each individual. Around 6 Mb of these novel sequences (NS) are shared with a Chinese personal genome. The NS are highly repetitive, have an elevated GC-content, and are primarily located in centromeric or telomeric regions. Up to 1 Mb of NS can be assigned to chromosome Y, and large segments are also missing from GRCh38 at chromosomes 14, 17, and 21. Inclusion of NS into the GRCh38 reference radically improves the alignment and variant calling from short-read whole-genome sequencing data at several genomic loci. A re-analysis of a Swedish population-scale sequencing project yields > 75,000 putative novel single nucleotide variants (SNVs) and removes > 10,000 false positive SNV calls per individual, some of which are located in protein coding regions. Our results highlight that the GRCh38 reference is not yet complete and demonstrate that personal genome assemblies from local populations can improve the analysis of short-read whole-genome sequencing data.

2020 ◽  
Evin M. Padhi ◽  
Tristan J. Hayeck ◽  
Brandon Mannion ◽  
Sumantra Chatterjee ◽  
Marta Byrska-Bishop ◽  

Abstract Previous research in autism and other neurodevelopmental disorders (NDDs) has indicated an important contribution of de novo protein-coding variants within specific genes. The role of de novo noncoding variation has been observable as a general increase in genetic burden but has yet to be resolved to individual functional elements. In this study, we assessed whole-genome sequencing data in 2,671 families with autism, with a specific focus on de novo variation in enhancers with previously characterized in vivo activity. We identified three independent de novo mutations limited to individuals with autism in the enhancer hs737. These mutations result in similar phenotypic characteristics, affect enhancer activity in vitro, and preferentially occur in AAT motifs in the enhancer with predicted disruptions of transcription factor binding. We also find that hs737 is enriched for copy number variation in individuals with NDDs, is dosage sensitive in the human population, is brain-specific, and targets the NDD gene EBF3 that is genome-wide significant for protein coding de novo variants, demonstrating the importance of understanding all forms of variation in the genome. One Sentence Summary: Whole-genome sequencing in thousands of families reveals variants relevant to simplex autism in a brain enhancer of the well-established neurodevelopmental disorder gene EBF3.

2017 ◽  
Vol 4 (1) ◽  
Martin Malmstrøm ◽  
Michael Matschiner ◽  
Ole K. Tørresen ◽  
Kjetill S. Jakobsen ◽  
Sissel Jentoft
