A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population

2021 ◽  
Vol 12 ◽  
Author(s):  
Gihan Daw Elbait ◽  
Andreas Henschel ◽  
Guan K. Tay ◽  
Habiba S. Al Safar

The ethnic composition of a country's population contributes to the uniqueness of each national DNA sequencing project and, ideally, individual reference genomes are required to reduce the confounding nature of ethnic bias. This work presents a representative Whole Genome Sequencing effort of an understudied population. Specifically, high coverage consensus sequences from 120 whole genomes and 33 whole exomes were used to construct the first ever population specific major allele reference genome for the United Arab Emirates (UAE). When this was applied and compared to the archetype hg19 reference, the number of variants called in local Emirati genomes was reduced by ∼19% (i.e., some 1 million fewer calls). In compiling the United Arab Emirates Reference Genome (UAERG), annotated sets of 23,038,090 short (novel: 1,790,171) and 137,713 structural (novel: 8,462) variants, together with their allele frequencies (AFs) and distribution across the genome, were identified. Population-specific genetic characteristics including loss-of-function variants, admixture, and ancestral haplogroup distribution were identified and are reported here. We also detect a strong correlation between FST and admixture components in the UAE. This baseline study was conceived to establish a high-quality reference genome and a genetic variations resource to enable the development of regional population specific initiatives and thus inform the application of population studies and precision medicine in the UAE.
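The idea of a major allele reference can be illustrated with a minimal sketch. It assumes the common definition in which a reference base is replaced by the alternate allele whenever that allele's population frequency exceeds 0.5. The function name `major_allele_reference`, the tuple layout of `variants`, and the restriction to single-nucleotide variants are hypothetical simplifications for illustration, not the pipeline used to build the UAERG.

```python
from typing import Iterable, Tuple

# Hypothetical variant record: (0-based position, ref allele, alt allele, alt allele frequency)
Variant = Tuple[int, str, str, float]


def major_allele_reference(reference: str,
                           variants: Iterable[Variant],
                           threshold: float = 0.5) -> str:
    """Return a copy of `reference` in which each single-nucleotide variant
    whose alternate allele frequency exceeds `threshold` is substituted in.

    Assumption: only SNVs are handled; indels and structural variants are
    skipped because they would shift downstream coordinates.
    """
    seq = list(reference)
    for pos, ref, alt, af in variants:
        if len(ref) != 1 or len(alt) != 1:
            continue  # skip anything that is not a single-base substitution
        if seq[pos] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        if af > threshold:
            seq[pos] = alt  # the alternate allele is the population's major allele
    return "".join(seq)


if __name__ == "__main__":
    toy_reference = "ACGTACGT"
    toy_variants = [
        (1, "C", "T", 0.8),   # alt is the major allele, so it is substituted
        (4, "A", "G", 0.3),   # alt is the minor allele, so the reference base stays
        (6, "G", "GA", 0.9),  # insertion, ignored in this sketch
    ]
    print(major_allele_reference(toy_reference, toy_variants))
```

Calling variants against a sequence built this way leaves fewer positions where a sample differs from the reference, which is consistent with the reduction in calls reported above.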

Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Cong Tan ◽  
Brett Chapman ◽  
Penghao Wang ◽  
Qisen Zhang ◽  
Gaofeng Zhou ◽  
...  

Barley (Hordeum vulgare L.) is one of the first domesticated grain crops and represents the fourth most important cereal source for human and animal consumption. BarleyVarDB is a database of barley genomic variation. It is publicly accessible at http://146.118.64.11/BarleyVar. This database mainly provides three sets of information. First, there are 57 754 224 single nucleotide polymorphisms (SNPs) and 3 600 663 insertions or deletions (InDels) included in BarleyVarDB, which were identified from high-coverage whole genome sequencing of 21 barley germplasm accessions, including 8 wild barley accessions from 3 centers of barley evolutionary origin and 13 barley landraces from different continents. Second, it uses the latest publicly accessible barley reference genome and its annotation, produced by the International Barley Genome Sequencing Consortium (IBSC). Third, 522 212 genome-wide microsatellites/simple sequence repeats (SSRs) are also included in this database, which were identified in the reference barley pseudo-molecule genome sequence. Additionally, several useful web-based applications are provided, including JBrowse, BLAST and Primer3. Users can design PCR primers to assess polymorphic variants deposited in this database and use a user-friendly interface for accessing the barley reference genome. We envisage that BarleyVarDB will benefit the barley genetic research community by providing access to all publicly available barley genomic variation information and the barley reference genome, as well as providing an ultra-high density of SNP and InDel markers for molecular breeding and identification of functional genes with important agronomic traits in barley. Database URL: http://146.118.64.11/BarleyVar


2018 ◽  
Author(s):  
Edwin A. Solares ◽  
Mahul Chakraborty ◽  
Danny E. Miller ◽  
Shannon Kalsow ◽  
Kate Hall ◽  
...  

Accurate and comprehensive characterization of genetic variation is essential for deciphering the genetic basis of diseases and other phenotypes. A vast amount of genetic variation stems from large-scale sequence changes arising from the duplication, deletion, inversion, and translocation of sequences. In the past 10 years, high-throughput short reads have greatly expanded our ability to assay sequence variation due to single nucleotide polymorphisms. However, a recent de novo assembly of a second Drosophila melanogaster reference genome has revealed that short read genotyping methods miss hundreds of structural variants, including those affecting phenotypes. While genomes assembled using high-coverage long reads can achieve high levels of contiguity and completeness, concerns about cost, errors, and low yield have limited widespread adoption of such sequencing approaches. Here we resequenced the reference strain of D. melanogaster (ISO1) on a single Oxford Nanopore MinION flow cell run for 24 hours. Using only reads longer than 1 kb or with at least 30x coverage, we assembled a highly contiguous de novo genome. The addition of inexpensive paired reads and subsequent scaffolding using an optical map technology achieved an assembly with completeness and contiguity comparable to the D. melanogaster reference assembly. Comparison of our assembly to the reference assembly of ISO1 uncovered a number of structural variants (SVs), including novel LTR transposable element insertions and duplications affecting genes with developmental, behavioral, and metabolic functions. Collectively, these SVs provide a snapshot of the dynamics of genome evolution. Furthermore, our assembly and comparison to the D. melanogaster reference genome demonstrates that high-quality de novo assembly of reference genomes and comprehensive variant discovery using such assemblies are now possible by a single lab for under $1,000 (USD).


2016 ◽  
Author(s):  
Zhikai Liang ◽  
James C Schnable

B73 is a variety of maize (Zea mays ssp. mays) widely used in genetic, genomic, and phenotypic research around the world. B73 also served as the reference genotype for the original maize genome sequencing project. The advent of large-scale RNA-sequencing as a method of measuring gene expression presents a unique opportunity to assess the level of relatedness among individuals identified as variety B73. The level of haplotype conservation and divergence across the genome was assessed using 27 RNA-seq data sets from 20 independent research groups in three countries. Several clearly distinct clades were identified among putatively B73 samples. A number of these clades were defined by the presence of clearly defined genomic blocks containing a haplotype which did not match the published B73 reference genome. In a number of cases the relationship among B73 samples generated by different research groups recapitulated mentor/mentee relationships within the maize genetics community. A number of regions with distinct, dissimilar haplotypes were identified in our study. However, when considering the age of the B73 accession (greater than 40 years) and the challenges of maintaining isogenic lines of a naturally outcrossing species, a strikingly high overall level of conservation was exhibited among B73 samples from around the globe.


2019 ◽  
Author(s):  
Karen H. Miga ◽  
Sergey Koren ◽  
Arang Rhie ◽  
Mitchell R. Vollger ◽  
Ariel Gershman ◽  
...  

After nearly two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no one chromosome has been finished end to end, and hundreds of unresolved gaps persist 1,2. The remaining gaps include ribosomal rDNA arrays, large near-identical segmental duplications, and satellite DNA arrays. These regions harbor largely unexplored variation of unknown consequence, and their absence from the current reference genome can lead to experimental artifacts and hide true variants when re-sequencing additional human genomes. Here we present a de novo human genome assembly that surpasses the continuity of GRCh38 2, along with the first gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome 3, we reconstructed the ∼2.8 megabase centromeric satellite DNA array and closed all 29 remaining gaps in the current reference, including new sequence from the human pseudoautosomal regions and cancer-testis ampliconic gene families (CT-X and GAGE). This complete chromosome X, combined with the ultra-long nanopore data, also allowed us to map methylation patterns across complex tandem repeats and satellite arrays for the first time. These results demonstrate that finishing the human genome is now within reach and will enable ongoing efforts to complete the remaining human chromosomes.


Neurology ◽  
2017 ◽  
Vol 89 (10) ◽  
pp. 1043-1049 ◽  
Author(s):  
Ilaria Giordano ◽  
Florian Harmuth ◽  
Heike Jacobi ◽  
Brigitte Paap ◽  
Stefan Vielhaber ◽  
...  

Objective: To define the clinical phenotype and natural history of sporadic adult-onset degenerative ataxia and to identify putative disease-causing mutations. Methods: The primary measure of disease severity was the Scale for the Assessment and Rating of Ataxia (SARA). DNA samples were screened for mutations using a high-coverage ataxia-specific gene panel in combination with next-generation sequencing. Results: The analysis was performed on 249 participants. Among them, 83 met diagnostic criteria of clinically probable multiple system atrophy cerebellar type (MSA-C) at baseline and another 12 during follow-up. Positive MSA-C criteria (4.94 ± 0.74, p < 0.0001) and disease duration (0.22 ± 0.06 per additional year, p = 0.0007) were associated with a higher SARA score. Forty-eight participants who did not fulfill MSA-C criteria and had a disease duration of >10 years were designated sporadic adult-onset ataxia of unknown etiology/non-MSA (SAOA/non-MSA). Compared with MSA-C, SAOA/non-MSA patients had lower SARA scores (13.6 ± 6.0 vs 16.0 ± 5.8, p = 0.0200) and a slower annual SARA increase (1.1 ± 2.3 vs 3.3 ± 3.2, p = 0.0013). In 11 of 194 tested participants (6%), a definitive or probable genetic diagnosis was made. Conclusions: Our study provides quantitative data on the clinical phenotype and progression of sporadic ataxia with adult onset. Screening for causative mutations with a gene panel approach yielded a genetic diagnosis in 6% of the cohort. ClinicalTrials.gov registration: NCT02701036.


2020 ◽  
Author(s):  
Basil M Fathalla ◽  
Ali Alsarhan ◽  
Samina Afzal ◽  
Maha EL Naofal ◽  
Ahmad Abou Tayoun

Genetic investigations for patients with pediatric rheumatological disorders have been limited to classic genotyping, mainly MEFV hotspot mutation analysis, for periodic fever. Therefore, the landscape and clinical utility of comprehensive genomic investigations for a wider range of pediatric rheumatological disorders have not been fully characterized in the Middle East. Here seventy-one pediatric patients, of diverse Arab origins, were clinically and genetically assessed for a spectrum of rheumatology-related disease at the only dedicated tertiary children’s hospital in the United Arab Emirates. Clinical genomic investigations included mainly (76%) next generation sequencing-based gene panels and whole exome sequencing, along with rapid sequencing in the intensive care unit (ICU) and urgent setting. The overall positive yield was 46.5% (16.7%-66.7% for specific indications), while dual diagnoses were made in 2 cases (3%). Although the majority (21/33, 64%) of positive findings involved the MEFV gene, the remaining (12/33, 36%) alterations were attributed to eleven other genes/loci. Copy number variants contributed substantially (5/33, 15.2%) to the overall diagnostic yield. Sequencing-based testing, specifically rapid sequencing, had a high positive rate and delivered timely results. Genetic findings guided clinical management plans and interventions in most cases (27/33, 81.8%). We highlight unique findings and provide additional evidence that heterozygous loss of function of the IFIH1 gene increases susceptibility to recurrent fevers. Our study highlights the importance of comprehensive genomic investigations in patients with pediatric rheumatological disorders, and provides new insights into the pathogenic variation landscape in this group of disorders.


2019 ◽  
Author(s):  
Gianpiero Marconi ◽  
Stefano Capomaccio ◽  
Cinzia Comino ◽  
Alberto Acquadro ◽  
Ezio Portis ◽  
...  

Methods for investigating DNA methylation nowadays either require a reference genome and high coverage, or investigate only CG methylation. Moreover, no large-scale analysis can be performed for N6-methyladenosine (6mA). Here we describe the methylation content sensitive enzyme double-digest restriction-site-associated DNA (ddRAD) technique (MCSeEd), a reduced-representation, reference-free, cost-effective approach for characterizing whole genome methylation patterns across different methylation contexts (e.g., CG, CHG, CHH, 6mA). MCSeEd can also detect genetic variations among hundreds of samples. MCSeEd is based on parallel restrictions carried out by combinations of methylation insensitive and sensitive endonucleases, followed by next-generation sequencing. Moreover, we present a robust bioinformatic pipeline (available at https://bitbucket.org/capemaster/mcseed/src/master/) for differential methylation analysis combined with single nucleotide polymorphism calling without or with a reference genome.


2021 ◽  
pp. jclinpath-2021-207426
Author(s):  
Jonathan Samuel Fenn ◽  
Nathan Lorde ◽  
John Martin Ward ◽  
Ingrid Borovickova

Hypophosphatasia (HPP) is a group of inherited disorders characterised by the impaired mineralisation of bones and/or teeth and low serum alkaline phosphatase (ALP) activity. It is caused by a mutation in the ALPL gene encoding the tissue-non-specific isoenzyme of ALP (TNSALP) resulting in a loss of function. The disease is highly heterogeneous in its clinical expression, ranging from stillbirth without mineralised bone to the mild form of late adult onset with symptoms and signs such as musculoskeletal pain, arthropathy, lower-extremity fractures, premature loss of teeth or an incidental finding of reduced serum ALP activity. A classification based on the age at diagnosis and the presence or absence of bone symptoms was historically used: perinatal, prenatal benign, infantile, childhood, adult and odontohypophosphatasia. These subtypes are known to have overlapping signs and complications. Three forms of HPP distinguishable by their genetic characteristics have been described: severe, moderate and mild. Severe forms of HPP (perinatal and infantile severe) are recessively inherited, whereas moderate HPP may be dominantly or recessively inherited. The biochemical hallmark of HPP is persistently low serum ALP for age and an increase in the natural substrates of TNSALP, pyridoxal 5′-phosphate and phosphoethanolamine, supported by radiological findings. The diagnosis is confirmed by ALPL sequencing. A multidisciplinary team of experts is essential for effective management. Calcium restriction is recommended in infants/children to manage hypercalcaemia. A targeted enzyme replacement therapy for HPP has become available and correct diagnosis is crucial to allow early treatment.


2015 ◽  
Vol 170 (3) ◽  
pp. 602-609 ◽  
Author(s):  
Asma Deeb ◽  
Abdelhadi Habeb ◽  
Walid Kaplan ◽  
Salima Attia ◽  
Suha Hadi ◽  
...  
