Introducing the first whole genomes of nationals from the United Arab Emirates

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Habiba S. AlSafar ◽  
Mariam Al-Ali ◽  
Gihan Daw Elbait ◽  
Mustafa H. Al-Maini ◽  
Dymitr Ruta ◽  
...  

Abstract Whole Genome Sequencing (WGS) provides an in-depth description of genome variation. In the era of large-scale population genome projects, the assembly of ethnic-specific genomes combined with mapping human reference genomes of underrepresented populations has improved the understanding of human diversity and disease associations. In this study, whole genome sequences of two nationals of the United Arab Emirates (UAE) at >27X coverage are reported for the first time. The two Emirati individuals were predominantly of Central/South Asian ancestry. The raw whole genome sequence data of both individuals were mapped with an in-house customized pipeline using BWA and Picard, followed by the GATK tools. A total of 3,994,521 variants (3,350,574 Single Nucleotide Polymorphisms (SNPs) and 643,947 indels) were identified for the first individual, the UAE S001 sample. A similar number of variants, 4,031,580 (3,373,501 SNPs and 658,079 indels), were identified for UAE S002. Variants that are associated with diabetes, hypertension, increased cholesterol levels, and obesity were also identified in these individuals. These whole genome sequences provide a starting point for constructing a UAE reference panel, which will lead to improvements in the delivery of precision medicine and quality of life for affected individuals, and to a reduction in healthcare costs. The information compiled will likely lead to the identification of target genes that could potentially support the development of novel therapeutic modalities.
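The SNP and indel tallies reported above can be illustrated with a minimal sketch that classifies VCF records by allele length. This is not the authors' pipeline: the function names and the toy records are hypothetical, and the rule used here (single-base REF and ALT counts as a SNP, everything else as an indel) is a simplification that would lump multi-nucleotide substitutions and symbolic alleles in with indels.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Return 'snp' when both alleles are single bases, otherwise 'indel'.

    Simplifying assumption: multi-nucleotide substitutions and symbolic
    alleles are not handled separately and fall into the 'indel' bin.
    """
    if len(ref) == 1 and len(alt) == 1:
        return "snp"
    return "indel"


def tally_variants(vcf_lines):
    """Count SNPs and indels in an iterable of VCF text lines."""
    counts = {"snp": 0, "indel": 0}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]  # REF and ALT columns
        for alt in alts.split(","):  # one count per alternate allele
            counts[classify_variant(ref, alt)] += 1
    return counts


# Hypothetical toy records, not data from the study.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tAT\tA\t50\tPASS\t.",
    "chr2\t300\t.\tC\tCTT\t50\tPASS\t.",
    "chr2\t400\t.\tG\tT\t50\tPASS\t.",
]
counts = tally_variants(toy_vcf)
print(counts, "total:", sum(counts.values()))

# The totals quoted in the abstract are internally consistent.
assert 3_350_574 + 643_947 == 3_994_521  # UAE S001
assert 3_373_501 + 658_079 == 4_031_580  # UAE S002
```

The final two assertions only confirm that the SNP and indel counts given in the abstract add up to the stated totals for each sample.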

Insects ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 101
Author(s):  
Miao Wang ◽  
Hanyu Li ◽  
Huoqing Zheng ◽  
Liuwei Zhao ◽  
Xiaofeng Xue ◽  
...  

The invasion of Vespa velutina presents a great threat to the agricultural economy, the ecological environment, and human health. An effective strategy for controlling this hornet is urgently required, but the limited genome information of Vespa velutina restricts the application of molecular-genomic tools for targeted hornet management. Therefore, we conducted large-scale transcriptome profiling of the hornet brain to obtain functional target genes and molecular markers. Using an Illumina HiSeq platform, more than 41 million clean reads were obtained and de novo assembled into 182,087 meaningful unigenes. A total of 56,400 unigenes were annotated against publicly available protein sequence databases, and a set of reliable Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers was developed. The homologous genes encoding crucial behavior regulation factors, odorant binding proteins (OBPs), and vitellogenin were also identified from highly expressed transcripts. This study provides abundant molecular targets and markers for invasive hornet control and further promotes the genetic and molecular study of Vespa velutina.
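As a rough illustration of what SSR marker discovery involves, the sketch below scans a sequence for perfect tandem repeats with a regular expression. The function name, the unit-length range (2 to 6 bases), and the minimum of four repeats are assumptions chosen for the example; the abstract does not state which tool or thresholds the authors used.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    and the backreference requires the unit to recur at least
    min_repeats - 1 more times immediately after its first copy.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical toy sequence: an (AC)5 repeat and an (AAG)4 repeat.
toy_seq = "GGTT" + "AC" * 5 + "GGA" + "AAG" * 4 + "TC"
for start, unit, n in find_ssrs(toy_seq):
    print(f"position {start}: ({unit}){n}")
```

Mononucleotide runs are excluded by the default `min_unit=2`, although a long homopolymer could still be reported as a dinucleotide repeat of two identical bases; a production tool would normally handle that case explicitly.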


2021 ◽  
Author(s):  
Kartika Afrida Fauzia ◽  
Hafeza Aftab ◽  
Muhammad Miftahussurur ◽  
Langgeng Agung Waskito ◽  
Vo Phuoc Tuan ◽  
...  

Abstract The single nucleotide polymorphisms (SNPs) associated with the biofilm formation phenotype of Helicobacter pylori were investigated. Fifty-six H. pylori isolates from Bangladeshi patients were included in this cross-sectional study. Crystal violet was used to classify the phenotypes into high- and low-biofilm formers. Whole genome sequences were analyzed using the “Antimicrobial Resistance Identification By Assembly” (ARIBA) pipeline. The results indicated 19.6% high- and 80.4% low-biofilm formers. These phenotypes were not related to specific clades in the phylogenetic analysis. Biofilm formation was significantly associated with SNPs of alpA, alpB, cagE, cgt, csd4, csd5, futB, gluP, homD, and murF (P < 0.05). Among the SNPs reported in alpB, strains encoding the N156K, G160S, and A223V mutations were high-biofilm formers. Mutations associated with antibiotic resistance could also be detected. This study revealed the potential role of SNPs in biofilm formation and proposes a method to detect mutations associated with antibiotic resistance and biofilm formation from whole genome sequences.


2020 ◽  
Author(s):  
George Hindy ◽  
Peter Dornbos ◽  
Mark D. Chaffin ◽  
Dajiang J. Liu ◽  
Minxian Wang ◽  
...  

Summary Large-scale gene sequencing studies for complex traits have the potential to identify causal genes with therapeutic implications. We performed gene-based association testing of blood lipid levels with rare (minor allele frequency < 1%) predicted damaging coding variation using sequence data from >170,000 individuals from multiple ancestries: 97,493 European, 30,025 South Asian, 16,507 African, 16,440 Hispanic/Latino, 10,420 East Asian, and 1,182 Samoan. We identified 35 genes associated with circulating lipid levels. Ten of these (ALB, SRSF2, JAK2, CREB3L3, TMEM136, VARS, NR1H3, PLA2G12A, PPARG, and STAB1) have not been implicated for lipid levels using rare coding variation in population-based samples. We prioritize 32 genes identified in array-based genome-wide association study (GWAS) loci based on gene-based associations, of which three (EVI5, SH2B3, and PLIN1) had no prior evidence of rare coding variant associations. Most of the associated genes showed evidence of association in multiple ancestries. We also observed an enrichment of gene-based associations for low-density lipoprotein cholesterol drug target genes and for genes closest to GWAS index single nucleotide polymorphisms (SNPs). Our results demonstrate that gene-based associations can be beneficial for drug target development and provide evidence that the gene closest to the array-based GWAS index SNP is often the functional gene for blood lipid levels.


2020 ◽  
Vol 8 (11) ◽  
pp. 1673
Author(s):  
Yuying Fan ◽  
Yue Wang ◽  
Jianping Xu

Amphotericin B (AMB) is a major fungicidal polyene agent that has a broad spectrum of action against invasive fungal infections. AMB is typically used as the last-line drug against serious and life-threatening infections when other drugs have failed to eliminate the fungal pathogens. Recently, AMB resistance in Aspergillus fumigatus has become more evident. For example, a high rate of AMB resistance (96%) was noted in the A. fumigatus population in Hamilton, Ontario, Canada. AMB-resistant strains have also been found in other countries. However, the mechanism of AMB resistance remains largely unknown. Here, we investigated the potential genes and mutations associated with AMB resistance using whole-genome sequences and examined the distribution of AMB resistance among genetic populations. A total of 196 whole-genome sequences representing strains from 11 countries were examined. Analyses of single nucleotide polymorphisms (SNPs) at the whole-genome level revealed that these strains belonged to three divergent genetic clusters, with the majority (90%) of AMB-resistant strains located in one of the three clusters, Cluster 2. Our analyses identified over 60 SNPs significantly associated with AMB resistance. Together, these SNPs represent promising candidates from which to investigate the putative molecular mechanisms of AMB resistance and for their potential use in developing rapid diagnostic markers for clinical screening of AMB resistance in A. fumigatus.


2021 ◽  
Vol 9 (2) ◽  
pp. 246
Author(s):  
Gregor Fiedler ◽  
Anna-Delia Herbstmann ◽  
Etienne Doll ◽  
Mareike Wenning ◽  
Erik Brinks ◽  
...  

The genetic heterogeneity of Heyndrickxia sporothermodurans (formerly Bacillus sporothermodurans) was evaluated using whole genome sequencing. The genomes of 29 previously identified Heyndrickxia sporothermodurans and two Heyndrickxia vini strains isolated from ultra-high-temperature (UHT)-treated milk were sequenced by short-read (Illumina) sequencing. After sequence analysis, the two H. vini strains could be reclassified as H. sporothermodurans. In addition, the genomes of the H. sporothermodurans type strain (DSM 10599T) and the closest phylogenetic neighbors Heyndrickxia oleronia (DSM 9356T) and Heyndrickxia vini (JCM 19841T) were also sequenced using both long-read (MinION) and short-read (Illumina) sequencing. By hybrid sequence assembly, the genome of the H. sporothermodurans type strain was enlarged by 15% relative to the short-read assembly. This noticeable increase was probably due to numerous mobile elements in the genome that are presumptively related to spore heat tolerance. Phylogenetic studies based on 16S rDNA gene sequence, core genome, single-nucleotide polymorphisms, and ANI/dDDH showed that H. vini is highly related to H. sporothermodurans. When examining the genome sequences of all H. sporothermodurans strains from this study, together with 4 H. sporothermodurans genomes available in the GenBank database, the majority of the 36 strains examined occurred in a clonal lineage with less than 100 SNPs. These data substantiate previous reports on the existence and spread of a genetically highly homogeneous and heat-resistant spore clone, i.e., the HRS-clone.


PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0253387
Author(s):  
Dan Jin ◽  
Philippe Henry ◽  
Jacqueline Shan ◽  
Jie Chen

The cannabis community typically uses the terms “Sativa” and “Indica” to characterize drug strains with high tetrahydrocannabinol (THC) levels. Due to large-scale, extensive, and unrecorded hybridization in the past 40 years, this vernacular naming convention has become unreliable and inadequate for identifying or selecting strains for clinical research and medicinal production. Additionally, cannabidiol (CBD)-dominant strains and balanced strains (or intermediate strains, which have intermediate levels of THC and CBD) are not included in the current classification studies despite the increasing research interest in the therapeutic potential of CBD. This paper is the first in a series of studies proposing that a new classification system be established based on genome-wide variation and supplemented by data on secondary metabolites and morphological characteristics. This study performed whole-genome sequencing of 23 cannabis strains marketed in Canada, aligned sequences to a reference genome, and, after filtering for a minor allele frequency of 10%, identified 137,858 single nucleotide polymorphisms (SNPs). Discriminant analysis of principal components (DAPC) was applied to these SNPs and further identified 344 structural SNPs, which classified individual strains into five chemotype-aligned groups: one CBD-dominant, one balanced, and three THC-dominant clusters. These structural SNPs were all multiallelic and were predominantly tri-allelic (339/344). The largest portion of these SNPs (37%) occurred on the same chromosome containing genes for CBD acid synthases (CBDAS) and THC acid synthases (THCAS). The remainder (63%) were located on the other nine chromosomes. These results showed that the genetic differences between modern cannabis strains were at a whole-genome level and not limited to THC or CBD production. These SNPs contained enough genetic variation for classifying individual strains into corresponding chemotypes.
In an effort to elucidate the confused genetic backgrounds of commercially available cannabis strains, this classification attempt investigated the utility of DAPC for classifying modern cannabis strains and for identifying structural SNPs.
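The minor allele frequency filter mentioned above can be sketched as follows. This is a simplified illustration rather than the study's procedure: it assumes biallelic sites with genotypes coded as alternate-allele counts (0, 1, or 2) and `None` for missing calls, and the function and site names are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """MAF for one biallelic site.

    genotypes: alternate-allele counts per diploid sample (0, 1, 2),
    with None marking a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_count = sum(called)
    total_alleles = 2 * len(called)
    # Use integer counts for the comparison to avoid rounding surprises.
    return min(alt_count, total_alleles - alt_count) / total_alleles


def filter_by_maf(sites, threshold=0.10):
    """Keep sites whose minor allele frequency is at least the threshold."""
    return {
        name: gts
        for name, gts in sites.items()
        if minor_allele_frequency(gts) >= threshold
    }


# Hypothetical toy genotypes, not data from the study.
toy_sites = {
    "site_a": [0, 1, 2, 1, 0],                 # alt freq 4/10, MAF 0.40
    "site_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],  # alt freq 1/20, MAF 0.05
    "site_c": [2, 2, 2, 2, 1, None],           # alt freq 9/10, MAF 0.10
}
kept = filter_by_maf(toy_sites)
print(sorted(kept))
```

Sites with rare alleles, such as `site_b` here, are removed before downstream clustering because they carry little information for separating groups in a small panel.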


2018 ◽  
Author(s):  
Gregg W.C. Thomas ◽  
Elias Dohmen ◽  
Daniel S.T. Hughes ◽  
Shwetha C. Murali ◽  
Monica Poelchau ◽  
...  

Abstract Background: Arthropods comprise the largest and most diverse phylum on Earth and play vital roles in nearly every ecosystem. Their diversity stems in part from variations on a conserved body plan, resulting from and recorded in adaptive changes in the genome. Dissection of the genomic record of sequence change enables broad questions regarding genome evolution to be addressed, even across hyper-diverse taxa within arthropods. Results: Using 76 whole genome sequences representing 21 orders spanning more than 500 million years of arthropod evolution, we document changes in gene and protein domain content and provide temporal and phylogenetic context for interpreting these innovations. We identify many novel gene families that arose early in the evolution of arthropods and during the diversification of insects into modern orders. We reveal unexpected variation in patterns of DNA methylation across arthropods and examples of gene family and protein domain evolution coincident with the appearance of notable phenotypic and physiological adaptations such as flight, metamorphosis, sociality and chemoperception. Conclusions: These analyses demonstrate how large-scale comparative genomics can provide broad new insights into the genotype to phenotype map and generate testable hypotheses about the evolution of animal diversity.


2016 ◽  
Vol 38 (1) ◽  
pp. 57-59 ◽  
Author(s):  
M Hashemi ◽  
S Sanaei ◽  
M Rezaei ◽  
G Bahari ◽  
S M Hashemi ◽  
...  

Aim: MicroRNAs (miRNAs) are small noncoding RNAs that function as oncogenes or tumor suppressors. Single nucleotide polymorphisms in miRNAs can potentially alter miRNA-binding sites on target genes as well as affect miRNA expression. The present study aimed to evaluate the impact of the miR-608 rs4919510 C>G variant on breast cancer (BC) risk. Materials and Methods: This case-control study was conducted on 160 women with BC and 192 age-matched healthy women. Genotyping of miR-608 rs4919510 was done using the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) method. Results: Our findings showed that the GC genotype significantly decreased the risk of BC (odds ratio (OR) = 0.49, 95% confidence interval (CI) 0.28–0.88, p = 0.018) compared to the CC genotype. Furthermore, the G allele decreased the risk of BC (OR = 0.53, 95% CI 0.30–0.92, p = 0.024). No significant association was found between miR-608 genotypes and clinicopathological characteristics of BC patients (p > 0.05). Conclusion: Our findings indicate that the miR-608 polymorphism might be associated with decreased risk of BC in an Iranian subpopulation. Further large-scale studies with different ethnicities are needed to verify our findings.
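For readers unfamiliar with how an odds ratio and its confidence interval are obtained from case-control counts, here is a minimal sketch using the standard unadjusted 2x2 calculation with a Wald (Woolf) interval on the log scale. The counts are hypothetical and are not the study's genotype counts, which the abstract does not report; the authors may also have used a different or adjusted model.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    a = cases with the variant,    b = controls with the variant,
    c = cases without the variant, d = controls without the variant.
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(a=10, b=20, c=30, d=30)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that lies entirely below 1, as in the results reported above, is what supports describing a genotype as associated with decreased risk; an interval that spans 1 would not.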


2019 ◽  
Author(s):  
Johannes Geibel ◽  
Christian Reimer ◽  
Steffen Weigend ◽  
Annett Weigend ◽  
Torsten Pook ◽  
...  

Abstract Single nucleotide polymorphisms (SNPs), genotyped with SNP arrays, have become a widely used marker type in population genetic analyses over the last 10 years. However, compared to whole genome re-sequencing data, arrays are known to lack a substantial proportion of globally rare variants and tend to be biased towards variants present in populations involved in the development process of the respective array. This affects population genetic estimators and is known as SNP ascertainment bias. We investigated factors contributing to ascertainment bias in array development by redesigning the Axiom™ Genome-Wide Chicken Array in silico and evaluating changes in allele frequency spectra and heterozygosity estimates in a stepwise manner. A sequential reduction of rare alleles during the development process was shown, with the main influencing factors being the identification of SNPs in a limited set of populations and a within-population selection of common SNPs when aiming for equidistant spacing. These effects were shown to be less severe with a larger discovery panel. Additionally, a generally massive overestimation of expected heterozygosity for the ascertained SNP sets was shown. In the case of the original array, this overestimation was 24% higher for populations involved in the discovery process than for populations that were not involved. The same was observed after the SNP discovery step in the redesign. However, an unequal contribution of populations during the SNP selection can mask this effect but also adds uncertainty. Finally, we make suggestions for the design of specialized arrays for large-scale projects where whole genome re-sequencing techniques are still too expensive.
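The direction of the heterozygosity bias described above can be reproduced with a toy calculation. The sketch assumes biallelic SNPs and the textbook per-site expected heterozygosity 2p(1 - p); the allele frequencies and the 10% selection threshold are hypothetical, and the abstract does not specify the exact estimator the authors used.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity of a biallelic site with allele frequency p."""
    return 2 * p * (1 - p)


def mean_expected_heterozygosity(freqs):
    """Average expected heterozygosity over a set of sites."""
    return sum(expected_heterozygosity(p) for p in freqs) / len(freqs)


# Hypothetical allele frequencies as they might appear in sequence data,
# where rare variants are abundant.
all_sites = [0.01, 0.02, 0.05, 0.10, 0.25, 0.40, 0.50]

# Mimic array ascertainment: keep only sites that are common.
ascertained = [p for p in all_sites if min(p, 1 - p) >= 0.10]

he_all = mean_expected_heterozygosity(all_sites)
he_array = mean_expected_heterozygosity(ascertained)
print(f"all sites: {he_all:.3f}, ascertained sites: {he_array:.3f}")
```

Because 2p(1 - p) peaks at p = 0.5, discarding rare variants necessarily raises the average, which is the overestimation the abstract attributes to ascertained SNP sets.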

