genome variation
Recently Published Documents


TOTAL DOCUMENTS

241
(FIVE YEARS 41)

H-INDEX

38
(FIVE YEARS 5)

Author(s):  
Ying Li ◽  
Decheng Wang ◽  
Jingjing Zhang ◽  
Peiqi Huang ◽  
Hui Du ◽  
...  

Human adenovirus type 7 (HAdV-7) can cause severe respiratory disease. Between December 2018 and August 2019, HAdV-7 infection was identified in 129 patients at Wuhan Children’s Hospital, Hubei Province, China. Samples were collected from hospitalized children, and metagenomic sequencing was applied to detect HAdV infections. Hemophagocytic lymphohistiocytosis (HLH) related to HAdV infection was observed clinically in some patients, and patients were divided into two groups on this basis to test for differences in clinical indicators. Genome variation analysis, in silico restriction endonuclease analysis (REA), and phylogenetic analyses were carried out to characterize the HAdV-7 genomes in this study. Many indicators, including all routine blood indicators, differed significantly in patients of the HLH group. REA revealed that HAdV-7 might belong to genome type 7d, and genome variation analysis indicated that the HAdV genome is stable. HAdV-7 is an ongoing threat to the public, and global surveillance should be established.


2021 ◽  
Vol 6 ◽  
pp. 42
Author(s):  
Ambroise Ahouidi ◽  
Mozam Ali ◽  
Jacob Almagro-Garcia ◽  
Alfred Amambua-Ngwa ◽  
...  

MalariaGEN is a data-sharing network that enables groups around the world to work together on the genomic epidemiology of malaria. Here we describe a new release of curated genome variation data on 7,000 Plasmodium falciparum samples from MalariaGEN partner studies in 28 malaria-endemic countries. High-quality genotype calls on 3 million single nucleotide polymorphisms (SNPs) and short indels were produced using a standardised analysis pipeline. Copy number variants associated with drug resistance and structural variants that cause failure of rapid diagnostic tests were also analysed. Almost all samples showed genetic evidence of resistance to at least one antimalarial drug, and some samples from Southeast Asia carried markers of resistance to six commonly used drugs. Genes expressed during the mosquito stage of the parasite life cycle are prominent among loci that show strong geographic differentiation. By continuing to enlarge this open data resource, we aim to facilitate research into the evolutionary processes affecting malaria control and to accelerate development of the surveillance toolkit required for malaria elimination.


2021 ◽  
Vol 12 ◽  
Author(s):  
Katherine Chacón-Vargas ◽  
Colin O. McCarthy ◽  
Dasol Choi ◽  
Long Wang ◽  
Jae-Hyuk Yu ◽  
...  

Microbes (bacteria, yeasts, molds), in addition to plants and animals, were domesticated for their roles in food preservation, nutrition and flavor. Aspergillus oryzae is a domesticated filamentous fungal species traditionally used in the fermentation of Asian foods and beverages, such as sake, soy sauce, and miso. To date, little is known about the extent of genome and phenotypic variation of A. oryzae isolates from different clades. Here, we used long-read Oxford Nanopore and short-read Illumina sequencing to produce a highly accurate and contiguous genome assembly of A. oryzae 14160, an industrial strain from China. To understand how this isolate relates to others, we performed phylogenetic analysis with 90 A. oryzae isolates and 1 isolate of the A. oryzae progenitor, Aspergillus flavus. This analysis showed that A. oryzae 14160 is a member of clade A, whereas the RIB 40 type strain is a member of clade F. To explore genome variation between isolates from distinct A. oryzae clades, we compared the A. oryzae 14160 genome with the complete RIB 40 genome. Our results provide evidence of independent evolution of the alpha-amylase gene duplication, which is one of the major adaptive mutations resulting from domestication. Synteny analysis revealed that both genomes have three copies of the alpha-amylase gene, but only one copy, on chromosome II, was conserved. While the RIB 40 genome had additional copies of the alpha-amylase gene on chromosomes III and V, 14160 had a second copy on chromosome II and a third copy on chromosome VI. Additionally, we identified hundreds of lineage-specific genes and putative high-impact mutations in genes involved in secondary metabolism, including several of the core biosynthetic genes. Finally, to examine the functional effects of genome variation between strains, we measured amylase activity, proteolytic activity, and growth rate on several different substrates. RIB 40 produced significantly higher levels of amylase than 14160 when grown on rice and starch. Accordingly, RIB 40 grew faster on rice, while 14160 grew faster on soy. Taken together, our analyses reveal substantial genome and phenotypic variation within A. oryzae.


2021 ◽  
Author(s):  
Ali Rahnavard ◽  
Rebecca Clement ◽  
Nathaniel Stearrett ◽  
Marcos Pérez-Losada ◽  
Keith A. Crandall ◽  
...  

The 2019 novel coronavirus (SARS-CoV-2) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. To diminish the short-term and long-term impacts of coronavirus (CoV), we investigated CoV differences at the nucleotide and protein level and CoV genomic variation associated with epidemiological variation and geography. We divided the CoV genome into 29 constituent regions for this analysis. Our results highlight the variation among CoV variants by lineage and show that nonstructural protein 3 (nsp3) and Spike protein (S) have the highest variation and the greatest correlation with viral whole-genome variation, which makes these two proteins potential targets for treatments. S protein variation is highly correlated with that of nsp3, nsp6, and the 3'-to-5' exonuclease. Country of origin and time since the start of the pandemic were the most influential metadata in these differences. Host sex and age explained the least of the virus genome variation. We quantified variation explained by regions of the CoV genome across different coronaviruses, including SARS-CoV-2, Middle East respiratory syndrome coronavirus (MERS-CoV), other severe acute respiratory syndrome coronaviruses (SARS-CoV; SARS-related), and bat-derived severe acute respiratory syndrome (SARS)-like coronaviruses (Bat-SL-CoV). We found that Spike protein and nsp3 explain most of the variation among these viruses; they are also among the genomic regions with the highest number of sites under natural selection. Our results provide a direction to prioritize genes associated with outcome predictors, including health, therapeutic, and vaccine outcomes, and to inform improved DNA tests for predicting disease status.


2021 ◽  
Vol 12 (2) ◽  
pp. 507-515
Author(s):  
Hui Yi ◽  
Zhi-Wei Liao ◽  
Jun-Jie Chen ◽  
Xin-Yu Shi ◽  
Guo-Liang Chen ◽  
...  


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Yusuke Oizumi ◽  
Takuto Kaji ◽  
Sanki Tashiro ◽  
Yumiko Takeshita ◽  
Yuko Date ◽  
...  

Genome sequences have been determined for many model organisms; however, repetitive regions such as centromeres, telomeres, and subtelomeres have not yet been sequenced completely. Here, we report the complete sequences of subtelomeric homologous (SH) regions of the fission yeast Schizosaccharomyces pombe. We overcame the technical difficulties of obtaining subtelomeric repetitive sequences by constructing strains that each possess a single SH region of a standard laboratory strain. In addition, some natural isolates of S. pombe were analyzed using previous sequencing data. Whole sequences of SH regions revealed that each SH region consists of two distinct parts with mosaics of multiple common segments or blocks showing high variation among subtelomeres and strains. Subtelomere regions show a relatively high frequency of nucleotide variations among strains compared with other chromosomal regions. Furthermore, we identified subtelomeric RecQ-type helicase genes, tlh3 and tlh4, which add to the already known tlh1 and tlh2, and found that the tlh1–4 genes show high sequence variation with missense mutations, insertions, and deletions but no severe effects on their RNA expression. Our results indicate that SH sequences are highly polymorphic and are hot spots for genome variation. These features of subtelomeres may have contributed to genome diversity and, conversely, various diseases.


Author(s):  
Marc Pauper ◽  
Erdi Kucuk ◽  
Aaron M. Wenger ◽  
Shreyasee Chakraborty ◽  
Primo Baybayan ◽  
...  

Long-read sequencing (LRS) has the potential to comprehensively identify all medically relevant genome variation, including variation commonly missed by short-read sequencing (SRS) approaches. To determine this potential, we performed LRS at approximately 15×–40× genome coverage using the Pacific Biosciences Sequel I System for five trios. The respective probands were diagnosed with intellectual disability (ID) whose etiology remained unresolved after SRS exomes and genomes. Systematic assessment of LRS coverage showed that ~35 Mb of the human reference genome was only accessible by LRS and not SRS. Genome-wide structural variant (SV) calling yielded on average 28,292 SV calls per individual, totaling 12.9 Mb of sequence. Trio-based analyses, which allowed segregation to be studied, showed concordance for up to 95% of these SV calls across the genome, and 80% of the LRS SV calls were not identified by SRS. De novo mutation analysis did not identify any de novo SVs, confirming that these are rare events. Because of high sequence coverage, we were also able to call single nucleotide substitutions. On average, we identified 3 million substitutions per genome, with a Mendelian inheritance concordance of up to 97%. Of these, ~100,000 were located in the ~35 Mb of the genome that was only captured by LRS. Moreover, these variants affected the coding sequence of 64 genes, including 32 known Mendelian disease genes. Our data show the potential added value of LRS compared to SRS for identifying medically relevant genome variation.


2020 ◽  
Vol 49 (D1) ◽  
pp. D1186-D1191
Author(s):  
Cuiping Li ◽  
Dongmei Tian ◽  
Bixia Tang ◽  
Xiaonan Liu ◽  
Xufei Teng ◽  
...  

The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. It aims to collect and integrate genome variations for a wide range of species, accepts submissions of different variation types from all over the world, and provides free open access to all publicly available data in support of worldwide research activities. In particular, compared with the previous version, a total of 22 species, 115 projects, 55 935 samples, 463 429 609 variants, 66 220 associations and 56 submissions (as of 7 September 2020) were newly added in the current version of GVM. In the current release, GVM houses a total of ∼960 million variants from 41 species, including 13 animals, 25 plants and 3 viruses. Moreover, it incorporates 64 819 individual genotypes and 260 393 manually curated high-quality genotype-to-phenotype associations. Since its inception, GVM has archived genomic variation data of 43 754 samples submitted by worldwide users and served >1 million data download requests. Collectively, as a core resource in the National Genomics Data Center, GVM provides valuable genome variations for a diversity of species and thus plays an important role in both functional genomics studies and molecular breeding.

