PAN-INDIA 1000 SARS-CoV-2 RNA Genome Sequencing Reveals Important Insights into the Outbreak

Abstract The PAN-INDIA 1000 SARS-CoV-2 RNA Genome Sequencing Consortium has achieved its initial goal of completing the sequencing of 1000 SARS-CoV-2 genomes from nasopharyngeal and oropharyngeal swabs collected from individuals testing positive for COVID-19 by real-time PCR. The samples were collected across 10 states covering different zones within India. Given the importance of this information for public health response initiatives investigating transmission of COVID-19, the sequence data are being released in the GISAID database. This information will improve our understanding of how the virus is spreading, ultimately helping to interrupt the transmission chains, prevent new cases of infection, and provide impetus to research on intervention measures. This will also provide us with information on evolution of the virus, genetic predisposition (if any) and adaptation to human hosts. One thousand and fifty-two sequences were used for phylodynamic, temporal and geographic mutation patterns and haplotype network analyses. Initial results indicate that multiple lineages of SARS-CoV-2 are circulating in India, probably introduced by travel from Europe, USA and East Asia. A2a (20A/B/C) was found to be predominant, along with a few parental haplotypes (19A/B). In particular, there is a predominance of the D614G mutation, which is found to be emerging in almost all regions of the country. Additionally, mutations in important regions of the viral genome with significant geographical clustering have also been observed. The temporal landscape of haplotype diversity in each region appears to be similar across India, with haplotype diversity peaking between March and May, while by June A2a (20A/B/C) had emerged as the predominant haplotype. Within haplotypes, different states appear to have different proportions. Temporal and geographic patterns in the sequences obtained reveal interesting clustering of mutations. Some mutations are present at particularly high frequencies in one state as compared to others. The negative estimate of Tajima's D (D = −2.26817) is consistent with the rapid expansion of the SARS-CoV-2 population in India. Detailed mutational analysis across India to understand the gradual emergence of mutants in different regions of the country and its possible implications will help in better disease management.
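The abstract reports a negative Tajima's D but does not say how it was computed. The following is a minimal sketch of the standard Tajima (1989) statistic, assuming a gap-free alignment of equal-length sequences; the function name `tajimas_d` and the toy alignment are illustrative and not taken from the study.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (S, pi, D) for equal-length aligned sequences.

    S  = number of segregating (polymorphic) sites
    pi = mean number of pairwise differences
    D  = Tajima's D
    Assumes no gaps or ambiguous bases (an assumption of this sketch).
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences for a non-zero variance")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # Segregating sites: alignment columns with more than one distinct base.
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if S == 0:
        raise ValueError("no segregating sites; D is undefined")

    # Mean pairwise differences over all n*(n-1)/2 pairs.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    D = (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    return S, pi, D


# Toy alignment in which every variant is a singleton; an excess of rare
# variants like this pushes D below zero, the pattern expected after a
# population expansion.
toy = ["AAAA", "TAAA", "ATAA", "AATA"]
S, pi, D = tajimas_d(toy)
```

An excess of low-frequency variants makes the pairwise diversity smaller than the estimate based on segregating sites, which is why a strongly negative D is read as consistent with rapid population growth.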


PSI-40 Two mitochondrial lineages revealed in North American yak

Journal of Animal Science ◽

10.1093/jas/skaa278.833 ◽

2020 ◽

Vol 98 (Supplement_4) ◽

pp. 477-477

Author(s):

Leah K Treffer ◽

Edward S Rice ◽

Anna M Fuller ◽

Samuel Cutler ◽

Jessica L Petersen

Keyword(s):

Sequence Data ◽

Haplotype Network ◽

Ovis Aries ◽

Similar Species ◽

Nucleotide Polymorphisms ◽

Mt Dna ◽

Protein Coding ◽

Sister Clade ◽

Mtdna Sequence ◽

The Impact

Abstract Domestic yak (Bos grunniens) are bovids native to the Asian Qinghai-Tibetan Plateau. Studies of Asian yak have revealed that introgression with domestic cattle has contributed to the evolution of the species. When imported to North America (NA), some hybridization with B. taurus did occur. The objective of this study was to use mitochondrial (mt) DNA sequence data to better understand the mtDNA origin of NA yak and their relationship to Asian yak and related species. The complete mtDNA sequence of 14 individuals (12 NA yak, 1 Tibetan yak, 1 Tibetan B. indicus) was generated and compared with sequences of similar species from GenBank (B. indicus, B. grunniens (Chinese), B. taurus, B. gaurus, B. primigenius, B. frontalis, Bison bison, and Ovis aries). Individuals were aligned to the B. grunniens reference genome (ARS_UNL_BGru_maternal_1.0), which was also included in the analyses. The mtDNA genes were annotated using the ARS-UCD1.2 cattle sequence as a reference. Ten unique NA yak haplotypes were identified, which a haplotype network separated into two clusters. Variation among the NA haplotypes included 93 nonsynonymous single nucleotide polymorphisms. A maximum likelihood tree including all taxa was made using IQ-TREE after the data were partitioned into twenty-two subgroups using PartitionFinder2. Notably, six NA yak haplotypes formed a clade with B. indicus; the other four haplotypes grouped with B. grunniens and fell as a sister clade to bison, gaur and gayal. These data demonstrate two mitochondrial origins of NA yak with genetic variation in protein coding genes. Although these data suggest yak introgression with B. indicus, it appears to date prior to importation into NA. In addition to contributing to our understanding of the species history, these results suggest the two major mtDNA haplotypes in NA yak may functionally differ. Characterization of the impact of these differences on cellular function is currently underway.


Whole Genome Sequencing Refines Knowledge on the Population Structure of Mycobacterium bovis from a Multi-Host Tuberculosis System

Microorganisms ◽

10.3390/microorganisms9081585 ◽

2021 ◽

Vol 9 (8) ◽

pp. 1585

Author(s):

Ana C. Reis ◽

Liliana C. M. Salvador ◽

Suelee Robbe-Austerman ◽

Rogério Tenreiro ◽

Ana Botelho ◽

...

Keyword(s):

Population Structure ◽

Whole Genome Sequencing ◽

Wild Boar ◽

Genome Sequencing ◽

Mycobacterium Bovis ◽

Red Deer ◽

Variable Number Tandem Repeat ◽

Variant Calling ◽

Whole Genome ◽

Network Analyses

Classical molecular analyses of Mycobacterium bovis based on spoligotyping and Variable Number Tandem Repeat (MIRU-VNTR) brought the first insights into the epidemiology of animal tuberculosis (TB) in Portugal, showing high genotypic diversity of circulating strains that mostly cluster within the European 2 clonal complex. Previous surveillance provided valuable information on the prevalence and spatial occurrence of TB and highlighted prevalent genotypes in areas where livestock and wild ungulates are sympatric. However, links at the wildlife–livestock interfaces were established mainly via classical genotype associations. Here, we apply whole genome sequencing (WGS) to cattle, red deer and wild boar isolates to reconstruct the M. bovis population structure in a multi-host, multi-region disease system and to explore links at a fine genomic scale between M. bovis from wildlife hosts and cattle. Whole genome sequences of 44 representative M. bovis isolates, obtained between 2003 and 2015 from three TB hotspots, were compared through single nucleotide polymorphism (SNP) variant calling analyses. Consistent with previous results combining classical genotyping with Bayesian population admixture modelling, SNP-based phylogenies support the branching of this M. bovis population into five genetic clades, three with apparent geographic specificities, as well as the establishment of an SNP catalogue specific to each clade, which may be explored in the future as phylogenetic markers. The core genome alignment of SNPs was integrated within a spatiotemporal metadata framework to further structure this M. bovis population by host species and TB hotspots, providing a baseline for network analyses in different epidemiological and disease control contexts. WGS of M. bovis isolates from Portugal is reported for the first time in this pilot study, refining the spatiotemporal context of TB at the wildlife–livestock interface and providing further support to the key role of red deer and wild boar on disease maintenance. The SNP diversity observed within this dataset supports the natural circulation of M. bovis for a long time period, as well as multiple introduction events of the pathogen in this Iberian multi-host system.


High-precision and cost-efficient sequencing for real-time COVID-19 surveillance

Scientific Reports ◽

10.1038/s41598-021-93145-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Sung Yong Park ◽

Gina Faraci ◽

Pamela M. Ward ◽

Jane F. Emerson ◽

Ha Youn Lee

Keyword(s):

Los Angeles ◽

Whole Genome Sequencing ◽

Real Time ◽

Genome Sequencing ◽

High Precision ◽

High Throughput Sequencing ◽

Whole Genome ◽

Sequencing Data ◽

Public Health Response ◽

Cost Efficient

Abstract COVID-19 global cases have climbed to more than 33 million, with over a million total deaths, as of September 2020. Real-time massive SARS-CoV-2 whole genome sequencing is key to tracking chains of transmission and estimating the origin of disease outbreaks. Yet no methods have simultaneously achieved high precision, simple workflow, and low cost. We developed a high-precision, cost-efficient SARS-CoV-2 whole genome sequencing platform for COVID-19 genomic surveillance, CorvGenSurv (Coronavirus Genomic Surveillance). CorvGenSurv directly amplified viral RNA from COVID-19 patients’ Nasopharyngeal/Oropharyngeal (NP/OP) swab specimens and sequenced the SARS-CoV-2 whole genome in three segments by long-read, high-throughput sequencing. Sequencing of the whole genome in three segments significantly reduced sequencing data waste, thereby preventing dropouts in genome coverage. We validated the precision of our pipeline by both control genomic RNA sequencing and Sanger sequencing. We produced near full-length whole genome sequences from individuals who were COVID-19 test positive during April to June 2020 in Los Angeles County, California, USA. These sequences were highly diverse in the G clade with nine novel amino acid mutations including NSP12-M755I and ORF8-V117F. With its readily adaptable design, CorvGenSurv grants wide access to genomic surveillance, permitting immediate public health response to sudden threats.


Functional alterations caused by mutations reflect evolutionary trends of SARS-CoV-2

Briefings in Bioinformatics ◽

10.1093/bib/bbab042 ◽

2021 ◽

Author(s):

Liang Cheng ◽

Xudong Han ◽

Zijun Zhu ◽

Changlu Qi ◽

Ping Wang ◽

...

Keyword(s):

Reference Genome ◽

Sequence Data ◽

Purifying Selection ◽

Virus Genome ◽

Receptor Binding Domain ◽

Evolutionary Trends ◽

Synonymous Mutations ◽

Almost All ◽

Virus Strains ◽

New Mutations

Abstract Since the first report of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in December 2019, the COVID-19 pandemic has spread rapidly worldwide. Because few virus strains were available, early studies observed few of the key mutations that matter for the evolutionary trends of the virus genome. Here, we downloaded sequence data for 1809 SARS-CoV-2 strains deposited in GISAID before April 2020 to identify mutations and the functional alterations caused by these mutations. In total, we identified 1017 nonsynonymous and 512 synonymous mutations by alignment to the reference genome NC_045512, none of which were observed in the receptor-binding domain (RBD) of the spike protein. On average, each strain could acquire about 1.75 new mutations per month. The current mutations may have little impact on antibodies. Although the whole genome shows purifying selection, ORF3a, ORF8 and ORF10 were under positive selection. Only the 36 mutations that occurred in 1% or more of virus strains were further analyzed to reveal linkage disequilibrium (LD) variants and dominant mutations. As a result, we observed five dominant mutations: three nonsynonymous mutations (C28144T, C14408T and A23403G) and two synonymous mutations (T8782C and C3037T). These five mutations occurred in almost all strains in April 2020. In addition, we observed two potential dominant nonsynonymous mutations, C1059T and G25563T, which occurred in most of the strains in April 2020. Further functional analysis shows that these mutations largely decreased protein stability, which could lead to a significant reduction of virus virulence. In addition, the A23403G mutation increases the spike-ACE2 interaction and ultimately enhances infectivity. Together, these results indicate that SARS-CoV-2 is evolving toward enhanced infectivity and reduced virulence.
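Mutation names such as C28144T and A23403G follow the convention of reference base, 1-based genome position, then observed base. The sketch below shows how substitutions of that form could be listed from a sample already aligned to a reference; it is an illustration only, the helper name `call_substitutions` is hypothetical, and the authors' actual pipeline is not described in the abstract.

```python
def call_substitutions(reference, sample):
    """List point substitutions as '<ref base><1-based position><sample base>'.

    Both strings must come from the same alignment (equal length).
    Positions where either sequence has a gap or an ambiguous base are
    skipped, so only A/C/G/T to A/C/G/T changes are reported.
    """
    if len(reference) != len(sample):
        raise ValueError("reference and sample must be aligned to equal length")

    bases = set("ACGT")
    substitutions = []
    for position, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1):
        if ref_base != alt_base and ref_base in bases and alt_base in bases:
            substitutions.append(f"{ref_base}{position}{alt_base}")
    return substitutions


# Toy example: two substitutions and one gap that is ignored.
print(call_substitutions("ACGTAC", "ACTT-T"))
```

Note that positions counted this way refer to alignment columns; they match reference genome coordinates only when the reference row contains no gaps.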


Phylogenetic and Haplotype Network Analyses of Diaporthe eres Species in China Based on Sequences of Multiple Loci

Biology ◽

10.3390/biology10030179 ◽

2021 ◽

Vol 10 (3) ◽

pp. 179

Author(s):

Chingchai Chaisiri ◽

Xiangyu Liu ◽

Yang Lin ◽

Yanping Fu ◽

Fuxing Zhu ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Elongation Factor ◽

Haplotype Network ◽

Population Diversity ◽

Plant Diseases ◽

Broad Host Range ◽

Internal Transcribed Spacer Region ◽

Chinese Populations ◽

Network Analyses ◽

Haplotype Networks

Diaporthe eres is considered one of the most important causal agents of many plant diseases, with a broad host range worldwide. In this study, multiple sequences of ribosomal internal transcribed spacer region (ITS), translation elongation factor 1-α gene (EF1-α), beta-tubulin gene (TUB2), calmodulin gene (CAL), and histone-3 gene (HIS) were used for multi-locus phylogenetic analysis. For phylogenetic analysis, maximum likelihood (ML), maximum parsimony (MP), and Bayesian inference (BI) approaches were performed to investigate relationships of D. eres with closely related species. The results strongly support that the D. eres species falls into a monophyletic lineage, with the characteristics of a species complex. Phylogenetic informativeness (PI) analysis showed that clear boundaries could be proposed by using EF1-α, whereas ITS showed an ineffective reconstruction and, thus, was unsuitable for delimiting species boundaries for Diaporthe species. A combined dataset of EF1-α, CAL, TUB2, and HIS showed strong resolution for Diaporthe species, providing insights for the D. eres complex. Accordingly, besides D. biguttusis, D. camptothecicola, D. castaneae-mollissimae, D. cotoneastri, D. ellipicola, D. longicicola, D. mahothocarpus, D. momicola, D. nobilis, and Phomopsis fukushii, which have previously been considered synonymous species of D. eres, another three species, D. henanensis, D. lonicerae and D. rosicola, were further revealed to be synonyms of D. eres in this study. In order to demonstrate the genetic diversity of D. eres species in China, 138 D. eres isolates were randomly selected from previous studies in 16 provinces. These isolates were obtained from different major plant species from 2006 to 2020. The genetic distance was estimated with phylogenetic analysis and haplotype networks, and it was revealed that two major haplotypes existed in the Chinese populations of D. eres. The haplotype networks were widely dispersed and not uniquely correlated to specific populations. Overall, our analyses evaluated the phylogenetic identification for D. eres species and demonstrated the population diversity of D. eres in China.
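Before a haplotype network is drawn, identical sequences are usually collapsed into haplotypes and their frequencies summarized. As a minimal sketch of that first step (not the software used in the study), the function below counts distinct haplotypes and computes Nei's haplotype diversity, H = n/(n-1) * (1 - sum of squared haplotype frequencies); the name `haplotype_diversity` and the toy sequences are illustrative assumptions.

```python
from collections import Counter


def haplotype_diversity(seqs):
    """Return (number of distinct haplotypes, Nei's haplotype diversity).

    Each distinct sequence string is treated as one haplotype, which
    assumes the sequences are aligned and free of missing data.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")

    counts = Counter(seqs)  # haplotype -> number of isolates carrying it
    sum_sq_freq = sum((count / n) ** 2 for count in counts.values())
    diversity = n / (n - 1) * (1.0 - sum_sq_freq)
    return len(counts), diversity


# Toy example: four isolates carrying three haplotypes.
n_haplotypes, h = haplotype_diversity(["AAT", "AAT", "AAC", "GAT"])
```

A value near 1 means almost every isolate carries a different haplotype, while a value near 0 means one haplotype dominates the sample.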


148 Multiple Dysregulated Novel Pathways and Genes in Aleutian Mink Disease Revealed by Selection Signatures and Gene Network Analyses Using Whole-genome Sequence Data

Journal of Animal Science ◽

10.1093/jas/skab235.137 ◽

2021 ◽

Vol 99 (Supplement_3) ◽

pp. 76-76

Author(s):

Seyed Milad Vahedi ◽

Karim Karimi ◽

Siavash Salek Ardestani ◽

Younes Miar

Keyword(s):

Sequence Data ◽

American Mink ◽

Enrichment Analysis ◽

Whole Genome Sequence ◽

Fixation Index ◽

Pathway Enrichment Analysis ◽

Whole Genome ◽

Nucleotide Polymorphisms ◽

Network Analyses ◽

Genome Level

Abstract Aleutian disease (AD) is a chronic persistent infection in domestic mink caused by Aleutian mink disease virus (AMDV). Reduced female fertility and pelt quality are the main reasons for AD’s negative economic impact on the mink industry. A total of 79 American mink from the Canadian Center for Fur Animal Research at Dalhousie University (Truro, NS, Canada) were classified, based on the results of counter immunoelectrophoresis (CIEP) tests, into positive (n = 48) and negative (n = 31) groups. Whole-genome sequences comprising 4,176 scaffolds and 8,039,737 single nucleotide polymorphisms (SNPs) were used to trace the selection footprints for response to AMDV infection at the genome level. Window-based fixation index (Fst) and nucleotide diversity (θπ) statistics were estimated to compare positive and negative animals’ genomes. The top 1% of genomic windows that overlapped between the two statistics were considered potential regions under selection pressure. A total of 98 genomic regions harboring 33 candidate genes were detected as selective signals. Most of the identified genes were involved in the development and functions of the immune system (PPP3CA, SMAP2, TNFRSF21, SKIL, and AKIRIN2), musculoskeletal system (COL9A2, PPP1R9A, ANK2, AKAP9, and STRIT1), nervous system (ASCL1, ZFP69B, SLC25A27, MCF2, and SLC7A14), reproductive system (CAMK2D, GJB7, SSMEM1, C6orf163), liver (PAH and DPYD), and lung (SLC35A1). Gene-expression network analysis showed the interactions among 27 identified genes. Moreover, pathway enrichment analysis of the constructed gene network revealed significant oxytocin (KEGG: hsa04921) and GnRH signaling (KEGG: hsa04912) pathways, which are likely to be impaired by AMDV, leading to reduced fecundity in dams. These results provide a perspective on the genetic architecture of response to AD in American mink and novel insight into the pathogenesis of AMDV.
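The selection scan keeps only genomic windows that rank in the top 1% for both statistics. A minimal sketch of that intersection step follows, assuming per-window values have already been computed and that larger values are the more extreme ones for both statistics; the abstract does not state how the θπ comparison was oriented, so that direction, the function names and the toy data are all assumptions.

```python
from math import ceil


def top_windows(values, fraction):
    """Return the set of window indices whose values rank in the top fraction."""
    k = max(1, ceil(len(values) * fraction))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(ranked[:k])


def candidate_windows(fst, pi_stat, fraction=0.01):
    """Indices of windows in the top `fraction` for both statistics.

    fst and pi_stat are parallel lists: entry i of each describes the
    same genomic window.
    """
    if len(fst) != len(pi_stat):
        raise ValueError("both statistics must cover the same windows")
    return sorted(top_windows(fst, fraction) & top_windows(pi_stat, fraction))


# Toy data: 200 windows, so the top 1% is 2 windows per statistic.
fst = [float(i) for i in range(200)]   # windows 198 and 199 rank highest
pi_stat = [0.0] * 200
pi_stat[199] = 5.0                     # shared outlier
pi_stat[5] = 4.0                       # outlier for this statistic only
print(candidate_windows(fst, pi_stat))
```

Requiring agreement between two statistics is a common way to reduce false positives, since a window that is extreme for only one measure is dropped.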


MBRS-59. SINGLE-CELL WHOLE-GENOME SEQUENCING DISSECTS INTRA-TUMOURAL GENOMIC HETEROGENEITY AND CLONAL EVOLUTION IN CHILDHOOD MEDULLOBLASTOMA

Neuro-Oncology ◽

10.1093/neuonc/noaa222.563 ◽

2020 ◽

Vol 22 (Supplement_3) ◽

pp. iii408-iii408

Author(s):

Marina Danilenko ◽

Masood Zaka ◽

Claire Keeling ◽

Stephen Crosier ◽

Rafiqul Hussain ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Single Cell ◽

Genome Sequencing ◽

Copy Number ◽

Single Cell Analysis ◽

Mutational Analysis ◽

Single Cells ◽

Clonal Evolution ◽

Whole Genome ◽

Clinically Significant

Abstract Medulloblastomas harbor clinically-significant intra-tumoral heterogeneity for key biomarkers (e.g. MYC/MYCN, β-catenin). Recent studies have characterized transcriptional heterogeneity at the single-cell level; however, the underlying genomic copy number and mutational architecture remain to be resolved. We therefore sought to establish the intra-tumoural genomic heterogeneity of medulloblastoma at single-cell resolution. Copy number patterns were dissected by whole-genome sequencing in 1024 single cells isolated from multiple distinct tumour regions within 16 snap-frozen medulloblastomas, representing the major molecular subgroups (WNT, SHH, Group3, Group4) and genotypes (i.e. MYC amplification, TP53 mutation). Common copy number driver and subclonal events were identified, providing clear evidence of copy number evolution in medulloblastoma development. Moreover, subclonal whole-arm and focal copy number alterations covering important genomic loci (e.g. on chr10 of SHH patients) were detected in single tumour cells, yet undetectable at the bulk-tumor level. Spatial copy number heterogeneity was also common, with differences between clonal and subclonal events detected in distinct regions of individual tumours. Mutational analysis of the cells allowed dissection of spatial and clonal heterogeneity patterns for key medulloblastoma mutations (e.g. CTNNB1, TP53, SMARCA4, PTCH1) within our cohort. Integrated copy number and mutational analysis is underway to establish their inter-relationships and relative contributions to clonal evolution during tumourigenesis. In summary, single-cell analysis has enabled the resolution of common mutational and copy number drivers, alongside sub-clonal events and distinct patterns of clonal and spatial evolution, in medulloblastoma development. We anticipate these findings will provide a critical foundation for future improved biomarker selection, and the development of targeted therapies.


Genome Sequencing of Polydrug-, Multidrug-, and Extensively Drug-Resistant Mycobacterium tuberculosis Strains from South India

Microbiology Resource Announcements ◽

10.1128/mra.01388-18 ◽

2019 ◽

Vol 8 (12) ◽

Author(s):

Sivakumar Shanmugam ◽

Narender Kumar ◽

Dina Nair ◽

Mohan Natrajan ◽

Srikanth Prasad Tripathy ◽

...

Keyword(s):

Mycobacterium Tuberculosis ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

South India ◽

Sequence Data ◽

Whole Genome ◽

Drug Resistant ◽

Resistance Mutations ◽

Content Type ◽

Extensively Drug Resistant

The genomes of 16 clinical Mycobacterium tuberculosis isolates were subjected to whole-genome sequencing to identify mutations related to resistance to one or more anti-Mycobacterium drugs. The sequence data will help in understanding the genomic characteristics of M. tuberculosis isolates and their resistance mutations prevalent in South India.


Parastagonospora nodorum and Related Species in Western Canada: Genetic Variability and Effector Genes

Phytopathology ◽

10.1094/phyto-05-20-0207-r ◽

2020 ◽

Vol 110 (12) ◽

pp. 1946-1958

Author(s):

Mohamed Hafez ◽

Ryan Gourlie ◽

Therese Despins ◽

Thomas K. Turkington ◽

Timothy L. Friesen ◽

...

Keyword(s):

Evolutionary Relationship ◽

Haplotype Network ◽

Bipolaris Sorokiniana ◽

Western Canada ◽

Pyrenophora Tritici Repentis ◽

Septoria Nodorum Blotch ◽

Effector Genes ◽

Parastagonospora Nodorum ◽

Almost All ◽

Necrotrophic Effectors

Parastagonospora nodorum is an important fungal pathogen that causes Septoria nodorum blotch (SNB) in wheat. This pathogen produces several necrotrophic effectors that act as virulence factors; three have been cloned, SnToxA, SnTox1, and SnTox3. In this study, P. nodorum and its sister species P. avenaria f. tritici (Pat1) were isolated from wheat node and grain samples collected from distant sites in western Canada during 2018. The presence of effector genes and associated haplotypes were determined by PCR and sequence analysis. An internal transcribed spacer-restriction fragment length polymorphism test was developed to distinguish between leaf spotting pathogens (P. nodorum, Pat1, Pyrenophora tritici-repentis, and Bipolaris sorokiniana). P. nodorum was mainly recovered from wheat nodes and to a lesser extent from the grains, while Pat1 was exclusively isolated from grain samples. The effector genes were present in almost all P. nodorum isolates, with the ToxA haplotype 5 (H5) being most prevalent, while a novel ToxA haplotype (denoted here H21) is reported for the first time. In Pat1, only combinations of SnTox1 and SnTox3 genes were present. A ToxA haplotype network was also constructed to assess the evolutionary relationship among globally found haplotypes to date. Finally, cultivars representing wheat development in Canada for the last century were tested for sensitivity to Sn-effectors and for the presence of Tsn1, the ToxA sensitivity gene. Of the tested cultivars, 32.9 and 56.9% were sensitive to SnTox1 and SnTox3, respectively, and Tsn1 was present in 59% of the cultivars. In conclusion, P. nodorum and Pat1 were prevalent wheat pathogens in Canada with a potential tissue-specific colonization capacity, while producing necrotrophic effectors to which wheat is sensitive.


Determination of evolutionary units in European representatives of the crab genus Pilumnus

Open Life Sciences ◽

10.2478/s11535-013-0242-5 ◽

2014 ◽

Vol 9 (1) ◽

pp. 104-113 ◽

Cited By ~ 2

Author(s):

Christoph Schubart ◽

Bianca Aichinger

Keyword(s):

Sequence Data ◽

Mitochondrial Gene ◽

Haplotype Network ◽

Mtdna Sequences ◽

Dna Sequence Data ◽

Coastal Marine ◽

Eastern Atlantic Ocean ◽

Genetic Clusters ◽

Evolutionary Units

Abstract Bristle crabs of the genus Pilumnus (Brachyura: Heterotremata: Pilumnidae) are common inhabitants of European waters. They are easily identifiable as a genus, but with the exception of P. inermis, intrageneric classification turns out to be quite complex. There is no general agreement on the number and distinction of species. Therefore, this genus is well-suited for comparative molecular studies. Specimens of the Pilumnus hirtellus complex, here defined as including Pilumnus hirtellus, P. villosissimus, P. spinifer, P. aestuarii, and an undescribed species, were gathered from throughout the Mediterranean Sea and the eastern Atlantic Ocean. DNA sequence data were obtained from the barcoding region of the cytochrome oxidase 1 mitochondrial gene and used for reconstruction of a phylogenetic tree and a haplotype network. The morphology of the gastric ossicles was compared in search of distinguishing characters. Our results give evidence for five genetic clusters within the P. hirtellus complex. There is negligible geographic variation within these clusters. Unambiguous mtDNA sequences within morphologically variable local populations argue against possible hybridization. The evolutionary units encountered here are relatively young and may allow the study of ongoing processes of morphological, genetic, and ecological differentiation, leading to speciation and radiations in the coastal marine environment.
