Pango lineage designation and assignment using SARS-CoV-2 spike gene nucleotide sequences

2021 ◽  
Author(s):  
Aine N O'Toole ◽  
Oliver Pybus ◽  
Michael E Abram ◽  
Elizabeth J Kelly ◽  
Andrew Rambaut

More than 2 million SARS-CoV-2 genome sequences have been generated and shared since the start of the COVID-19 pandemic and constitute a vital information source that informs outbreak control, disease surveillance, and public health policy. The Pango dynamic nomenclature is a popular system for classifying and naming genetically distinct lineages of SARS-CoV-2, including variants of concern, and is based on the analysis of complete or near-complete virus genomes. However, for several reasons, nucleotide sequences may be generated that cover only the spike gene of SARS-CoV-2. It is therefore important to understand how much information about Pango lineage status is contained in spike-only nucleotide sequences. Here we explore how Pango lineages might be reliably designated and assigned to spike-only nucleotide sequences. We survey the genetic diversity of such sequences and investigate the information they contain about Pango lineage status. Although many lineages, including the main variants of concern, can be identified clearly using spike-only sequences, some spike-only sequences are shared among tens or hundreds of Pango lineages. To facilitate the classification of SARS-CoV-2 lineages using subgenomic sequences, we introduce the notion of designating such sequences to a lineage set, which represents the range of Pango lineages that are consistent with the observed mutations in a given spike sequence. These data provide a foundation for the development of software tools that can assign newly generated spike nucleotide sequences to Pango lineage sets.
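The lineage-set idea in this abstract can be illustrated with a minimal sketch. It assumes that each lineage is summarized by one set of spike mutations and that a spike sequence is consistent with a lineage only when the two mutation sets match exactly; the abstract does not specify either detail. The function names, lineage labels, and mutation labels below are hypothetical placeholders, not real Pango data or any published tool's API.

```python
from collections import defaultdict


def build_lineage_sets(lineage_to_spike):
    """Group lineages that share an identical spike mutation profile.

    lineage_to_spike maps a lineage label to the set of spike mutations
    assumed to define it (hypothetical input format).
    """
    sets_by_profile = defaultdict(set)
    for lineage, mutations in lineage_to_spike.items():
        sets_by_profile[frozenset(mutations)].add(lineage)
    return dict(sets_by_profile)


def assign_lineage_set(observed_mutations, sets_by_profile):
    """Return every lineage whose spike profile equals the observed one.

    An empty set means no known profile matches exactly.
    """
    return sets_by_profile.get(frozenset(observed_mutations), set())


# Toy, made-up labels: two lineages are indistinguishable on spike alone.
toy_profiles = build_lineage_sets({
    "lineage_1": {"mutA", "mutB"},
    "lineage_2": {"mutA", "mutB"},
    "lineage_3": {"mutC"},
})

print(sorted(assign_lineage_set({"mutB", "mutA"}, toy_profiles)))
```

In this toy case the query returns both lineage_1 and lineage_2, which is the situation the abstract describes: a spike-only sequence is assigned to a set of candidate lineages rather than to a single lineage.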

2012 ◽  
Vol 93 (11) ◽  
pp. 2387-2398 ◽  
Author(s):  
Samuel R. Dominguez ◽  
Gregory E. Sims ◽  
David E. Wentworth ◽  
Rebecca A. Halpin ◽  
Christine C. Robinson ◽  
...  

This study compared the complete genome sequences of 16 NL63 strain human coronaviruses (hCoVs) from respiratory specimens of paediatric patients with respiratory disease in Colorado, USA, and characterized the epidemiology and clinical characteristics associated with circulating NL63 viruses over a 3-year period. From 1 January 2009 to 31 December 2011, 92 of 9380 respiratory specimens were found to be positive for NL63 RNA by PCR, an overall prevalence of 1 %. NL63 viruses were circulating during all 3 years, but there was considerable yearly variation in prevalence and the month of peak incidence. Phylogenetic analysis comparing the genome sequences of the 16 Colorado NL63 viruses with those of the prototypical hCoV-NL63 and three other NL63 viruses from the Netherlands demonstrated that there were three genotypes (A, B and C) circulating in Colorado from 2005 to 2010, and evidence of recombination between virus strains was found. Genotypes B and C co-circulated in Colorado in 2005, 2009 and 2010, but genotype A circulated only in 2005 when it was the predominant NL63 strain. Genotype C represents a new lineage that has not been described previously. The greatest variability in the NL63 virus genomes was found in the N-terminal domain (NTD) of the spike gene (nt 1–600, aa 1–200). Ten different amino acid sequences were found in the NTD of the spike protein among these NL63 strains and the 75 partial published sequences of NTDs from strains found at different times throughout the world.


Author(s):  
Viola Kurm ◽  
Ilse Houwers ◽  
Claudia E. Coipan ◽  
Peter Bonants ◽  
Cees Waalwijk ◽  
...  

Identification and classification of members of the Ralstonia solanacearum species complex (RSSC) is challenging due to the heterogeneity of this complex. Whole genome sequence data of 225 strains were used to classify strains based on average nucleotide identity (ANI) and multilocus sequence analysis (MLSA). Based on the ANI score (>95%), 191 out of 192 (99.5%) RSSC strains could be grouped into the three species R. solanacearum, R. pseudosolanacearum, and R. syzygii, and into the four phylotypes within the RSSC (I, II, III, and IV). R. solanacearum phylotype II could be split into two groups (IIA and IIB), of which IIB clustered into three subgroups (IIBa, IIBb and IIBc). This division by ANI was in accordance with MLSA. The IIB subgroups found by ANI and MLSA also differed in the number of SNPs in the primer and probe sites of various assays. An in-silico analysis of eight TaqMan and 11 conventional PCR assays was performed using the whole genome sequences. Based on this analysis, several cases of potential false positives or false negatives can be expected when these assays are used for their intended target organisms. Two TaqMan assays and two PCR assays targeting the 16S rDNA sequence should be able to detect all phylotypes of the RSSC. We conclude that the increasing availability of whole genome sequences is not only useful for classification of strains, but also shows potential for selection and evaluation of clade-specific nucleic acid-based amplification methods within the RSSC.
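The ANI-based grouping mentioned in this abstract can be sketched as threshold clustering. This is only an illustration under stated assumptions: the abstract gives the >95% cut-off but not the clustering procedure, so single-linkage grouping with a union-find structure is an assumption here, and the strain labels and ANI values are invented toy numbers.

```python
def group_by_ani(strains, pairwise_ani, threshold=95.0):
    """Single-linkage grouping: strains joined by any ANI above threshold
    end up in the same group (assumed procedure, not the authors' method)."""
    parent = {s: s for s in strains}

    def find(s):
        # Follow parent links to the group representative, compressing the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), score in pairwise_ani.items():
        if score > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in strains:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in groups.values())


# Toy, made-up ANI percentages between four hypothetical strains.
toy_strains = ["s1", "s2", "s3", "s4"]
toy_ani = {
    ("s1", "s2"): 98.7,
    ("s1", "s3"): 91.2,
    ("s1", "s4"): 91.0,
    ("s2", "s3"): 90.8,
    ("s2", "s4"): 90.5,
    ("s3", "s4"): 96.1,
}

toy_groups = group_by_ani(toy_strains, toy_ani)
print(toy_groups)
```

With these toy values, s1 and s2 form one group and s3 and s4 another, because only those pairs exceed the 95% threshold.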


2016 ◽  
Vol 4 (2) ◽  
Author(s):  
Akira Yusa ◽  
Nozomu Iwabuchi ◽  
Hiroaki Koinuma ◽  
Takuya Keima ◽  
Yutaro Neriya ◽  
...  

Hydrangea ringspot virus (HdRSV) is a plant RNA virus, naturally infecting Hydrangea macrophylla. Here, we report the first genomic sequences of two HdRSV isolates from hydrangea plants in Japan. The overall nucleotide sequences of these Japanese isolates were 96.0 to 96.3% identical to those of known European isolates.


2021 ◽  
Vol 15 ◽  
Author(s):  
Gianluca Susi ◽  
Luis F. Antón-Toro ◽  
Fernando Maestú ◽  
Ernesto Pereda ◽  
Claudio Mirasso

The recent “multi-neuronal spike sequence detector” (MNSD) architecture integrates the weight- and delay-adjustment methods by combining heterosynaptic plasticity with the neurocomputational feature spike latency, representing a new opportunity to understand the mechanisms underlying biological learning. Unfortunately, the range of problems to which this topology can be applied is limited because of the low cardinality of the parallel spike trains that it can process, and the lack of a visualization mechanism to understand its internal operation. We present here the nMNSD structure, which is a generalization of the MNSD to any number of inputs. The mathematical framework of the structure is introduced, together with the “trapezoid method,” a reduced method to analyze the recognition mechanism operated by the nMNSD in response to a specific input parallel spike train. We apply the nMNSD to a classification problem previously addressed with the classical MNSD by the same authors, showing the new possibilities the nMNSD opens, with an associated improvement in classification performance. Finally, we benchmark the nMNSD on the classification of static inputs (MNIST database), obtaining state-of-the-art accuracies together with advantages in time and energy efficiency when compared to similar classification methods.


2021 ◽  
pp. 124-131
Author(s):  
Marina Zyryanova

This article presents a classification of fakes based on the information source that underlies the occurrence of false information. The study was performed on the coronavirus fakes that spread in the Russian Federation in March 2020, at the beginning of the coronavirus pandemic in our country. For the analysis, only those fakes were taken which the administrations of the Russian regions promptly denied in their official accounts on social networks. As a result, only those fakes that caused the greatest public response were selected for analysis. In this article, the following types of fakes are distinguished: folklore, symmetric, interpretive, additional, and conspiracy. Folklore fakes reproduce the same motifs in various variations and are associated with well-established ideas and stereotypes in the mass consciousness. Symmetric fakes partially or completely transfer true facts from one territory (country, region) to another. They can also transfer information from one person (or structure) to another. Interpretive fakes are associated with the incorrect interpretation by individuals of events, disseminated information, or decisions made by the authorities. Additional fakes continue, for a short period of time, the theme of previously planted disinformation. Conspiracy fakes are associated with conspiracy theories and are characterized by being planted across a wide territory and a large audience. This classification is not exhaustive and can be supplemented as new fakes appear and are studied. The article also gives recommendations on how to refute a particular fake, depending on the type to which it belongs.


2018 ◽  
Vol 8 (2) ◽  
pp. 20170039 ◽  
Author(s):  
Zhan Li ◽  
Michael Schaefer ◽  
Alan Strahler ◽  
Crystal Schaaf ◽  
David Jupp

The Dual-Wavelength Echidna Lidar (DWEL), a full waveform terrestrial laser scanner (TLS), has been used to scan a variety of forested and agricultural environments. From these scanning campaigns, we summarize the benefits and challenges given by DWEL's novel coaxial dual-wavelength scanning technology, particularly for the three-dimensional (3D) classification of vegetation elements. Simultaneous scanning at both 1064 nm and 1548 nm by DWEL instruments provides a new spectral dimension to TLS data that joins the 3D spatial dimension of lidar as an information source. Our point cloud classification algorithm explores the utilization of both spectral and spatial attributes of individual points from DWEL scans and highlights the strengths and weaknesses of each attribute domain. The spectral and spatial attributes for vegetation element classification each perform better in different parts of vegetation (canopy interior, fine branches, coarse trunks, etc.) and under different vegetation conditions (dead or live, leaf-on or leaf-off, water content, etc.). These environmental characteristics of vegetation, convolved with the lidar instrument specifications and lidar data quality, result in the actual capabilities of spectral and spatial attributes to classify vegetation elements in 3D space. The spectral and spatial information domains thus complement each other in the classification process. The joint use of both not only enhances the classification accuracy but also reduces its variance across the multiple vegetation types we have examined, highlighting the value of the DWEL as a new source of 3D spectral information. Wider deployment of the DWEL instruments is in practice currently held back by challenges in instrument development and the demands of data processing required by coaxial dual- or multi-wavelength scanning. But the simultaneous 3D acquisition of both spectral and spatial features, offered by new multispectral scanning instruments such as the DWEL, opens doors to study biophysical and biochemical properties of forested and agricultural ecosystems at more detailed scales.


Author(s):  
Luis M. Rodriguez-R ◽  
Ramon Rosselló-Móra ◽  
Konstantinos T. Konstantinidis

This book chapter summarizes the major findings from genome-based taxonomic studies of the past two decades, briefly describes the major genome-based approaches currently available for species identification and classification, with a special focus on the 'uncultivated majority' and associated limitations, and outlines future directions towards a truly genome-based taxonomy for prokaryotes that will equally encompass cultured and uncultivated taxa. Importantly, the need for a system to catalogue uncultivated taxa is very urgent, because the genomes and ecological/functional data that are becoming available are already overwhelming, and alphanumeric identifiers and synonyms are creating confusion of Babylonian dimensions.


2020 ◽  
Vol 9 (9) ◽  
pp. 499
Author(s):  
Melanie Brauchler ◽  
Johannes Stoffels

Up-to-date information about the type and spatial distribution of forests is an essential element in both sustainable forest management and environmental monitoring and modelling. The OpenStreetMap (OSM) database contains vast amounts of spatial information on natural features, including forests (landuse=forest). The OSM data model includes descriptive tags for its contents, e.g., leaf type for forest areas (e.g., leaf_type=broadleaved). Although the leaf type tag is common, the vast majority of forest areas are tagged with the leaf type mixed, amounting to 87% of the total area of landuse=forest in the OSM database. These areas comprise an important information source to derive and update forest type maps. In order to leverage this information content, a methodology for stratification of leaf types inside these areas has been developed using image segmentation on aerial imagery and subsequent classification of leaf types. The presented methodology achieves an overall classification accuracy of 85% for the leaf types needleleaved and broadleaved in the selected forest areas. The resulting stratification demonstrates that, through approaches such as the one presented, the derivation of forest type maps from OSM would be feasible with an extended and improved methodology. It also suggests that an improved methodology might be able to provide updates of leaf type to the OSM database with contributor participation.


1989 ◽  
Vol 86 (18) ◽  
pp. 7059-7062 ◽  
Author(s):  
E. Orito ◽  
M. Mizokami ◽  
Y. Ina ◽  
E. N. Moriyama ◽  
N. Kameshima ◽  
...  
