Large swathes of dark-matter (not-sequenced) in the SARS-Cov2 spike protein in significant number of samples in GISAID - probably due to ARTIC-primer artifacts - which will mask real mutations in these genomic regions, and where/when some mutations arose.

2021 ◽  
Author(s):  
Sandeep Chakraborty

The Covid19 pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-Cov2 [1, 2]), has caused significant mortality globally [3], along with severe socio-economic damage [4, 5]. Many vaccines have been given emergency authorization in different countries [6, 7]. Mutations raise concerns about the efficacy of these vaccines [8] and about re-infections [9]. Genome sequencing has been deployed globally to analyze these variants [10, 11]. Among different methods, amplicon sequencing using a set of ~100 primers (ARTIC) was adopted in early Jan 2020 (https://www.protocols.io/view/ncov-2019-sequencing-protocol-bbmuik6w). However, subsequent studies found that, while clinical samples with relatively high viral loads showed no amplification bias, samples with lower viral loads showed a significant decrease in the abundance of several amplicons [12, 13]. This led to newer versions of these primers, the current one being V3. Here, I report large swathes of *dark matter* (not sequenced) in multiple parts of the spike protein. These exact protein sequences occur in different countries in different time-frames, up to the latest data submitted from South Africa about the B.1.351 variant (Accid:PRJNA694014) [14]. While these are ARTIC-primer artifacts, real mutations in these genomic regions will escape detection. This will also give a wrong estimate of when, and in which country, certain mutations actually arose in the population.
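As an illustration of how such unsequenced stretches could be located, the minimal sketch below scans a translated protein sequence for long runs of the placeholder residue `X` (the usual translation of codons that contain `N`). The function name `find_masked_runs`, the `min_len` threshold and the toy sequence are assumptions made for illustration; they are not taken from the abstract and do not reproduce the author's pipeline.

```python
import re
from typing import List, Tuple


def find_masked_runs(protein: str, mask_char: str = "X", min_len: int = 5) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of runs of mask_char
    that are at least min_len residues long."""
    # Build a pattern such as "X{5,}" (five or more consecutive X residues).
    pattern = re.compile(re.escape(mask_char) + "{%d,}" % min_len)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(protein.upper())]


# Toy sequence (hypothetical, not a real spike fragment): one long masked run
# and one short run that falls below the min_len threshold.
toy = "MFVFLVLLPL" + "X" * 12 + "VSSQCVNLTT" + "XXX" + "RTQLPPAYTN"
print(find_masked_runs(toy))
```

Comparing the reported intervals across samples from different countries and dates would show whether the same stretches are missing repeatedly, which is the pattern one would expect from a primer artifact rather than from biology.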

Author(s):  
Sunil Raghav ◽  
Arup Ghosh ◽  
Jyotirmayee Turuk ◽  
Sugandh Kumar ◽  
Atimukta Jha ◽  
...  

Abstract COVID-19, which emerged as a global pandemic, is caused by the SARS-CoV-2 virus. Analysis of the virus genome during the spread of the disease reveals its evolution and transmission. We performed whole-genome sequencing of 225 clinical strains from the state of Odisha in eastern India using ARTIC protocol-based amplicon sequencing. Phylogenetic analysis identified the presence of all five reported clades (19A, 19B, 20A, 20B and 20C) in the population. The analyses revealed two major routes for the introduction of the disease into India, i.e. Europe and South-east Asia, followed by local transmission. Interestingly, the 19B clade was found to be much more prevalent in our sequenced genomes (17%) than in other genomes reported so far from India. The haplogroup analysis for clades showed that 19A and 19B evolved in parallel, whereas 20B and 20C appeared to evolve from 20A. The majority of the 19A and 19B clades were present in cases that migrated from Gujarat state in India, suggesting it to be one of the major initial points of disease transmission in India during March and April. We found that, over time, the 20A and 20B clades, which originated from central Europe, evolved drastically. At the same time, the 20A and 20B clades showed selection of four common mutations, i.e. 241 C>T (5’UTR), P323L in RdRP, F942F in NSP3 and D614G in the spike protein. We found that the evolution of the G614 mutation was increasingly concordant with viral load in clinical samples, as evident from decreased Ct values of the spike and Orf1ab genes in qPCR. Molecular modelling and docking analysis identified that the D614G mutation enhanced the interaction of spike with the TMPRSS2 protease, which could impact the shedding of the S1 domain and the infectivity of the virus in host cells.


Author(s):  
Kentaro Itokawa ◽  
Tsuyoshi Sekizuka ◽  
Masanori Hashino ◽  
Rina Tanaka ◽  
Makoto Kuroda

Abstract Since December 2019, the coronavirus disease 2019 (COVID-19), caused by the novel coronavirus SARS-CoV-2, has rapidly spread to almost every nation in the world. Soon after the pandemic was recognized by epidemiologists, a group of biologists comprising the ARTIC Network devised a multiplexed polymerase chain reaction (PCR) protocol and primer set for targeted whole-genome amplification of SARS-CoV-2. The ARTIC primer set amplifies 98 amplicons, split across only two PCRs, that cover nearly the entire viral genome. The original primer set and protocol showed a fairly small amplification bias when clinical samples with relatively high viral loads were used. However, when a sample’s viral load was low, several amplicons, especially amplicons 18 and 76, exhibited low coverage or complete dropout. We have determined that these dropouts were due to dimer formation between the forward primer for amplicon 18, 18_LEFT, and the reverse primer for amplicon 76, 76_RIGHT. Replacing 76_RIGHT with an alternatively designed primer was sufficient to produce a drastic improvement in the coverage of both amplicons. Based on this result, we replaced 12 primers in total in the ARTIC primer set that were predicted to be involved in 14 primer interactions. The resulting primer set, version N1 (NIID-1), exhibits improved overall coverage compared to the ARTIC Network’s original (V1) and modified (V3) primer sets.


Science ◽  
2021 ◽  
Vol 372 (6539) ◽  
pp. eabg0821 ◽  
Author(s):  
Katrina A. Lythgoe ◽  
Matthew Hall ◽  
Luca Ferretti ◽  
Mariateresa de Cesare ◽  
George MacIntyre-Cockett ◽  
...  

Extensive global sampling and sequencing of the pandemic virus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have enabled researchers to monitor its spread and to identify concerning new variants. Two important determinants of variant spread are how frequently they arise within individuals and how likely they are to be transmitted. To characterize within-host diversity and transmission, we deep-sequenced 1313 clinical samples from the United Kingdom. SARS-CoV-2 infections are characterized by low levels of within-host diversity when viral loads are high and by a narrow bottleneck at transmission. Most variants are either lost or occasionally fixed at the point of transmission, with minimal persistence of shared diversity, patterns that are readily observable on the phylogenetic tree. Our results suggest that transmission-enhancing and/or immune-escape SARS-CoV-2 variants are likely to arise infrequently but could spread rapidly if successfully transmitted.


2021 ◽  
Vol 17 (1) ◽  
Author(s):  
Jacqueline King ◽  
Anne Pohlmann ◽  
Kamila Dziadek ◽  
Martin Beer ◽  
Kerstin Wernike

Abstract Background: As a global ruminant pathogen, bovine viral diarrhea virus (BVDV) is responsible for the disease Bovine Viral Diarrhea with a variety of clinical presentations and severe economic losses worldwide. Classified within the Pestivirus genus, the species Pestivirus A and B (syn. BVDV-1, BVDV-2) are genetically differentiated into 21 BVDV-1 and four BVDV-2 subtypes. Commonly, the 5’ untranslated region and the Npro protein are utilized for subtyping. However, the genetic variability of BVDV leads to limitations in former studies analyzing genome fragments in comparison to a full-genome evaluation. Results: To enable rapid and accessible whole-genome sequencing of both BVDV-1 and BVDV-2 strains, nanopore sequencing of twelve representative BVDV samples was performed on amplicons derived through a tiling PCR procedure. Covering a multitude of subtypes (1b, 1d, 1f, 2a, 2c), sample matrices (plasma, EDTA blood and ear notch), viral loads (Cq-values 19–32) and species (cattle and sheep), ten of the twelve samples produced whole genomes, with two low titre samples presenting 96% genome coverage. Conclusions: Further phylogenetic analysis of the novel sequences emphasizes the necessity of whole-genome sequencing to identify novel strains and supplement lacking sequence information in public repositories. The proposed amplicon-based sequencing protocol allows rapid, inexpensive and accessible obtainment of complete BVDV genomes.


Author(s):  
Harsha Doddapaneni ◽  
Sara Javornik Cregeen ◽  
Richard Sucgang ◽  
Qingchang Meng ◽  
Xiang Qin ◽  
...  

Abstract The newly emerged and rapidly spreading SARS-CoV-2 causes coronavirus disease 2019 (COVID-19). To facilitate a deeper understanding of the viral biology we developed a capture sequencing methodology to generate SARS-CoV-2 genomic and transcriptome sequences from infected patients. We utilized an oligonucleotide probe-set representing the full-length genome to obtain both genomic and transcriptome (subgenomic open reading frames [ORFs]) sequences from 45 SARS-CoV-2 clinical samples with varying viral titers. For samples with higher viral loads (cycle threshold value under 33, based on the CDC qPCR assay) complete genomes were generated. Analysis of junction reads revealed regions of differential transcriptional activity and provided evidence of expression of ORF10. Heterogeneous allelic frequencies along the 20 kb ORF1ab gene suggested the presence of a defective interfering viral RNA species subpopulation in one sample. The associated workflow is straightforward, and hybridization-based capture offers an effective and scalable approach for sequencing SARS-CoV-2 from patient samples.


2018 ◽  
Author(s):  
David H Wyllie ◽  
Nicholas Sanderson ◽  
Richard Myers ◽  
Tim Peto ◽  
Esther Robinson ◽  
...  

ABSTRACT Contact tracing requires reliable identification of closely related bacterial isolates. When we noticed the reporting of artefactual variation between M. tuberculosis isolates during routine next generation sequencing of Mycobacterium spp, we investigated its basis in 2,018 consecutive M. tuberculosis isolates. In the routine process used, clinical samples were decontaminated and inoculated into broth cultures; from positive broth cultures DNA was extracted, sequenced, reads mapped, and consensus sequences determined. We investigated the process of consensus sequence determination, which selects the most common nucleotide at each position. Having determined the high-quality read depth and depth of minor variants across 8,006 M. tuberculosis genomic regions, we quantified the relationship between the minor variant depth and the amount of non-Mycobacterial bacterial DNA, which originates from commensal microbes killed during sample decontamination. In the presence of non-Mycobacterial bacterial DNA, we found significant increases in minor variant frequencies of more than 1.5 fold in 242 regions covering 5.1% of the M. tuberculosis genome. Included within these were four high variation regions strongly influenced by the amount of non-Mycobacterial bacterial DNA. Excluding these four regions from pairwise distance comparisons reduced biologically implausible variation from 5.2% to 0% in an independent validation set derived from 226 individuals. Thus, we have demonstrated an approach identifying critical genomic regions contributing to clinically relevant artefactual variation in bacterial similarity searches. The approach described monitors the outputs of the complex multi-step laboratory and bioinformatics process, allows periodic process adjustments, and will have application to quality control of routine bacterial genomics.
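The consensus step described above (taking the most common nucleotide at each position) and the minor variant frequency it leaves behind can be sketched as follows. This is a simplified illustration under stated assumptions: the input is a hypothetical list of per-position base calls already filtered for quality, and the function name `consensus_and_minor_fraction` is invented here, not part of the authors' pipeline.

```python
from collections import Counter
from typing import List, Tuple


def consensus_and_minor_fraction(columns: List[str]) -> Tuple[str, List[float]]:
    """columns[i] holds the high-quality base calls at position i, one character per read.

    Returns the consensus sequence and, per position, the fraction of reads
    that disagree with the consensus base (the minor variant frequency).
    """
    consensus = []
    minor_fractions = []
    for calls in columns:
        if not calls:
            # No usable reads at this position: emit N and no minor variants.
            consensus.append("N")
            minor_fractions.append(0.0)
            continue
        base, top = Counter(calls).most_common(1)[0]
        consensus.append(base)
        minor_fractions.append(1 - top / len(calls))
    return "".join(consensus), minor_fractions


# Hypothetical pileup: four positions, the last one without coverage.
seq, minor = consensus_and_minor_fraction(["AAAA", "AAAC", "GGGT", ""])
print(seq, minor)
```

Positions whose minor variant frequency rises with the amount of contaminating DNA are the kind of region the abstract proposes to flag and exclude from pairwise distance comparisons.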


2021 ◽  
Author(s):  
Pritha Ghosh ◽  
Rohit Suratekar ◽  
Michiel J.M. Niesen ◽  
Praveen Anand ◽  
Gregory Donadio ◽  
...  

The highly contagious Delta variant of SARS-CoV-2 has emerged as the new dominant global strain, and reports of reduced effectiveness of COVID-19 vaccines against the Delta variant are highly concerning. While there has been extensive focus on understanding the amino acid mutations in the Delta variant's Spike protein, the mutational landscape of the rest of the SARS-CoV-2 proteome (25 proteins) remains poorly understood. To this end, we performed a systematic analysis of mutations in all the SARS-CoV-2 proteins from nearly 2 million SARS-CoV-2 genomes from 176 countries/territories. Six highly-prevalent missense mutations in the viral life cycle-associated Membrane (I82T), Nucleocapsid (R203M, D377Y), NS3 (S26L), and NS7a (V82A, T120I) proteins are almost exclusive to the Delta variant compared to other variants of concern (mean prevalence across genomes: Delta = 99.74%, Alpha = 0.06%, Beta = 0.09%, Gamma = 0.22%). Furthermore, we find that the Delta variant harbors a more diverse repertoire of mutations across countries compared to the previously dominant Alpha variant (cosine similarity: mean_Alpha = 0.94, S.D._Alpha = 0.05; mean_Delta = 0.86, S.D._Delta = 0.1; Cohen's d_Alpha-Delta = 1.17, p-value < 0.001). Overall, our study underscores the high diversity of the Delta variant between countries and identifies a list of targetable amino acid mutations in the Delta variant's proteome for probing the mechanistic basis of pathogenic features such as high viral loads, high transmissibility, and reduced susceptibility against neutralization by vaccines.
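The cosine similarity used above to compare mutation repertoires between countries can be sketched as follows. The abstract does not state how the vectors were built, so this example assumes each country is represented by a mapping from mutation name to prevalence; the function name and all prevalence values are hypothetical.

```python
import math
from typing import Dict


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse mutation-prevalence vectors.

    Mutations absent from one mapping are treated as prevalence 0.
    """
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here for empty vectors.
    return dot / (norm_a * norm_b)


# Hypothetical prevalences for two countries (illustrative values only).
country_1 = {"M:I82T": 0.99, "N:R203M": 0.98, "N:D377Y": 0.97}
country_2 = {"M:I82T": 0.99, "N:R203M": 0.95, "NS3:S26L": 0.40}
print(round(cosine_similarity(country_1, country_2), 3))
```

A value near 1 means the two countries carry nearly the same mutation profile, so a lower mean for Delta than for Alpha is what the abstract interprets as a more diverse repertoire.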


Author(s):  
Yasuhisa Iwao ◽  
Shuichi Mori ◽  
Manabu Ato ◽  
Noboru Nakata

Mycobacterium leprae is the predominant cause of leprosy worldwide, and its genotypes can be classified into four single nucleotide polymorphism (SNP) types and 16 subtypes. Determining M. leprae drug resistance and genotype is typically done by PCR and Sanger DNA sequencing, which require substantial effort. Here we describe a rapid method involving multiplex PCR in combination with nested amplification and next generation sequence analysis that allows simultaneous determination of M. leprae drug resistance and SNP genotype directly from clinical specimens. We used this method to analyze clinical samples from two paucibacillary, nine multibacillary, and six type-undetermined leprosy patients. Regions in folP1, rpoB, gyrA, and gyrB that determine drug resistance and those for 84 SNP-InDels in the M. leprae genome were amplified from clinical samples and their sequences were determined. The results showed that seven samples were subtype 1A, three were 1D, and seven were 3K. Three samples of subtype 3K had a folP1 mutation. The method may allow more rapid genetic analyses of M. leprae in clinical samples.


Author(s):  
Ke Wang ◽  
Wei Chen ◽  
Zheng Zhang ◽  
Yongqiang Deng ◽  
Jian-Qi Lian ◽  
...  

Abstract In the face of the ongoing battle against COVID-19 and the rapid evolution of SARS-CoV-2, no specific and effective drugs for treating this disease have been reported to date. Angiotensin-converting enzyme 2 (ACE2), a receptor of SARS-CoV-2, mediates virus infection by binding to the spike protein. Although ACE2 is expressed in the lung, kidney, and intestine, its expression levels are rather low, especially in the lung. Considering the great infectivity of COVID-19, we speculate that SARS-CoV-2 may depend on other routes to facilitate its infection. Here, we report the first discovery of an interaction between the host cell receptor CD147 and the SARS-CoV-2 spike protein. The loss of CD147, or blocking CD147 in Vero E6 and BEAS-2B cell lines with the anti-CD147 antibody Meplazumab, inhibits SARS-CoV-2 amplification. Expression of human CD147 allows virus entry into non-susceptible BHK-21 cells, which can be neutralized by the CD147 extracellular fragment. Viral loads are detectable in the lungs of human CD147 (hCD147) mice infected with SARS-CoV-2, but not in those of virus-infected wild type mice. Interestingly, virions are observed in lymphocytes of lung tissue from a COVID-19 patient. Human T cells, which are naturally deficient in ACE2, can be infected with SARS-CoV-2 pseudovirus in a dose-dependent manner, and this infection is specifically inhibited by Meplazumab. Furthermore, CD147 mediates virus entry into host cells by endocytosis. Together, our study reveals a novel virus entry route, CD147-spike protein, which provides an important target for developing specific and effective drugs against COVID-19.


Processes ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. 1425
Author(s):  
Xin Xie ◽  
Tamara Gjorgjieva ◽  
Zaynoun Attieh ◽  
Mame Massar Dieng ◽  
Marc Arnoux ◽  
...  

A major challenge in controlling the COVID-19 pandemic is the high false-negative rate of the commonly used RT-PCR methods for SARS-CoV-2 detection in clinical samples. Accurate detection is particularly challenging in samples with low viral loads that are below the limit of detection (LoD) of standard one- or two-step RT-PCR methods. In this study, we implemented a three-step approach for SARS-CoV-2 detection and quantification that employs reverse transcription, targeted cDNA preamplification, and nano-scale qPCR based on a commercially available microfluidic chip. Using SARS-CoV-2 synthetic RNA and plasmid controls, we demonstrate that the addition of a preamplification step enhances the LoD of this microfluidic RT-qPCR by 1000-fold, enabling detection below 1 copy/µL. We applied this method to analyze 182 clinical NP swab samples previously diagnosed using a standard RT-qPCR protocol (91 positive, 91 negative) and demonstrate reproducible and quantitative detection of SARS-CoV-2 over five orders of magnitude (<1 to 10^6 viral copies/µL). Crucially, we detect SARS-CoV-2 with relatively low viral load estimates (<1 to 40 viral copies/µL) in 17 samples with negative clinical diagnosis, indicating a potential false-negative rate of 18.7% by clinical diagnostic procedures. In summary, this three-step nano-scale RT-qPCR method can robustly detect SARS-CoV-2 in samples with relatively low viral loads (<1 viral copy/µL) and has the potential to reduce the false-negative rate of standard RT-PCR-based diagnostic tests for SARS-CoV-2 and other viral infections.
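The 18.7% figure above follows directly from the counts given in the abstract: 17 of the 91 clinically negative samples were found positive by the three-step method. A minimal arithmetic check, with a helper name (`reclassified_percentage`) that is introduced here purely for illustration:

```python
def reclassified_percentage(newly_detected: int, clinically_negative: int) -> float:
    """Percentage of clinically negative samples in which the virus was detected."""
    if clinically_negative <= 0:
        raise ValueError("clinically_negative must be positive")
    return 100.0 * newly_detected / clinically_negative


# Counts taken from the abstract: 17 detections among 91 clinically negative samples.
rate = reclassified_percentage(17, 91)
print(f"{rate:.1f}%")
```

Note that the denominator is the set of clinically negative samples, so the figure is better read as the share of negative calls that the more sensitive assay overturned.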

