Full length genomic Sanger sequencing and phylogenetic analysis of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) in Nigeria

In an outbreak, effective detection of the aetiological agent(s) involved using molecular techniques is key to efficient diagnosis, early prevention and management of the spread. However, sequencing is necessary for mutation monitoring, tracking of transmission clusters, development of diagnostics, and vaccine and drug development. Many sequencing methods are evolving rapidly to reduce turnaround time and increase throughput compared with the Sanger sequencing method; however, Sanger sequencing remains the gold standard for clinical research sequencing, with its 99.99% accuracy. This study sought to generate sequence data of SARS-CoV-2 using the Sanger sequencing method and to characterize them for possible site(s) of mutation. About 30 pairs of primers were designed, synthesized, and optimized using endpoint PCR to generate amplicons for the full length of the virus. Cycle sequencing using BigDye Terminator v.3.1 and capillary gel electrophoresis on an ABI 3130xl genetic analyser were performed according to the manufacturers’ instructions. The sequence data generated were assembled and analysed for variations using DNASTAR Lasergene 17 SeqMan Ultra. A total length of 29,760 bp of SARS-CoV-2 was assembled from the sample analysed and deposited in GenBank with accession number MT576584. The BLAST result of the sequence assembly shows 99.97% identity with the reference sequence. Variations were noticed at positions nt201, nt2997, nt14368, nt16535, nt20334, and nt28841-28843, which caused amino acid alterations in the S (aa614) and N (aa203-204) regions. The mutations observed in the S and N genes in this study may be indicative of gradual changes in the genetic coding of the virus; hence the need for active surveillance of the viral genome.
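The identity and variant-position figures reported above can be illustrated with a minimal Python sketch. It assumes two sequences that are already aligned to the same length and counts gaps as mismatches; it is not the BLAST or SeqMan Ultra procedure used in the study, and the function name `compare_aligned` and the toy sequences are hypothetical.

```python
def compare_aligned(ref: str, query: str):
    """Return (percent_identity, mismatch_positions) for two aligned sequences.

    Positions are 1-based. Both inputs must already be aligned to equal length;
    this sketch does no alignment of its own and treats gaps as mismatches.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    if not ref:
        raise ValueError("sequences must not be empty")
    mismatches = [
        i + 1
        for i, (r, q) in enumerate(zip(ref.upper(), query.upper()))
        if r != q
    ]
    identity = 100.0 * (len(ref) - len(mismatches)) / len(ref)
    return identity, mismatches


# Toy data only; real genomes would be read from FASTA files.
identity, positions = compare_aligned("ACGTACGTAC", "ACGTTCGTAA")
print(f"{identity:.2f}% identity, variants at positions {positions}")
```

On a real assembly the same calculation would be run on the pairwise alignment against the chosen reference, and each variant position would then be mapped to its codon to check for an amino acid change.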


Investigating Microbial Eukaryotic Diversity from a Global Census: Insights from a Comparison of Pyrotag and Full-Length Sequences of 18S rRNA Genes

Applied and Environmental Microbiology ◽

10.1128/aem.00057-14 ◽

2014 ◽

Vol 80 (14) ◽

pp. 4363-4373 ◽

Cited By ~ 49

Author(s):

Alle A. Y. Lie ◽

Zhenfeng Liu ◽

Sarah K. Hu ◽

Adriane C. Jones ◽

Diane Y. Kim ◽

...

Keyword(s):

Species Richness ◽

Sanger Sequencing ◽

18S Rrna ◽

Sequence Data ◽

Sequence Similarity ◽

Hypervariable Region ◽

Full Length ◽

Rrna Genes ◽

Data Sets ◽

18S Rrna Genes

ABSTRACT Next-generation DNA sequencing (NGS) approaches are rapidly surpassing Sanger sequencing for characterizing the diversity of natural microbial communities. Despite this rapid transition, few comparisons exist between Sanger sequences and the generally much shorter reads of NGS. Operational taxonomic units (OTUs) derived from full-length (Sanger sequencing) and pyrotag (454 sequencing of the V9 hypervariable region) sequences of 18S rRNA genes from 10 global samples were analyzed in order to compare the resulting protistan community structures and species richness. Pyrotag OTUs called at 98% sequence similarity yielded numbers of OTUs that were similar overall to those for full-length sequences when the latter were called at 97% similarity. Singleton OTUs strongly influenced estimates of species richness but not the higher-level taxonomic composition of the community. The pyrotag and full-length sequence data sets had slightly different taxonomic compositions of rhizarians, stramenopiles, cryptophytes, and haptophytes, but the two data sets had similarly high compositions of alveolates. Pyrotag-based OTUs were often derived from sequences that mapped to multiple full-length OTUs at 100% similarity. Thus, pyrotags sequenced from a single hypervariable region might not be appropriate for establishing protistan species-level OTUs. However, nonmetric multidimensional scaling plots constructed with the two data sets yielded similar clusters, indicating that beta diversity analysis results were similar for the Sanger and NGS sequences. Short pyrotag sequences can provide holistic assessments of protistan communities, although care must be taken in interpreting the results. The longer reads (>500 bp) that are now becoming available through NGS should provide powerful tools for assessing the diversity of microbial eukaryotic assemblages.
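To make the idea of calling OTUs at a similarity threshold concrete, here is a minimal Python sketch of greedy, seed-based clustering. It assumes equal-length, pre-aligned reads and a simple fraction-of-matching-positions similarity; the abstract does not state which clustering software was used, so the functions `similarity` and `greedy_otus` and the toy reads are illustrative assumptions only.

```python
def similarity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length, aligned reads."""
    if len(a) != len(b) or not a:
        raise ValueError("reads must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, threshold):
    """Assign each read to the first OTU whose seed it matches at >= threshold.

    Each OTU is a list whose first element is its seed read. A read that
    matches no existing seed starts a new OTU.
    """
    otus = []
    for read in reads:
        for otu in otus:
            if similarity(read, otu[0]) >= threshold:
                otu.append(read)
                break
        else:
            otus.append([read])
    return otus


# Toy reads: the second differs from the first at one of ten positions.
reads = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC"]
for threshold in (0.9, 1.0):
    otus = greedy_otus(reads, threshold)
    singletons = sum(len(otu) == 1 for otu in otus)
    print(f"threshold {threshold}: {len(otus)} OTUs, {singletons} singletons")
```

Raising the threshold splits reads into more OTUs and more singletons, which is why the richness estimates discussed above depend on both the threshold and the treatment of singleton OTUs.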


Full-length transcriptome sequences of Agropyron cristatum facilitate the prediction of putative genes for thousand-grain weight in a wheat-A. cristatum translocation line

BMC Genomics ◽

10.1186/s12864-019-6416-4 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Shenghui Zhou ◽

Jinpeng Zhang ◽

Haiming Han ◽

Jing Zhang ◽

Huihui Ma ◽

...

Keyword(s):

Single Molecule ◽

Sequence Data ◽

Full Length ◽

Grain Weight ◽

Agropyron Cristatum ◽

Reference Sequence ◽

Translocation Line ◽

Thousand Grain Weight ◽

Sequencing Platform ◽

Transcriptome Sequences

Abstract Background Agropyron cristatum (L.) Gaertn. (2n = 4x = 28; genomes PPPP) is a wild relative of common wheat (Triticum aestivum L.) and provides many desirable genetic resources for wheat improvement. However, there is still a lack of reference genome and transcriptome information for A. cristatum, which severely impedes functional and molecular breeding studies. Results Single-molecule long-read sequencing technology from Pacific Biosciences (PacBio) was used to sequence full-length cDNA from a mixture of leaves, roots, stems and caryopses and to construct the first full-length transcriptome dataset of A. cristatum, which comprised 44,372 transcripts. As expected, the PacBio transcripts were generally longer and more complete than the transcripts assembled via the Illumina sequencing platform in previous studies. By analyzing RNA-Seq data, we identified tissue-enriched transcripts and assessed their GO term enrichment; the results indicated that tissue-enriched transcripts were enriched for particular molecular functions that varied by tissue. We identified 3,398 novel and 1,352 A. cristatum-specific transcripts compared with the wheat gene model set. To better apply this A. cristatum transcriptome, the A. cristatum transcripts were integrated with the wheat genome as a reference sequence to try to identify candidate A. cristatum transcripts associated with thousand-grain weight in a wheat-A. cristatum translocation line, Pubing 3035. Conclusions Full-length transcriptome sequences were used in our study. The present study not only provides comprehensive transcriptomic insights and information for A. cristatum but also proposes a new method for exploring the functional genes of wheat relatives under a wheat genetic background. The sequence data have been deposited in the NCBI under BioProject accession number PRJNA534411.


Full-length transcriptome sequences of Agropyron cristatum facilitate the prediction of putative genes for thousand-grain weight in a wheat-A. cristatum translocation line

10.21203/rs.2.9773/v2 ◽

2019 ◽

Author(s):

Shenghui Zhou(Former Corresponding Author) ◽

Jinpeng Zhang ◽

Haiming Han ◽

Jing Zhang ◽

Ma Huihui ◽

...

Keyword(s):

Single Molecule ◽

Sequence Data ◽

Full Length ◽

Grain Weight ◽

Agropyron Cristatum ◽

Reference Sequence ◽

Translocation Line ◽

Thousand Grain Weight ◽

Sequencing Platform ◽

Transcriptome Sequences

Abstract Agropyron cristatum (L.) Gaertn. (2n = 4x = 28; genomes PPPP) is a wild relative of common wheat (Triticum aestivum L.) and provides many desirable genetic resources for wheat improvement. However, there is still a lack of reference genome and transcriptome information for A. cristatum, which severely impedes functional and molecular breeding studies. Results Single-molecule long-read sequencing technology from Pacific Biosciences (PacBio) was used to sequence full-length cDNA from a mixture of leaves, roots, stems and caryopses and to construct the first full-length transcriptome dataset of A. cristatum, which comprised 44,372 transcripts. As expected, the PacBio transcripts were generally longer and more complete than the transcripts assembled via the Illumina sequencing platform in previous studies. By analyzing RNA-Seq data, we identified tissue-enriched transcripts and assessed their GO term enrichment; the results indicated that tissue-enriched transcripts were enriched for particular molecular functions that varied by tissue. We identified 3,398 novel and 1,352 A. cristatum-specific transcripts compared with the wheat gene model set. To better apply this A. cristatum transcriptome, the A. cristatum transcripts were integrated with the wheat genome as a reference sequence to try to identify candidate A. cristatum transcripts associated with thousand-grain weight in a wheat-A. cristatum translocation line, Pubing 3035. Conclusions Full-length transcriptome sequences were used in our study. The present study not only provides comprehensive transcriptomic insights and information for A. cristatum but also proposes a new method for exploring the functional genes of wheat relatives under a wheat genetic background. The sequence data have been deposited in the NCBI under BioProject accession number PRJNA534411.


Phylogenetic analysis of variable and conserved genomic regions in severe acute respiratory syndrome coronavirus 2 (COVID-19)

10.21203/rs.3.rs-88200/v1 ◽

2020 ◽

Author(s):

Abeer F. El Nahas ◽

Nasema M. Elkatatny ◽

Haitham G. Abo-Al-Ela

Keyword(s):

Phylogenetic Analysis ◽

Amino Acid ◽

Sequence Data ◽

Phylogenetic Analyses ◽

Reference Sequence ◽

N Gene ◽

Conserved Regions ◽

S Gene ◽

Genomic Regions ◽

Conserved Genomic Regions

Abstract SARS-CoV-2 has rapidly spread around the world. Several mutations have been detected in its genome, but they do not seem to affect the abilities of the virus to spread or infect. We aimed to explore the conserved genomic regions in coronavirus that could contain the key strengths of the virus. SARS-CoV-2 sequence data were retrieved from GenBank from the period of December 2019 to March 2020. Phylogenetic analyses were conducted for 207 sequences using MEGAX compared with the reference sequence (MN908947.3- CHN-Wuhan Dec-2019). The analysis included seven important genomic regions, the ORF1ab gene (21,290 bp), S gene (3,822 bp), Orf3a gene (827 bp), E gene (227 bp), M gene (669 bp), and N gene (1,259 bp), which play critical roles in virus invasion and replication. Furthermore, the variant nucleotides and amino acids were detected by MEGAX and BLAST. Through the phylogenetic analysis and amino acid substitution, the ORF1ab gene showed 11 conserved regions and also several variable sites. The E and M genes were mainly conserved, and all sequences were included in one clade, with one or two amino acid variants. Orf3a and the N gene have four conserved sites distributed along the genes. The S gene has 12 mutations and four main large conserved regions. We conclude that the favored occurrence of mutations at the ORF1ab and Orf3a genes during the SARS-CoV epidemic is an important mechanism for virus pathogenesis. The E and M proteins have an almost conserved structure, whereas the S and N genes have many conserved regions, which could serve as possible targets for vaccine design for SARS-CoV.
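The notion of conserved regions in a set of aligned sequences can be sketched in a few lines of Python. This is a simplified illustration, not the MEGAX workflow described in the abstract: it assumes an existing multiple alignment of equal-length strings, calls a column conserved only when every sequence carries the same non-gap character, and uses a hypothetical `min_len` cutoff for how long a run must be to count as a region.

```python
def conserved_regions(alignment, min_len=3):
    """Return 1-based inclusive (start, end) runs of fully conserved columns.

    `alignment` is a list of equal-length aligned sequences. A column is
    conserved when all sequences share one character and it is not a gap.
    Runs shorter than `min_len` columns are ignored.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    regions = []
    start = None  # 0-based start of the current conserved run, if any
    for i in range(length):
        column = {seq[i] for seq in alignment}
        conserved = len(column) == 1 and "-" not in column
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                regions.append((start + 1, i))
            start = None
    if start is not None and length - start >= min_len:
        regions.append((start + 1, length))
    return regions


# Toy alignment: variable columns at positions 4 and 9.
toy = ["ACGTACGTAC", "ACGAACGTAC", "ACGTACGTTC"]
print(conserved_regions(toy, min_len=3))
```

In practice the strictness of the column test and the minimum run length are analysis choices, and relaxing either would change how many conserved regions are reported for a gene.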


SDHA and SDHB mutations in KIT/PDGFRA WT gastrointestinal stromal tumors.

Journal of Clinical Oncology ◽

10.1200/jco.2012.30.15_suppl.10087 ◽

2012 ◽

Vol 30 (15_suppl) ◽

pp. 10087-10087 ◽

Cited By ~ 1

Author(s):

Margherita Nannini ◽

Maria A. Pantaleo ◽

Annalisa Astolfi ◽

Milena Urbini ◽

Serena Formica ◽

...

Keyword(s):

Sanger Sequencing ◽

Massively Parallel Sequencing ◽

Heterozygous Mutation ◽

Missense Mutations ◽

Wild Type ◽

Cycle Sequencing ◽

Sequencing Method ◽

Pcr Products ◽

Sdhb Gene ◽

Genetic Analyzer

Background: KIT/PDGFRA wild-type (WT) GISTs harbour mutations in SDHB and SDHC and, more recently, we described mutations in SDHA using a massively parallel sequencing approach. We sequenced the SDHA and SDHB genes in a larger series in order to validate the data. Methods: The SDHA gene (exons 1-15) and SDHB gene (exons 1-8) were sequenced (although not all exons in all samples) on tumor (T) and/or peripheral blood (PB) of WT GIST patients by the Sanger sequencing method. DNA was extracted from tumor specimens with the QIAamp DNA Mini kit (Qiagen, Milan, Italy) and amplified with specific primer pairs designed to amplify exons but not the SDHA pseudogenes located on chromosomes 3 and 5. PCR products were then purified with the QIAquick PCR purification kit (Qiagen, Milan, Italy) and sequenced on both strands using the Big Dye Terminator v1.1 Cycle Sequencing kit (Applied Biosystems). Sanger sequencing was performed on an ABI 3730 Genetic Analyzer (Applied Biosystems). Results: SDHA gene exons were sequenced in a total of 27 WT GIST patients, in particular on T, PB and both from 12, 6 and 9 patients, respectively. SDHB gene exons were sequenced in a total of 18 out of 27 patients, in particular on T, PB and both from 7, 8 and 3 patients, respectively. Eight SDHA mutations were found in 5 samples (18.5%). Besides those previously identified, five new SDHA mutations were found in three other samples: one sample harboured R171C and R589Q heterozygous missense mutations in exons 5 and 13, respectively. Another harboured G419R and E564K heterozygous missense mutations in exons 9 and 13, respectively. The third sample harboured a delCAG immediately upstream of exon 5, heterozygous in PB and homozygous in T. A heterozygous SDHB mutation (301delT) in exon 4 was found in 1 PB sample. Conclusions: The presence of SDHA mutations has been confirmed in a subgroup of WT GIST patients. All subunits of the SDH complex should be sequenced in WT GIST patients in order to explore their frequency, any linkage between them, and their pathogenetic and clinical significance.


Phylogenetic analysis of variable and conserved genomic regions in severe acute respiratory syndrome coronavirus 2 (COVID-19)

10.21203/rs.3.rs-88200/v2 ◽

2020 ◽

Author(s):

Abeer F. El Nahas ◽

Nasema M. Elkatatny ◽

Haitham G. Abo-Al-Ela

Keyword(s):

Phylogenetic Analysis ◽

Amino Acid ◽

Sequence Data ◽

Phylogenetic Analyses ◽

Reference Sequence ◽

N Gene ◽

Conserved Regions ◽

S Gene ◽

Genomic Regions ◽

Conserved Genomic Regions


Using next generation sequencing of alpine plants to improve fecal metabarcoding diet analysis for Dall’s sheep

BMC Research Notes ◽

10.1186/s13104-021-05590-z ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Kelly E. Williams ◽

Damian M. Menning ◽

Eric J. Wald ◽

Sandra L. Talbot ◽

Kumi L. Rattenbury ◽

...

Keyword(s):

Sequence Data ◽

Vascular Plant ◽

Alpine Plants ◽

Diet Analysis ◽

Reference Sequence ◽

Reference Library ◽

Ovis Dalli ◽

Plant Animal Interactions ◽

Dall’S Sheep ◽

Northwestern North America

Abstract Objectives Dall’s sheep (Ovis dalli dalli) are important herbivores in the mountainous ecosystems of northwestern North America, and recent declines in some populations have sparked concern. Our aim was to improve capabilities for fecal metabarcoding diet analysis of Dall’s sheep and other herbivores by contributing new sequence data for arctic and alpine plants. This expanded reference library will provide critical reference sequence data that will facilitate metabarcoding diet analysis of Dall’s sheep and thus improve understanding of plant-animal interactions in a region undergoing rapid climate change. Data description We provide sequences for the chloroplast rbcL gene of 16 arctic-alpine vascular plant species that are known to comprise the diet of Dall’s sheep. These sequences contribute to a growing reference library that can be used in diet studies of arctic herbivores.


Functional Genomic Identification of Cadmium Resistance Genes from a High GC Clone Library by Coupling the Sanger and PacBio Sequencing Strategies

Genes ◽

10.3390/genes11010007 ◽

2019 ◽

Vol 11 (1) ◽

pp. 7

Author(s):

Jinghao Chen ◽

Chao Xing ◽

Xin Zheng ◽

Xiaofang Li

Keyword(s):

High Throughput ◽

Resistance Genes ◽

Sanger Sequencing ◽

Genomic Library ◽

Full Length ◽

Host Cells ◽

Full Genome Sequence ◽

Functional Screening ◽

Functional Genomic ◽

Pacbio Sequencing

Functional (meta)genomics allows the high-throughput identification of functional genes in a premise-free way. However, it is still difficult to perform Sanger sequencing on high GC DNA templates, which hinders the functional genomic exploration of high GC genomic libraries. Here, we developed a procedure to resolve this problem by coupling the Sanger and PacBio sequencing strategies. Identification of cadmium (Cd) resistance genes from a small-insert high GC genomic library was performed to test the procedure. The library was generated from a high GC (75.35%) bacterial genome. Nineteen clones that conferred Cd resistance to Escherichia coli were subjected directly to Sanger sequencing. In parallel, the positive clones were subjected to in vivo amplification in host cells, from which recombinant plasmids were extracted and linearized by selected restriction endonucleases. PacBio sequencing was performed to obtain the full-length sequences. The partial sequences from Sanger sequencing served as identifiers and were aligned to the full-length sequences from PacBio sequencing, which led to the identification of seven unique full-length sequences. The unique sequences were further aligned to the full genome sequence of the source strain. Functional screening showed that the identified positive clones were all able to improve the Cd resistance of the host cells. The functional genomic procedure developed here couples the Sanger and PacBio sequencing methods and overcomes the difficulties of PCR approaches for high GC DNA. The procedure can be a promising option for the high-throughput sequencing of functional genomic libraries and can enable cost-effective and time-efficient identification of positive clones, particularly for high GC genetic materials.
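The step in which partial Sanger sequences act as identifiers for full-length PacBio sequences can be sketched as follows. This Python example is a deliberately simplified assumption: it uses exact substring matching on one strand, whereas real reads contain errors and would need a proper aligner and reverse-complement handling. The function names, clone names, and sequences are all hypothetical.

```python
def match_partial_to_full(partial_reads, full_length):
    """Map each partial read name to the full-length sequences that contain it.

    Both arguments are dicts of name -> sequence. Matching is exact and
    case-insensitive, and only the given strand is searched.
    """
    hits = {}
    for read_name, read in partial_reads.items():
        hits[read_name] = [
            name
            for name, seq in full_length.items()
            if read.upper() in seq.upper()
        ]
    return hits


def unique_sequences(full_length):
    """Collapse identical full-length sequences, keeping the first name seen."""
    seen = {}
    for name, seq in full_length.items():
        seen.setdefault(seq.upper(), name)
    return sorted(seen.values())


# Toy data: insert_A and insert_B are identical, so two unique inserts remain.
full = {
    "insert_A": "GGCCGCGATCGGCC",
    "insert_B": "GGCCGCGATCGGCC",
    "insert_C": "CCGGTTGGCCAAGG",
}
partial = {"clone1": "GCGATC", "clone2": "TTGGCC"}

print(match_partial_to_full(partial, full))
print(unique_sequences(full))
```

Collapsing identical inserts after matching mirrors the reduction from many positive clones to a smaller set of unique full-length sequences described above.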


A new Plasmodium vivax reference sequence with improved assembly of the subtelomeres reveals an abundance of pir genes

Wellcome Open Research ◽

10.12688/wellcomeopenres.9876.1 ◽

2016 ◽

Vol 1 ◽

pp. 4 ◽

Cited By ~ 56

Author(s):

Sarah Auburn ◽

Ulrike Böhme ◽

Sascha Steinbiss ◽

Hidayat Trimarsanto ◽

Jessica Hostetler ◽

...

Keyword(s):

South America ◽

Plasmodium Vivax ◽

Sequence Data ◽

Ex Vivo ◽

Vital Role ◽

Asia Pacific ◽

Reference Sequence ◽

Manual Curation ◽

Illumina Sequence

Plasmodium vivax is now the predominant cause of malaria in the Asia-Pacific, South America and Horn of Africa. Laboratory studies of this species are constrained by the inability to maintain the parasite in continuous ex vivo culture, but genomic approaches provide an alternative and complementary avenue to investigate the parasite’s biology and epidemiology. To date, molecular studies of P. vivax have relied on the Salvador-I reference genome sequence, derived from a monkey-adapted strain from South America. However, the Salvador-I reference remains highly fragmented, with over 2500 unassembled scaffolds. Using high-depth Illumina sequence data, we assembled and annotated a new reference sequence, PvP01, sourced directly from a patient from Papua, Indonesia. Draft assemblies of isolates from China (PvC01) and Thailand (PvT01) were also prepared for comparative purposes. The quality of the PvP01 assembly is improved greatly over Salvador-I, with fragmentation reduced to 226 scaffolds. Detailed manual curation has ensured highly comprehensive annotation, with functions attributed to 58% of core genes in PvP01 versus 38% in Salvador-I. The assemblies of PvP01, PvC01 and PvT01 are larger than that of Salvador-I (28-30 versus 27 Mb), owing to improved assembly of the subtelomeres. An extensive repertoire of over 1200 Plasmodium interspersed repeat (pir) genes was identified in PvP01 compared to 346 in Salvador-I, suggesting a vital role in parasite survival or development. The manually curated PvP01 reference and PvC01 and PvT01 draft assemblies are important new resources to study vivax malaria. PvP01 is maintained at GeneDB and ongoing curation will ensure continual improvements in assembly and annotation quality.


Genetic characterization of Tilligerry and Mitchell River viruses

Revue d’élevage et de médecine vétérinaire des pays tropicaux ◽

10.19182/remvt.10060 ◽

2009 ◽

Vol 62 (2-4) ◽

pp. 151

Author(s):

M. Belaganahalli ◽

S. Maan ◽

P. P.C. Mertens

Keyword(s):

Reverse Genetics ◽

Sequence Data ◽

Phylogenetic Analyses ◽

Taxonomic Status ◽

Cross Reactivity ◽

Emerging Diseases ◽

Full Length ◽

Virus Species ◽

Livestock Farming

Viruses that are normally safely contained within their host species can emerge due to intense livestock farming, trade, travel, climate change and encroachment of human activities into new environments. The unexpected emergence of bluetongue virus (BTV), the prototype species of the genus Orbivirus, in economically important livestock species (sheep and cattle) across the whole of Europe (since 1998) indicates that other orbiviruses represent a potential further threat to animal and human populations in Europe and elsewhere. The genus Orbivirus is the largest within the family Reoviridae, containing 22 virus species, as well as 14 unclassified orbiviruses, some of which may represent additional or novel species. The orbiviruses are transmitted primarily by arthropod vectors (e.g. Culicoides, mosquitoes or ticks). Viral genome sequence data provide a basis for virus taxonomy and diagnostic test development, and make it possible to address fundamental questions concerning virus biology, pathogenesis, virulence and evolution that can be further explored in mutation and reverse genetics studies. Genome sequences also provide criteria for the classification of novel isolates within individual Orbivirus species, as well as the identification of different serotypes, topotypes, reassortants and even closely related but distinct virus lineages. Full-length genome characterization of Tilligerry virus (TILV), a member of the Eubenangee virus species, and Mitchell River virus (MRV), a member of the Warrego virus species, has revealed highly conserved 5’ and 3’ terminal hexanucleotide sequences. Phylogenetic analyses of orbivirus T2 ‘sub-core-shell’ protein sequences reinforce the hypothesis that this protein is an important evolutionary marker for these viruses. The T2 protein shows high levels of amino acid (AA) sequence identity (> 91%) within a single Orbivirus species / serogroup, which can be used for species identification.
The T2-protein gene has therefore been given priority in sequencing studies. The T2 protein of TILV is closely related to that of Eubenangee virus (~91% identity), confirming that they are both members of the same Eubenangee virus species. Although TILV is reported to be related to BTV in serological assays, the TILV T2 protein shows only 68-70% AA identity to BTV. This supports its current classification within a different serogroup (Eubenangee). Warrego virus and MRV are currently classified as two distinct members (different serotypes) within the Warrego virus species. However, they show only about 79% AA identity in their T2 protein (based on partial sequences). It is therefore considered likely that they could be reclassified as members of distinct Orbivirus species. The taxonomic classification of MRV will be reviewed after generating full length sequences for the entire genomes of both viruses. The taxonomic status of each of these viruses will also be tested further by co-infections and attempts to create reassortants between them (only viruses belonging to the same species can reassort their genome segments). TILV and MRV are the first viruses from their respective serogroups / virus species to be genetically fully characterized, and will provide a basis for the further characterization / identification of additional viruses within each group / species. These data will assist in the development of specific diagnostic assays and potentially in control of emerging diseases. The sequences generated will also help to evaluate current diagnostic [reverse transcriptase - polymerase chain reaction (RT-PCR)] tests for BTV, African horse sickness virus, epizootic haemorrhagic disease virus, etc., in silico, by identifying any possibility of cross reactivity.
