CoV-GLUE: A Web Application for Tracking SARS-CoV-2 Genomic Variation

Author(s):  
Joshua Singer ◽  
Robert Gifford ◽  
Matthew Cotten ◽  
David Robertson

Summary CoV-GLUE is an online web application for the interpretation and analysis of SARS-CoV-2 virus genome sequences, with a focus on amino acid sequence variation. It is based on the GLUE data-centric bioinformatics environment and provides a browsable database of amino acid replacements and coding region indels that have been observed in sequences from the pandemic. Users may also analyse their own SARS-CoV-2 sequences by submitting them to the web application to receive an interactive report containing visualisations of phylogenetic classification and highlighting genomic variation of potentially high impact, for example linked to primer mismatches. Availability and implementation Available at http://cov-glue.cvr.gla.ac.uk. Implemented using GLUE, an open source framework for the development of virus sequence data resources. Contact [email protected]
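
To illustrate the kind of amino acid replacement reporting described above, here is a minimal sketch that translates a reference and a query coding sequence with the standard genetic code and lists the codon positions where the encoded residue differs. It is not the CoV-GLUE or GLUE implementation; the function names and the toy sequences are hypothetical, and the two sequences are assumed to be already aligned, in frame and of equal length.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame coding sequence; unrecognised codons become 'X'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def amino_acid_replacements(ref_cds, query_cds):
    """Return labels such as 'D2G': reference residue, 1-based codon, query residue."""
    if len(ref_cds) != len(query_cds):
        raise ValueError("sequences must be aligned to the same length")
    ref_aa, query_aa = translate(ref_cds), translate(query_cds)
    return [
        f"{r}{i}{q}"
        for i, (r, q) in enumerate(zip(ref_aa, query_aa), start=1)
        if r != q
    ]


# Toy example (hypothetical sequences): GAT (D) becomes GGT (G) at codon 2.
replacements = amino_acid_replacements("ATGGATTTT", "ATGGGTTTT")
```

A real resource would also need to handle indels, ambiguous bases and frame shifts, which this sketch deliberately leaves out.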

2018 ◽  
Author(s):  
Joshua B Singer ◽  
Emma C Thomson ◽  
John McLauchlan ◽  
Joseph Hughes ◽  
Robert J Gifford

Abstract Background Virus genome sequences, generated in ever-higher volumes, can provide new scientific insights and inform our responses to epidemics and outbreaks. To facilitate interpretation, such data must be organised and processed within scalable computing resources that encapsulate virology expertise. GLUE (Genes Linked by Underlying Evolution) is a data-centric bioinformatics environment for building such resources. The GLUE core data schema organises sequence data along evolutionary lines, capturing not only nucleotide data but associated items such as alignments, genotype definitions, genome annotations and motifs. Its flexible design emphasises applicability to different viruses and to diverse needs within research, clinical or public health contexts. Results HCV-GLUE is a case study GLUE resource for hepatitis C virus (HCV). It includes an interactive public web application providing sequence analysis in the form of a maximum-likelihood-based genotyping method, antiviral resistance detection and graphical sequence visualisation. HCV sequence data from GenBank is categorised and stored in a large-scale sequence alignment which is accessible via web-based queries. Whereas this web resource provides a range of basic functionality, the underlying GLUE project can also be downloaded and extended by bioinformaticians addressing more advanced questions. Conclusion GLUE can be used to rapidly develop virus sequence data resources with public health, research and clinical applications. This streamlined approach, with its focus on reuse, will help realise the full value of virus sequence data.


Zygote ◽  
2002 ◽  
Vol 10 (4) ◽  
pp. 291-299 ◽  
Author(s):  
Christine A. Swann ◽  
Rory M. Hope ◽  
William G. Breed

This comparative study of the cDNA sequence of the zona pellucida C (ZPC) glycoprotein in murid rodents focuses on the nucleotide and amino acid sequence of the putative sperm-combining site. We ask the question: Has divergence evolved in the nucleotide sequence of ZPC in the murid rodents of Australia? Using RT-PCR and (RACE) PCR, the complete cDNA coding region of ZPC in the Australian hydromyine rodents Notomys alexis and Pseudomys australis, and a partial cDNA sequence from a third hydromyine rodent, Hydromys chrysogaster, has been determined. Comparison between the cDNA sequences of the hydromyine rodents reveals that the level of amino acid sequence identity between N. alexis and P. australis is 96%, whereas that between the two species of hydromyine rodents and M. musculus and R. norvegicus is 88% and 87% respectively. Despite being reproductively isolated from each other, the three species of hydromyine rodents have a 100% level of amino acid sequence identity at the putative sperm-combining site. This finding does not support the view that this site is under positive selective pressure. The sequence data obtained in this study may have important conservation implications for the dissemination of immunocontraception directed against M. musculus using ZPC antibodies.
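
The identity percentages quoted above are pairwise comparisons of aligned amino acid sequences. As a minimal sketch of that arithmetic (not the authors' method), the hypothetical function below counts identical residues over alignment columns where neither sequence has a gap. Other conventions use a different denominator, such as the full alignment length, so this choice is an assumption, and the toy peptides are invented.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical residues over columns where neither sequence is gapped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example (invented peptides): one mismatch in ten aligned residues.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
```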


2007 ◽  
Vol 88 (4) ◽  
pp. 1288-1294 ◽  
Author(s):  
Giulietta Venturi ◽  
Massimo Ciccozzi ◽  
Stefania Montieri ◽  
Alessandro Bartoloni ◽  
Daniela Francisci ◽  
...  

Twenty-seven strains of Toscana virus, collected over a period of 23 years and isolated from several localities and from different hosts (humans, arthropods and a bat), were investigated by sequencing of a portion of the M genomic segment comprising the GN glycoprotein coding region. Sequence data indicated that the divergence among isolates ranged from 0 to 5.7 % at the nucleotide level and from 0 to 3.4 % at the amino acid level. Phylogenetic analysis revealed four main clusters. A close correspondence between viral strains and area/year of isolation could not be demonstrated, whilst co-circulation of different viral strains in the same area and in the same time period was observed for both patients and environmental viral isolates. Alignment of the deduced amino acid sequences and evolutionary analysis indicated that most of the sites along the gene may be invariable because of purifying and/or neutral selection.


2000 ◽  
Vol 182 (11) ◽  
pp. 3266-3273 ◽  
Author(s):  
Guolu Zheng ◽  
Robin Hehn ◽  
Peter Zuber

ABSTRACT The Bacillus subtilis 168 derivative JH642 produces a bacteriocin, subtilosin, which possesses activity against Listeria monocytogenes. Inspection of the amino acid sequence of the presubtilosin polypeptide encoded by the gene sboA and sequence data from analysis of mature subtilosin indicate that the precursor subtilosin peptide undergoes several unique and unusual chemical modifications during its maturation process. The genes of the sbo-alb operon are believed to function in the synthesis and maturation of subtilosin. Nonpolar mutations introduced into each of the alb genes resulted in loss or reduction of subtilosin production. sboA, albA, and albF mutants showed no antilisterial activity, indicating that the products of these genes are critical for the production of active subtilosin. Mutations in albB, -C, and -D resulted in reduction of antilisterial activity and decreased immunity to subtilosin, particularly under anaerobic conditions. A new gene, sboX, encoding another bacteriocin-like product was discovered residing in a sequence overlapping the coding region of sboA. Construction of an sboX-lacZ translational fusion and analysis of its expression indicate that sboX is induced in stationary phase of anaerobic cultures of JH642. An in-frame deletion of the sboX coding sequence did not affect the antilisterial activity or production of or immunity to subtilosin. The results of this investigation show that the sbo-alb genes are required for the mechanisms of subtilosin synthesis and immunity.


2020 ◽  
Vol 15 ◽  
Author(s):  
Affan Alim ◽  
Abdul Rafay ◽  
Imran Naseem

Background: Proteins contribute significantly to every task of cellular life. Their functions encompass the building and repairing of tissues in human bodies and other organisms. Hence they are the building blocks of bones, muscles, cartilage, skin, and blood. Similarly, antifreeze proteins (AFPs) are of prime significance for organisms that live in very cold areas. With the help of these proteins, cold-water organisms can survive at sub-zero temperatures and resist the water crystallization process, which may rupture internal cells and tissues. AFPs have attracted attention and interest in the food industry and in cryopreservation. Objective: With the increase in the availability of genomic sequence data for proteins, an automated and sophisticated tool for AFP recognition and identification is urgently needed. The sequences and structures of AFPs are highly distinct; therefore, most of the proposed methods fail to show promising results on different structures. A consolidated method is proposed to produce competitive performance on highly distinct AFP structures. Methods: In this study, we propose to use the machine learning algorithms Principal Component Analysis (PCA) followed by Gradient Boosting (GB) for antifreeze protein identification. To analyze the performance and validation of the proposed model, various combinations of two-segment amino acid and dipeptide compositions are used. PCA, in particular, is proposed for dimension reduction while retaining high variance in the data, and is followed by an ensemble method, gradient boosting, for modelling and classification. Results: The proposed method obtained superior performance on the PDB, Pfam and UniProt datasets compared with the RAFP-Pred method. In experiment-3, by utilizing only 150 PCA components, a high accuracy of 89.63 was achieved, which is superior to the 87.41 reported for the RAFP-Pred method utilizing 300 significant features. Experiment-2 was conducted using two different datasets, with non-AFPs from the PISCES server and AFPs from the Protein Data Bank. In experiment-2, our proposed method attained a high sensitivity of 79.16, which is 12.50 better than the state-of-the-art RAFP-Pred method. Conclusion: AFPs have a common function with distinct structures. Therefore, the development of a single model for different sequences often fails for AFPs. Robust results have been shown by our proposed model across diverse training and testing datasets. The proposed model outperformed the previous AFP prediction method, RAFP-Pred. Our model consists of PCA for dimension reduction followed by gradient boosting for classification. Owing to its simplicity, scalability and high performance, our model can be easily extended to analyze proteomic and genomic datasets.
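
The amino acid and dipeptide composition features mentioned in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the sequence is treated as a single segment (the two-segment split is omitted), and residues outside the 20 standard amino acids are ignored. The resulting 420-value vectors are the kind of input that could then be passed to PCA and a gradient boosting classifier, for example in scikit-learn, which is not shown here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    counts = Counter(aa for aa in seq.upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each adjacent standard residue pair (400 values)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[dp] for dp in DIPEPTIDES)
    return [counts[dp] / total if total else 0.0 for dp in DIPEPTIDES]


def feature_vector(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


# Toy example (invented peptide): A and C each make up half of the residues,
# and the dipeptides AC, CC and CA each occur once.
features = feature_vector("ACCA")
```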


Author(s):  
Liang Cheng ◽  
Xudong Han ◽  
Zijun Zhu ◽  
Changlu Qi ◽  
Ping Wang ◽  
...  

Abstract Since the first report of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in December 2019, the COVID-19 pandemic has spread rapidly worldwide. Because few virus strains were available, early studies observed few of the key mutations that would be important for the evolutionary trends of the virus genome. Here, we downloaded sequence data for 1809 SARS-CoV-2 strains from GISAID before April 2020 to identify mutations and the functional alterations caused by these mutations. In total, we identified 1017 nonsynonymous and 512 synonymous mutations by alignment to the reference genome NC_045512, none of which were observed in the receptor-binding domain (RBD) of the spike protein. On average, each strain could acquire about 1.75 new mutations each month. The current mutations may have little impact on antibodies. Although the whole genome shows purifying selection, ORF3a, ORF8 and ORF10 were under positive selection. Only the 36 mutations that occurred in 1% or more of virus strains were further analyzed to reveal linkage disequilibrium (LD) variants and dominant mutations. As a result, we observed five dominant mutations: three nonsynonymous mutations (C28144T, C14408T and A23403G) and two synonymous mutations (T8782C and C3037T). These five mutations occurred in almost all strains in April 2020. We also observed two potential dominant nonsynonymous mutations, C1059T and G25563T, which occurred in most of the strains in April 2020. Further functional analysis shows that these mutations largely decreased protein stability, which could lead to a significant reduction of virus virulence. In addition, the A23403G mutation increases the spike-ACE2 interaction and ultimately enhances infectivity. Taken together, these results show that SARS-CoV-2 is evolving toward enhanced infectivity and reduced virulence.
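
Mutation labels such as C3037T give the reference base, the 1-based reference position and the observed base. The sketch below shows one way such labels and their per-strain frequencies could be derived from sequences aligned to a reference. It is an illustration only, not the authors' pipeline: the function names and toy sequences are hypothetical, and it assumes every sample has the same length as the reference with no insertions, so that the column index equals the reference coordinate.

```python
from collections import Counter


def call_substitutions(reference, sample):
    """Return labels such as 'C2T' where the sample differs from the reference.

    Gaps and ambiguous bases (anything other than A, C, G, T) are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sample must be aligned to the reference length")
    labels = []
    pairs = zip(reference.upper(), sample.upper())
    for pos, (ref_base, alt_base) in enumerate(pairs, start=1):
        if ref_base != alt_base and ref_base in "ACGT" and alt_base in "ACGT":
            labels.append(f"{ref_base}{pos}{alt_base}")
    return labels


def mutation_frequencies(reference, samples):
    """Fraction of samples carrying each substitution."""
    counts = Counter(
        label for sample in samples for label in call_substitutions(reference, sample)
    )
    return {label: n / len(samples) for label, n in counts.items()}


# Toy example (invented sequences): A5T is present in 3 of 4 samples, C2T in 1 of 4.
reference = "ACGTACGT"
samples = ["ACGTACGT", "ACGTTCGT", "ATGTTCGT", "ACGTTCGT"]
frequencies = mutation_frequencies(reference, samples)

# Keep only substitutions found in at least half of the samples (threshold is arbitrary here).
common = sorted(label for label, freq in frequencies.items() if freq >= 0.5)
```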


GigaScience ◽  
2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Taras K Oleksyk ◽  
Walter W Wolfsberger ◽  
Alexandra M Weber ◽  
Khrystyna Shchubelka ◽  
Olga T Oleksyk ◽  
...  

Abstract Background The main goal of this collaborative effort is to provide genome-wide data for the previously underrepresented population in Eastern Europe, and to provide cross-validation of the data from genome sequences and genotypes of the same individuals acquired by different technologies. We collected 97 genome-grade DNA samples from individuals representing major regions of Ukraine who consented to public data release. BGISEQ-500 sequence data and genotypes by an Illumina GWAS chip were cross-validated on multiple samples and additionally referenced to 1 sample that has been resequenced by Illumina NovaSeq6000 S4 at high coverage. Results The genome data have been searched for genomic variation represented in this population, and a number of variants have been reported: large structural variants, indels, copy number variations, single-nucleotide polymorphisms, and microsatellites. To our knowledge, this study provides the largest to-date survey of genetic variation in Ukraine, creating a public reference resource aiming to provide data for medical research in a large understudied population. Conclusions Our results indicate that the genetic diversity of the Ukrainian population is uniquely shaped by evolutionary and demographic forces and cannot be ignored in future genetic and biomedical studies. These data will contribute a wealth of new information, bringing forth novel, endemic and medically related alleles.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Olga V. Bondareva ◽  
Nadezhda A. Potapova ◽  
Kirill A. Konovalov ◽  
Tatyana V. Petrova ◽  
Natalia I. Abramson

Abstract Background Mitochondrial genes encode proteins involved in oxidative phosphorylation. Variations in lifestyle and ecological niche can be directly reflected in metabolic performance. Subterranean rodents represent a good model for testing hypotheses on adaptive evolution driven by important ecological shifts. Voles and lemmings of the subfamily Arvicolinae (Rodentia: Cricetidae) provide a good example for studies of adaptive radiation. This is the youngest group within the order Rodentia showing the fastest rates of diversification, including the transition to the subterranean lifestyle in several phylogenetically independent lineages. Results We evaluated the signatures of selection in the mitochondrial cytochrome b (cytB) gene in 62 Arvicolinae species characterized by either subterranean or surface-dwelling lifestyle by assessing amino acid sequence variation, exploring the functional consequences of the observed variation in the tertiary protein structure, and estimating selection pressure. Our analysis revealed that: (1) three of the convergent amino acid substitutions were found among phylogenetically distant subterranean species and (2) these substitutions may have an influence on the protein complex structure, (3) cytB showed an increased ω and evidence of relaxed selection in subterranean lineages, relative to non-subterranean, and (4) eight protein domains possess increased nonsynonymous substitutions ratio in subterranean species. Conclusions Our study provides insights into the adaptive evolution of the cytochrome b gene in the Arvicolinae subfamily and its potential implications in the molecular mechanism of adaptation. We present a framework for future characterizations of the impact of specific mutations on the function, physiology, and interactions of the mtDNA-encoded proteins involved in oxidative phosphorylation.


Author(s):  
João Pereira‐Vaz ◽  
Pedro Crespo ◽  
Luísa Mocho ◽  
Patrícia Martinho ◽  
Teresa Fidalgo ◽  
...  

Author(s):  
Theresia Devi Indriasari ◽  
Kusworo Anindito ◽  
Eddy Julianto ◽  
Bertha Laroha Paraya Pangaribuan

Indonesia is a country located on top of several tectonic plates, which brings the potential for natural disasters. A disaster management system is considered essential for controlling the situation on site both before and after a disaster takes place. In a disaster situation, the government and society are involved in volunteer teams in order to help minimize victims and support survivors. However, volunteering activities are often hindered by problems at the disaster site. One of these problems is late response due to poor coordination among volunteers, which delays disaster relief. Therefore, it is necessary to have an application that maps the positions of volunteers at a disaster site, so that the disaster management coordinator can dispatch volunteers to disaster areas based on needs. The purpose of the study is to propose an application called ‘MyMapVolunteers’ that effectively and efficiently detects the position of the volunteers in order to improve disaster management services. In this case, real-time and location-based service technology will be able to detect the position of each volunteer. ‘MyMapVolunteers’ is composed of two platforms: mobile and web applications. The mobile platform is an application that uses the GPS function provided by the smartphone to find the volunteers’ location coordinates and then sends the location data automatically and manually. The web platform is used to receive volunteers’ location data and to present them in Google Maps, so that the disaster management coordinator can monitor the positions of and search for volunteers faster.

