GLUE: A flexible software system for virus sequence data

2018
Author(s):
Joshua B Singer
Emma C Thomson
John McLauchlan
Joseph Hughes
Robert J Gifford

Abstract Background Virus genome sequences, generated in ever-higher volumes, can provide new scientific insights and inform our responses to epidemics and outbreaks. To facilitate interpretation, such data must be organised and processed within scalable computing resources that encapsulate virology expertise. GLUE (Genes Linked by Underlying Evolution) is a data-centric bioinformatics environment for building such resources. The GLUE core data schema organises sequence data along evolutionary lines, capturing not only nucleotide data but associated items such as alignments, genotype definitions, genome annotations and motifs. Its flexible design emphasises applicability to different viruses and to diverse needs within research, clinical or public health contexts. Results HCV-GLUE is a case study GLUE resource for hepatitis C virus (HCV). It includes an interactive public web application providing sequence analysis in the form of a maximum-likelihood-based genotyping method, antiviral resistance detection and graphical sequence visualisation. HCV sequence data from GenBank is categorised and stored in a large-scale sequence alignment which is accessible via web-based queries. Whereas this web resource provides a range of basic functionality, the underlying GLUE project can also be downloaded and extended by bioinformaticians addressing more advanced questions. Conclusion GLUE can be used to rapidly develop virus sequence data resources with public health, research and clinical applications. This streamlined approach, with its focus on reuse, will help realise the full value of virus sequence data.

Author(s):  
Joshua Singer
Robert Gifford
Matthew Cotten
David Robertson

Summary CoV-GLUE is an online web application for the interpretation and analysis of SARS-CoV-2 virus genome sequences, with a focus on amino acid sequence variation. It is based on the GLUE data-centric bioinformatics environment and provides a browsable database of amino acid replacements and coding region indels that have been observed in sequences from the pandemic. Users may also analyse their own SARS-CoV-2 sequences by submitting them to the web application to receive an interactive report containing visualisations of phylogenetic classification and highlighting genomic variation of potentially high impact, for example linked to primer mismatches. Availability and implementation Available at http://cov-glue.cvr.gla.ac.uk. Implemented using GLUE, an open source framework for the development of virus sequence data resources.


2017
Author(s):
James Hadfield
Colin Megill
Sidney M. Bell
John Huddleston
Barney Potter
...  

Abstract Summary Understanding the spread and evolution of pathogens is important for effective public health measures and surveillance. Nextstrain consists of a database of viral genomes, a bioinformatics pipeline for phylodynamics analysis, and an interactive visualisation platform. Together these present a real-time view into the evolution and spread of a range of viral pathogens of high public health importance. The visualization integrates sequence data with other data types such as geographic information, serology, or host species. Nextstrain compiles our current understanding into a single accessible location, publicly available for use by health professionals, epidemiologists, virologists and the public alike. Availability and implementation All code (predominantly JavaScript and Python) is freely available from github.com/nextstrain and the web-application is available at nextstrain.org.


Author(s):  
Tao Yan
Yao Yao
Dezhi Wu
Lixi Jiang

Abstract Rapeseed (Brassica napus L.) is a typical polyploid crop and one of the most important oilseed crops worldwide. With rapid progress in high-throughput sequencing technologies and falling sequencing costs, large-scale genomic data for specific crops have become available. However, raw sequence data are mostly deposited in the Sequence Read Archive of the National Center for Biotechnology Information (NCBI) and in the European Nucleotide Archive (ENA), both of which are freely accessible to all researchers. Practical tools are needed to use these large raw datasets efficiently. Here, we report a web-based rapeseed genomic variation database (BnaGVD, http://rapeseed.biocloud.net/home) from which genomic variations, such as single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) across a worldwide collection of rapeseed accessions, can be retrieved. The current release of the BnaGVD contains 34,591,899 high-quality SNPs and 12,281,923 high-quality InDels and provides search tools to retrieve genomic variations and gene annotations across 1,007 accessions of worldwide rapeseed germplasm. We implement a variety of built-in tools (e.g., BnaGWAS, BnaPCA, and BnaStructure) to help users perform in-depth analyses. We recommend this web resource for accelerating studies on the functional genomics and screening of molecular markers for rapeseed breeding.


2012
Vol 93 (10)
pp. 2195-2203
Author(s):
Martha I. Nelson
Marie R. Gramer
Amy L. Vincent
Edward C. Holmes

To determine the extent to which influenza viruses jump between human and swine hosts, we undertook a large-scale phylogenetic analysis of pandemic A/H1N1/09 (H1N1pdm09) influenza virus genome sequence data. From this, we identified at least 49 human-to-swine transmission events that occurred globally during 2009–2011, thereby highlighting the ability of the H1N1pdm09 virus to transmit repeatedly from humans to swine, even following adaptive evolution in humans. Similarly, we identified at least 23 separate introductions of human seasonal (non-pandemic) H1 and H3 influenza viruses into swine globally since 1990. Overall, these results reveal the frequency with which swine are exposed to human influenza viruses, indicate that humans make a substantial contribution to the genetic diversity of influenza viruses in swine, and emphasize the need to improve biosecurity measures at the human–swine interface, including influenza vaccination of swine workers.


2016
Vol 55 (4)
pp. 256-263
Author(s):
Janet Klara Djomba
Lijana Zaletel-Kragelj

Abstract Introduction Research on social networks in public health focuses on how social structures and relationships influence health and health-related behaviour. While the sociocentric approach is used to study complete social networks, the egocentric approach is gaining popularity because of its focus on individuals, groups and communities. Methods One participant of the healthy lifestyle health education workshop ‘I’m moving’, which was included in the study of social support for exercise, was randomly selected. The participant was denoted as the ego and members of her/his social network as the alteri. Data were collected by personal interviews using a self-made questionnaire. Numerical methods and computer programmes for the analysis of social networks were used to demonstrate the analysis. Results The size, composition and structure of the egocentric social network were obtained by numerical analysis. The analysis of composition included homophily and homogeneity. The analysis of structure included the degree of the egocentric network, the strength of the ego-alter ties and the average strength of ties. The network was visualised with three freely available computer programmes: Egonet.QF, E-net and Pajek. The programmes were described and compared by their usefulness. Conclusion Both numerical analysis and visualisation have their benefits. The choice of approach depends on the purpose of the social network analysis. While numerical analysis can be used in large-scale population-based studies, visualisation of personal networks can help health professionals create, implement and evaluate preventive programmes, especially those focused on behaviour change.


2018
Vol 35 (14)
pp. 2489-2491
Author(s):
Tobias Rausch
Markus Hsi-Yang Fritz
Jan O Korbel
Vladimir Benes

Abstract Summary Harmonizing quality control (QC) of large-scale second- and third-generation sequencing datasets is key for enabling downstream computational and biological analyses. We present Alfred, an efficient and versatile command-line application that computes multi-sample QC metrics in a read-group aware manner, across a wide variety of sequencing assays and technologies. In addition to standard QC metrics such as GC bias, base composition, insert size and sequencing coverage distributions, it supports haplotype-aware and allele-specific feature counting and feature annotation. The versatility of Alfred allows for easy pipeline integration in high-throughput settings, including DNA sequencing facilities and large-scale research initiatives, enabling continuous monitoring of sequence data quality and characteristics across samples. Alfred supports haplo-tagging of BAM/CRAM files to conduct haplotype-resolved analyses in conjunction with a variety of next-generation sequencing based assays. Alfred’s companion web application enables interactive exploration of results and comparison to public datasets. Availability and implementation Alfred is open-source and freely available at https://tobiasrausch.com/alfred/. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Lawrence T. Brown
Ashley Bachelder
Marisela B. Gomez
Alicia Sherrell
Imani Bryan

Academic institutions are increasingly playing pivotal roles in economic development and community redevelopment in cities around the United States. Many are functioning in the role of anchor institutions and building technology, biotechnology, or research parks to facilitate biomedical research. In the process, universities often partner with local governments, implementing policies that displace entire communities and families, thereby inducing a type of trauma that researcher Mindy Thompson Fullilove has termed “root shock.” We argue that displacement is a threat to public health and explore the ethical implications of university-led displacement on public health research, especially the inclusion of vulnerable populations into health-related research. We further explicate how the legal system has sanctioned the exercise of eminent domain by private entities such as universities and developers. Strategies that communities have employed in order to counter such threats are highlighted and recommended for communities that may be under the threat of university-led displacement. We also offer a critical look at the three dominant assumptions underlying university-sponsored development: that research parks are engines of economic development, that deconcentrating poverty via displacement is effective, and that poverty is simply the lack of economic or financial means. Understanding these fallacies will help communities under the threat of university-sponsored displacement to protect community wealth, build power, and improve health.

