VADR: validation and annotation of virus sequence submissions to GenBank

2019 ◽  
Author(s):  
Alejandro A Schäffer ◽  
Eneida L Hatcher ◽  
Linda Yankie ◽  
Lara Shonkwiler ◽  
J Rodney Brister ◽  
...  

Abstract Background: GenBank contains over 3 million viral sequences. The National Center for Biotechnology Information (NCBI) previously made available a tool for validating and annotating influenza virus sequences that is used to check submissions to GenBank. Before this project, there was no analogous tool in use for non-influenza viral sequence submissions.
Results: We developed a system called VADR (Viral Annotation DefineR) that validates and annotates viral sequences in GenBank submissions. The annotation system is based on the analysis of the input nucleotide sequence using models built from curated RefSeqs. Hidden Markov models are used to classify sequences by determining the RefSeq they are most similar to, and feature annotation from the RefSeq is mapped based on a nucleotide alignment of the full sequence to a covariance model. Predicted proteins encoded by the sequence are validated with nucleotide-to-protein alignments using BLAST. The system identifies 43 types of “alerts” that (unlike the previous BLAST-based system) provide deterministic and rigorous feedback to researchers who submit sequences with unexpected characteristics. VADR has been integrated into GenBank’s submission processing pipeline, allowing for viral submissions passing all tests to be accepted and annotated automatically, without the need for any human (GenBank indexer) intervention. Unlike the previous submission-checking system, VADR is freely available (https://github.com/nawrockie/vadr) for local installation and use. VADR has been used for Norovirus submissions since May 2018 and for Dengue virus submissions since January 2019. Other viruses with high numbers of submissions will be added incrementally.
Conclusion: VADR improves the speed with which non-flu virus submissions to GenBank can be checked and improves the content and quality of the GenBank annotations. The availability and portability of the software allow researchers to run the GenBank checks prior to submitting their viral sequences, and thereby gain confidence that their submissions will be accepted immediately without the need to correspond with GenBank staff. Reciprocally, the adoption of VADR frees GenBank staff to spend more time on services other than checking routine viral sequence submissions.

2017 ◽  
Vol 114 (31) ◽  
pp. 8265-8270 ◽  
Author(s):  
Simon Olsson ◽  
Hao Wu ◽  
Fabian Paul ◽  
Cecilia Clementi ◽  
Frank Noé

Accurate mechanistic description of structural changes in biomolecules is an increasingly important topic in structural and chemical biology. Markov models have emerged as a powerful way to approximate the molecular kinetics of large biomolecules while keeping full structural resolution in a divide-and-conquer fashion. However, the accuracy of these models is limited by that of the force fields used to generate the underlying molecular dynamics (MD) simulation data. Whereas the quality of classical MD force fields has improved significantly in recent years, remaining errors in the Boltzmann weights are still on the order of a few kT, which may lead to significant discrepancies when comparing to experimentally measured rates or state populations. Here we take the view that simulations using a sufficiently good force field sample conformations that are valid but have inaccurate weights, yet these weights may be made accurate by incorporating experimental data a posteriori. To do so, we propose augmented Markov models (AMMs), an approach that combines concepts from probability theory and information theory to consistently treat systematic force-field error and statistical errors in simulation and experiment. Our results demonstrate that AMMs can reconcile conflicting results for protein mechanisms obtained by different force fields and correct for a wide range of stationary and dynamical observables even when only equilibrium measurements are incorporated into the estimation process. This approach constitutes a unique avenue to combine experiment and computation into integrative models of biomolecular structure and dynamics.


Author(s):  
Maria Alessandra Montironi ◽  
Harry H. Cheng

Being able to correctly assess the context it is currently acting in is a very important ability for every autonomous robot performing a task in a real-world scenario such as navigating, manipulating an object, or interacting with a user. Sensors are the primary interface with the external world and the means through which contextual knowledge is generated. Humans and animals use cognitive processes such as attention to selectively process perceived task-relevant information and to recognize the context they are currently acting in. Biologically inspired computational models of attention have been developed in recent years to be used as interpretation keys of mainly visual sensor data. This paper presents a new framework for situation assessment that expands existing computational models of attention by providing a unified methodology to interpret and combine data from different sources. The method utilizes probabilistic state estimation techniques such as Bayesian recursive estimation, the Kalman filter, and hidden Markov models to interpret features extracted from sensor data and formulate hypotheses about different aspects of the task the robot is performing or of the environment it is currently acting in. The concept of Bayesian surprise is also used to mark the information content of each new hypothesis. A weight is then calculated that takes into account the confidence in the estimate that generated the hypothesis, its information content, and the quality of the data. The methodology presented in this paper is general: the framework can be applied consistently to data from different types of sensors, and the resulting hypotheses can then be combined. Once formulated, hypotheses can be used for context-based reasoning and plan adaptation. The framework was implemented on a small two-wheel differential drive robot equipped with a camera, an ultrasonic range sensor, and two infrared range sensors. Three different sets of results that evaluate the performance of different features of the framework are presented. First, the method has been applied to detect a target object and to distinguish it from similar objects. Second, the hypothesis strength calculation method has been characterized by isolating the effects of belief, surprise, and the quality of the data. Third, the combination of hypotheses from different modules has been evaluated in the context of environment classification.
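
The two ingredients named in the abstract above, Bayesian recursive estimation and Bayesian surprise, can be sketched for a discrete set of hypotheses. This is only an illustrative sketch: the hypothesis labels and likelihood values are hypothetical, surprise is taken here as the Kullback-Leibler divergence from prior to posterior (a common definition, assumed rather than confirmed by the abstract), and the paper's hypothesis weight formula is not given, so it is not reproduced.

```python
import math


def bayes_update(prior, likelihood):
    """One step of discrete Bayesian recursive estimation.

    prior and likelihood map each hypothesis to P(h) and P(observation | h).
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: p / total for h, p in unnormalized.items()}


def bayesian_surprise(prior, posterior):
    """KL divergence D(posterior || prior) in bits (assumed definition)."""
    return sum(q * math.log2(q / prior[h]) for h, q in posterior.items() if q > 0)


# Hypothetical example: is the object in view the target or a similar distractor?
prior = {"target": 0.5, "distractor": 0.5}
likelihood = {"target": 0.9, "distractor": 0.3}  # made-up sensor model

posterior = bayes_update(prior, likelihood)
surprise = bayesian_surprise(prior, posterior)
print(posterior, surprise)
```

An observation that leaves the belief unchanged yields zero surprise, while one that shifts the belief strongly yields a larger value, which is why surprise can serve as a measure of a hypothesis's information content.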


2019 ◽  
Vol 2019 ◽  
pp. 1-13
Author(s):  
Olugbenga Emmanuel Imole ◽  
Tom Mmbasu Walingo

Signals transmitted via satellite networks at high frequency in the Ka, Q, and V bands are susceptible to degradation due to rain attenuation. Adaptive transmission techniques are usually employed to mitigate the effect of rain and improve users’ quality of service (QoS), but the effectiveness of these techniques hinges on the accuracy with which rain attenuation on the link is known. Most techniques rely on predicted attenuation along the link for selection of optimal transmission parameters. This paper proposes an efficient approach to predict the rain attenuation experienced by sources of multimedia connections in rain-impacted satellite networks. The proposed technique is based on three Markov models for widespread, shower, and thunderstorm rain events and predicts the attenuation experienced at different periods within the duration of a user’s connection. It relies on an adaptive modulation and coding (AMC) scheme to dynamically mitigate rain attenuation and a call admission control (CAC) policy to guarantee the satisfaction of users’ QoS requirements.
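
As a rough illustration of how a Markov model can predict attenuation at a later point in a connection, the sketch below propagates a state distribution through a discrete-time Markov chain. Everything numeric here is hypothetical: the three attenuation levels, the transition matrix labeled as a "shower" event, and the number of steps are placeholders, not values from the paper.

```python
def step(dist, transition):
    """Advance a state distribution by one time step: dist' = dist * P."""
    n = len(transition)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def predict(dist, transition, steps):
    """Distribution over attenuation states after the given number of steps."""
    for _ in range(steps):
        dist = step(dist, transition)
    return dist


# Hypothetical attenuation level (dB) attached to each chain state.
LEVELS_DB = [0.0, 3.0, 8.0]

# Hypothetical transition matrix for one rain-event type; each row sums to 1.
P_SHOWER = [
    [0.90, 0.08, 0.02],
    [0.20, 0.70, 0.10],
    [0.05, 0.25, 0.70],
]

start = [1.0, 0.0, 0.0]  # connection begins in the lowest-attenuation state
dist5 = predict(start, P_SHOWER, 5)
expected_db = sum(p * a for p, a in zip(dist5, LEVELS_DB))
print(dist5, expected_db)
```

In a setup like this, one matrix per rain-event type (widespread, shower, thunderstorm) would be swapped in, and the predicted distribution could drive the choice of modulation and coding.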


Author(s):  
Carla Mavian ◽  
Simone Marini ◽  
Mattia Prosperi ◽  
Marco Salemi

Abstract: The SARS-CoV-2 pandemic has been growing exponentially, affecting nearly 900 thousand people and causing enormous distress to economies and societies worldwide. A plethora of analyses based on viral sequences has already been published, in scientific journals as well as through non-peer-reviewed channels, to investigate SARS-CoV-2 genetic heterogeneity and spatiotemporal dissemination. We examined full genome sequences currently available to assess the presence of sufficient information for reliable phylogenetic and phylogeographic studies in countries with the highest toll of confirmed cases. Although the number of available full genomes is growing daily, and the full dataset contains sufficient phylogenetic information to allow reliable inference of phylogenetic relationships, country-specific SARS-CoV-2 datasets still present severe limitations. Studies assessing within-country spread or transmission clusters should be considered preliminary at best, or hypothesis generating. Hence the need for continuing concerted efforts to increase the number and quality of the sequences required for robust tracing of the epidemic.
Significance Statement: Although genome sequences of SARS-CoV-2 are growing daily and contain sufficient phylogenetic information, country-specific data still present severe limitations and should be interpreted with caution.


Author(s):  
Ione Ayala Gualandi de Oliveira ◽  
Rosângela Caetano ◽  
Ricardo Ewback Steffen ◽  
Aline Navega Biz

Abstract Objective: To synthesize the available evidence and state of the art of economic evaluations of the use of memantine, whether alone or combined with donepezil, for moderate to severe Alzheimer’s disease (AD), focusing on the analytical decision models built. Method: The electronic databases MEDLINE, EMBASE, NHS EED, CEA Registry and LILACS were searched for references. After duplicates were removed, two independent reviewers evaluated the titles and abstracts and subsequently the full texts. The Drummond M. tool was used to evaluate the quality of the studies. Results: After the application of the eligibility criteria, twelve complete economic evaluations were included. One evaluation was a clinical trial, two involved simulations and nine used Markov models. The predominant outcome measure was cost per quality-adjusted life year (QALY). The use of memantine was considered cost-effective and dominant in eight studies, while in a single study its use was dominated when compared to donepezil for moderate AD. Sensitivity analyses were systematically performed, with robust results. The quality assessment indicated that the methodological quality of the studies was good. Conclusion: Although there is some controversy regarding the benefits derived from the use of memantine, whether combined or not with donepezil, the evidence collected suggests that it is cost-effective in the countries where the studies were performed. However, local economic studies need to be performed, given the significant variability derived from the different parameters adopted in the evaluations.
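
The terms "dominant", "dominated", and "cost per QALY" in the abstract above follow standard cost-effectiveness usage, which can be sketched as a small comparison function. The function name and the cost and QALY figures below are hypothetical and are not taken from any of the reviewed studies.

```python
def compare(cost_new, qaly_new, cost_old, qaly_old):
    """Classify a new strategy against a comparator.

    Returns (label, icer), where icer is the incremental cost per QALY
    gained and is None when one strategy dominates the other.
    """
    d_cost = cost_new - cost_old
    d_qaly = qaly_new - qaly_old
    if d_cost == 0 and d_qaly == 0:
        return "equivalent", None
    if d_cost <= 0 and d_qaly >= 0:
        return "dominant", None   # no more costly and no less effective
    if d_cost >= 0 and d_qaly <= 0:
        return "dominated", None  # no less costly and no more effective
    return "trade-off", d_cost / d_qaly


# Hypothetical figures, for illustration only.
print(compare(9000, 1.2, 10000, 1.0))   # cheaper and more effective
print(compare(12000, 1.5, 10000, 1.0))  # costlier but more effective
```

In the trade-off case the ratio is compared against a willingness-to-pay threshold, which differs between countries; this is one reason why conclusions from one setting may not transfer to another.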


2010 ◽  
Vol 58 (4) ◽  
pp. 673-681
Author(s):  
W. Oniszczuk

Loss tandem networks with blocking - a semi-Markov approach
Based on semi-Markov process theory, this paper describes an analytical study of a loss multiple-server two-station network model with blocking. Tasks arrive at the tandem in a Poisson fashion at a rate λ, and the service times at the first and second stations are non-exponentially distributed with means s_A and s_B, respectively. Between these two stations there is a buffer with finite capacity. In this type of network, if the buffer is full, the acceptance of new tasks (jobs) by the second station is temporarily suspended (blocking factor) and tasks must wait at the first station until the transmission process is resumed. Any new task that finds all service lines at the first station occupied is turned away and is lost (loss factor). Initially, a Markov model of the loss tandem with blocking is investigated. Here, a two-dimensional state graph is constructed and a set of steady-state equations is created. These equations allow the calculation of state probabilities for each graph state. A special algorithm for transforming the Markov model into a semi-Markov process is presented. This approach allows calculating steady-state probabilities in the semi-Markov model. In the next part of the paper, the algorithms for calculation of the main measures of effectiveness in the semi-Markov model are presented. Finally, the numerical part of this paper investigates some special semi-Markov models and presents the calculated quality of service (QoS) parameters and the main measures of effectiveness.
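
To illustrate how steady-state equations yield state probabilities and a loss measure, the sketch below solves a deliberately simplified case: a single multi-server loss station with exponential service and no buffer (the classical M/M/c/c system). This is a stand-in chosen for brevity, not the paper's two-station tandem or its semi-Markov transformation, and the rates used are hypothetical.

```python
def loss_station_steady_state(arrival_rate, service_rate, servers):
    """Steady-state probabilities of k busy servers in an M/M/c/c loss system.

    The balance equations arrival_rate * pi[k] = (k + 1) * service_rate * pi[k + 1]
    give pi[k] proportional to (arrival_rate / service_rate)**k / k!.
    """
    offered_load = arrival_rate / service_rate
    weights = [1.0]
    for k in range(1, servers + 1):
        weights.append(weights[-1] * offered_load / k)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical rates: 2 tasks per unit time, unit service rate, 3 service lines.
pi = loss_station_steady_state(2.0, 1.0, 3)
loss_probability = pi[-1]  # an arriving task is lost when all lines are busy
print(pi, loss_probability)
```

The last probability is the Erlang B loss probability; in the tandem model described above, the corresponding loss measure would be read off the two-dimensional state graph instead.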


2018 ◽  
Vol 32 (6) ◽  
pp. 1070-1081
Author(s):  
Francesca Bassi

Purpose: The purpose of this paper is to measure students’ satisfaction with the didactics in a large Italian university, that of Padua, giving special attention to its evolution over consecutive academic years. The overall level of the quality of the didactics is examined and its change over time is modeled. Moreover, the effect of course and teacher variables on it is estimated.
Design/methodology/approach: Latent cluster class models and mixture latent class Markov models are estimated in order to identify groups of courses that are homogeneous in the level of the quality of the didactics. The evolution of satisfaction over the three academic years is monitored. The effect of potential covariates on the clustering and its dynamics is also examined.
Findings: Model estimation results reveal evidence that offers university management important indications for defining targeted strategies to raise teaching quality.
Originality/value: The paper makes an original contribution both in the methods applied to analyze data collected through student evaluations of teaching and in the evidence obtained for a large university.


2009 ◽  
Vol 54 (3) ◽  
pp. S26-S31 ◽  
Author(s):  
Andre Silvanovich ◽  
Gary Bannon ◽  
Scott McClain

2012 ◽  
Vol 2012 ◽  
pp. 1-18
Author(s):  
Karima Adel-Aissanou ◽  
Karim Abbas ◽  
Djamil Aïssani

Markov models are frequently used for performance modeling. However, most models do not have closed-form solutions, and numerical solutions are often not feasible due to the large or even infinite state space of models of practical interest. For this reason, state-space truncation is often required to compute models of this kind. In this paper, we use the strong stability approach to establish analytic error bounds for the truncation of a tandem queue with blocking. Numerical examples are carried out to illustrate the quality of the obtained error bounds.
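
The effect of state-space truncation can be seen on a much simpler model than the tandem queue studied in the paper. The sketch below, offered only as an illustration, truncates an M/M/1 queue at a maximum queue length and measures the L1 distance between the truncated and exact stationary distributions. It does not implement the strong stability approach; the M/M/1 stand-in and the parameter values are assumptions made for this example.

```python
def mm1_truncated(rho, n_max):
    """Stationary distribution of an M/M/1 queue truncated to states 0..n_max."""
    weights = [rho ** k for k in range(n_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def truncation_error(rho, n_max):
    """L1 distance between the truncated and the exact M/M/1 distributions.

    The exact distribution is pi[k] = (1 - rho) * rho**k for k = 0, 1, 2, ...
    """
    if not 0 < rho < 1:
        raise ValueError("the untruncated queue is stable only for 0 < rho < 1")
    truncated = mm1_truncated(rho, n_max)
    head = sum(abs(t - (1 - rho) * rho ** k) for k, t in enumerate(truncated))
    tail = rho ** (n_max + 1)  # exact probability mass beyond the cut-off
    return head + tail


for n_max in (5, 10, 20):
    print(n_max, truncation_error(0.5, n_max))
```

For this toy model the error works out to 2 * rho**(n_max + 1), so it decays geometrically with the truncation level; analytic bounds of the kind described above aim to give comparable guarantees when no closed form is available.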


Plant Disease ◽  
2011 ◽  
Vol 95 (8) ◽  
pp. 901-906 ◽  
Author(s):  
J. Karkashian ◽  
E. D. Ramos-Reynoso ◽  
D. P. Maxwell ◽  
P. Ramírez

Begomovirus spp. cause substantial losses in bean crops in tropical and subtropical regions of the Americas. The predominant Begomovirus sp. in Central America associated with golden mosaic symptoms in bean is Bean golden yellow mosaic virus (BGYMV). However, Calopogonium golden mosaic virus was previously found to infect bean crops in the northern region of Costa Rica. The objective of this research was to identify Begomovirus spp. that infect bean plants in different geographical regions of Nicaragua. In all, 126 samples of young bean leaves with symptoms of golden mosaic were collected from eight different regions of Nicaragua. Using DNA hybridization with specific probes, 120 samples tested positive for BGYMV, 14 samples tested positive for Squash yellow mild mottle virus, and 7 samples tested positive for Calopogonium golden mosaic virus. Sequence analysis of polymerase chain reaction-amplified products from three samples (MA-9 Managua, BE-8 Rivas, and SO-9 Granada) also indicated that the symptoms of golden mosaic in bean are associated with viral sequences from three different Begomovirus spp. Management of bean golden mosaic disease must take into account that BGYMV is the predominant virus (95% of the samples) and that 12% of the samples exhibited possible mixed infections or recombination events in the south and central geographical regions of Nicaragua.

