FastMLST: A Multi-core Tool for Multilocus Sequence Typing of Draft Genome Assemblies

2021 ◽  
Vol 15 ◽  
pp. 117793222110592
Author(s):  
Enzo Guerrero-Araya ◽  
Marina Muñoz ◽  
César Rodríguez ◽  
Daniel Paredes-Sabja

Multilocus Sequence Typing (MLST) is a precise microbial typing approach at the intra-species level for epidemiologic and evolutionary purposes. It operates by assigning a sequence type (ST) identifier to each specimen, based on a combination of alleles of multiple housekeeping genes included in a defined scheme. The use of MLST has multiplied due to the availability of large numbers of genomic sequences and epidemiologic data in public repositories. However, data processing speed has become problematic due to the massive size of modern datasets. Here, we present FastMLST, a tool that is designed to perform PubMLST searches using BLASTn and a divide-and-conquer approach that processes each genome assembly in parallel. The output offered by FastMLST includes a table with the ST, allelic profile, and clonal complex or clade (when available), detected for a query, as well as a multi-FASTA file or a series of FASTA files with the concatenated or single allele sequences detected, respectively. FastMLST was validated with 91 different species, with a wide range of guanine-cytosine content (%GC), genome sizes, and fragmentation levels, and a speed test was performed on 3 datasets with varying genome sizes. Compared with other tools such as mlst, CGE/MLST, MLSTar, and PubMLST, FastMLST takes advantage of multiple processors to simultaneously type up to 28 000 genomes in less than 10 minutes, reducing processing times by at least 3-fold with 100% concordance to PubMLST, if contaminated genomes are excluded from the analysis. The source code, installation instructions, and documentation of FastMLST are available at https://github.com/EnzoAndree/FastMLST
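The two ideas in this abstract, looking up a sequence type from an allelic profile and typing each assembly independently in parallel, can be illustrated with a minimal sketch. This is not FastMLST's implementation: the loci, the ST table, and the function names `assign_st` and `type_genomes` are hypothetical, and the allele numbers are assumed to have been called already (the BLASTn search against PubMLST is not shown). Threads are used only to keep the example self-contained; a real tool would more likely use separate processes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy scheme: these loci and ST definitions are invented
# for illustration and do not correspond to any real PubMLST scheme.
LOCI = ("adk", "atpA", "dxr")
ST_TABLE = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 7,
}


def assign_st(profile):
    """Return the ST for a {locus: allele number} profile.

    Returns None when a locus is missing or the combination of
    alleles is not present in the ST table.
    """
    try:
        key = tuple(profile[locus] for locus in LOCI)
    except KeyError:
        return None
    return ST_TABLE.get(key)


def type_genomes(profiles, workers=4):
    """Type every genome independently, one task per genome."""
    names = list(profiles)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sts = list(pool.map(assign_st, (profiles[n] for n in names)))
    return dict(zip(names, sts))


if __name__ == "__main__":
    example = {
        "genome_a": {"adk": 1, "atpA": 2, "dxr": 1},
        "genome_b": {"adk": 9, "atpA": 9, "dxr": 9},
    }
    print(type_genomes(example))
```

Because each genome is typed without reference to any other, the work divides cleanly across workers, which is the property the divide-and-conquer description relies on.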

2020 ◽  
Author(s):  
Enzo Guerrero-Araya ◽  
Marina Muñoz ◽  
César Rodríguez ◽  
Daniel Paredes-Sabja

Abstract: Multilocus Sequence Typing (MLST) is a precise microbial typing approach at the intra-species level for epidemiological and evolutionary purposes. It operates by assigning a sequence type (ST) identifier to each specimen, based on a combination of allelic sequences obtained for multiple housekeeping genes included in a defined scheme. The use of MLST has multiplied due to the availability of large numbers of genomic sequences and epidemiological data in public repositories. However, data processing speed has become problematic due to datasets’ massive size. Here, we present FastMLST, a tool that is designed to perform PubMLST searches using BLASTn and a divide-and-conquer approach. Compared with mlst, CGE/MLST, MLSTar, and PubMLST, FastMLST takes advantage of current multi-core computers to simultaneously type thousands of genome assemblies in minutes, reducing processing times by at least 4-fold and with more than 99.95% consistency. Availability and Implementation: The source code, installation instructions and documentation are available at https://github.com/EnzoAndree/FastMLST


Agronomy ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2298
Author(s):  
Manosh Kumar Biswas ◽  
Dhima Biswas ◽  
Mita Bagchi ◽  
Ganjun Yi ◽  
Guiming Deng

Microsatellites, or simple sequence repeats (SSRs), are distributed in genes, intergenic regions, and transposable elements in the genome. SSRs have been identified for marker development from draft genome assemblies, transcriptome sequences, and genome survey sequences in plants and animals. The identification, distribution, and density of microsatellites in pre-microRNAs (pre-miRNAs) are not well documented in plants. In this study, SSRs were identified in 16,892 pre-miRNA sequences from 292 plant species in six taxonomic groups (algae to dicots). Fifty-one percent of pre-miRNA sequences contained SSRs. Mononucleotide repeats were the most abundant, followed by di- and trinucleotide repeats. Tetra-, penta-, and hexanucleotide repeats were rare. A total of 9,498 (57.46%) microsatellite loci had potential as pre-miRNA SSR markers. Of the markers, 3,573 (37.62%) were non-redundant, and 2,341 (65.51%) primer pairs could be transferred to at least one of the plant taxonomic groups. All data and primer pairs were deposited in a user-friendly, freely accessible plant miRNA SSR marker database. The data presented in this study accelerate the understanding of pre-miRNA evolution and serve as a valuable genomic resource for genetic improvement in a wide range of crops, including legumes, cereals, and cruciferous crops.
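As an illustration of what identifying SSRs in a sequence involves, the following minimal sketch scans for perfect mono-, di-, and trinucleotide repeats with regular expressions. It is not the pipeline used in the study: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for the example, since the abstract does not state the thresholds or the tool used.

```python
import re

# Assumed thresholds (minimum number of repeated units per motif length).
# The abstract does not give the values used in the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq):
    """Return (motif, repeat_count, start) for each perfect SSR found.

    Only mono-, di-, and trinucleotide motifs are searched. Motifs made
    of a single repeated base (e.g. "AA") are reported only as
    mononucleotide repeats, not again under a longer motif length.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A unit of `unit_len` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGTU]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if unit_len > 1 and len(set(motif)) == 1:
                continue  # homopolymer run, already covered by unit_len == 1
            hits.append((motif, len(match.group(0)) // unit_len, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    demo = "GG" + "A" * 12 + "CC" + "AG" * 7 + "T" + "CTG" * 5 + "C"
    for motif, count, start in find_ssrs(demo):
        print(motif, count, start)
```

The alphabet includes `U` so that RNA-style pre-miRNA records can be scanned without conversion; imperfect or compound repeats are outside the scope of this sketch.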


2013 ◽  
Vol 2013 ◽  
pp. 1-10 ◽  
Author(s):  
Claudia Spampinato ◽  
Darío Leonardi

A wide range of molecular techniques have been developed for genotyping Candida species. Among them, multilocus sequence typing (MLST) and microsatellite length polymorphisms (MLP) analysis have recently emerged. MLST relies on DNA sequences of internal regions of various independent housekeeping genes, while MLP identifies microsatellite instability. Both methods generate unambiguous and highly reproducible data. Here, we review the results achieved by using these two techniques and also provide a brief overview of a new method based on high-resolution DNA melting (HRM). This method identifies sequence differences by subtle deviations in sample melting profiles in the presence of saturating fluorescent DNA binding dyes.


1970 ◽  
Vol 19 (1-2) ◽  
pp. 264-267 ◽  
Author(s):  
F.H. Reuling ◽  
J.T. Schwartz

In the late 1950's and early 1960's, it became evident that some glaucoma patients developed elevations of intraocular pressure, which were difficult to control, following prolonged use of systemic or ocular medications containing corticosteroids (Chandler, 1955; Alfano, 1963; Armaly, 1963). In addition, some patients without glaucoma, when treated with steroids for long periods of time, developed clinical signs of chronic simple glaucoma (McLean, 1950; François, 1954; Covell, 1958; Linner, 1959; Goldman, 1962). Fortunately, the elevation of intraocular pressure was reversible if the drug was discontinued. Over the past decade, extensive investigation of the “steroid response” has been undertaken. For this presentation, the steroid response may be considered as a gradual elevation of intraocular pressure, occurring over several weeks, in an eye being medicated with corticosteroid drops several times a day. The elevation in pressure is usually accompanied by a reduction in the facility of aqueous outflow. When relatively large numbers of subjects were tested with topical steroids, so that a wide range of responsiveness could be observed, a variation in individual sensitivity was demonstrated. Frequency distributions of intraocular pressure or change in pressure following steroids showed a skew toward the high side. On the basis of trimodal characteristics which they observed in such frequency distributions, Becker and Hahn (1964), Becker (1965) and Armaly (1965, 1966) considered the possible existence of several genetically determined subpopulations. These investigators distinguished three subpopulations on the basis of low, intermediate, and high levels of pressure response. It was hypothesized that these levels of response characterized three phenotypes, corresponding to the three possible genotypes of an allele pair, wherein one member of the pair determined a low level of response, and the other member determined a high level of response (Armaly, 1967).


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yance Feng ◽  
Lei M. Li

Background: Normalization of RNA-seq data aims at identifying biological expression differentiation between samples by removing the effects of unwanted confounding factors. Explicitly or implicitly, the justification of normalization requires a set of housekeeping genes. However, the existence of housekeeping genes common to a very large collection of samples, especially under a wide range of conditions, is questionable. Results: We propose to carry out pairwise normalization with respect to multiple references, selected from representative samples. The pairwise intermediates are then integrated based on a linear model that adjusts for the reference effects. Motivated by the notion of housekeeping genes and their statistical counterparts, we adopt the robust least trimmed squares regression in pairwise normalization. The proposed method (MUREN) is compared with other existing tools on some standard data sets. The evaluation of normalization emphasizes preserving possible asymmetric differentiation, whose biological significance is exemplified by single-cell data on the cell cycle. MUREN is implemented as an R package. The code, under the GPL-3 license, is available on GitHub at github.com/hippo-yf/MUREN and on conda at anaconda.org/hippo-yf/r-muren. Conclusions: MUREN performs RNA-seq normalization using a two-step statistical regression induced from a general principle. We propose that the densities of pairwise differentiations be used to evaluate the goodness of normalization. MUREN adjusts the mode of differentiation toward zero while preserving the skewness due to biological asymmetric differentiation. Moreover, by robustly integrating pre-normalized counts with respect to multiple references, MUREN is immune to individual outlier samples.


2021 ◽  
Vol 11 (2) ◽  
Author(s):  
Suzanne V Saenko ◽  
Dick S J Groenenberg ◽  
Angus Davison ◽  
Menno Schilthuizen

Abstract: Studies on the shell color and banding polymorphism of the grove snail Cepaea nemoralis and the sister taxon Cepaea hortensis have provided compelling evidence for the fundamental role of natural selection in promoting and maintaining intraspecific variation. More recently, Cepaea has been the focus of citizen science projects on shell color evolution in relation to climate change and urbanization. C. nemoralis is particularly useful for studies on the genetics of shell polymorphism and the evolution of “supergenes,” as well as evo-devo studies of shell biomineralization, because it is relatively easily maintained in captivity. However, an absence of genomic resources for C. nemoralis has generally hindered detailed genetic and molecular investigations. We therefore generated ∼23× coverage long-read data for the ∼3.5 Gb genome, and produced a draft assembly composed of 28,537 contigs with an N50 length of 333 kb. Genome completeness, estimated by BUSCO using the metazoa dataset, was 91%. Repetitive regions cover over 77% of the genome. A total of 43,519 protein-coding genes were predicted in the assembled genome, and 97.3% of these were functionally annotated from either sequence homology or protein signature searches. This first assembled and annotated genome sequence for a helicoid snail, a large group that includes edible species, agricultural pests, and parasite hosts, will be a core resource for identifying the loci that determine the shell polymorphism, as well as in a wide range of analyses in evolutionary and developmental biology, and snail biology in general.
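The N50 statistic quoted for the draft assembly is the length of the contig at which the cumulative length of contigs, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the function name `n50` and the example lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest, and the length of the
    contig at which the running total first reaches half of the summed
    length is returned.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Illustrative lengths only.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A higher N50 for the same total length indicates a less fragmented assembly, which is why it is usually reported together with the contig count.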


Parasitology ◽  
1915 ◽  
Vol 8 (1) ◽  
pp. 11-16 ◽  
Author(s):  
L. E. Robinson

Variability in the size and, in a lesser degree, the taxonomic features of male ticks, has arrested the attention of all who have had occasion to examine moderately large numbers of examples of the same species. In the case of the female tick, this variability, though doubtless coextensive with that of the male, is more or less obscured by the wide range of variation in size, depending upon the degree of engorgement; and, also, by the fact that in the female tick the taxonomic characters are, as a rule, less pronounced. The present note is only concerned with variability in the size of the male.


2017 ◽  
Vol 114 (31) ◽  
pp. 8265-8270 ◽  
Author(s):  
Simon Olsson ◽  
Hao Wu ◽  
Fabian Paul ◽  
Cecilia Clementi ◽  
Frank Noé

Accurate mechanistic description of structural changes in biomolecules is an increasingly important topic in structural and chemical biology. Markov models have emerged as a powerful way to approximate the molecular kinetics of large biomolecules while keeping full structural resolution in a divide-and-conquer fashion. However, the accuracy of these models is limited by that of the force fields used to generate the underlying molecular dynamics (MD) simulation data. Whereas the quality of classical MD force fields has improved significantly in recent years, remaining errors in the Boltzmann weights are still on the order of a few kT, which may lead to significant discrepancies when comparing to experimentally measured rates or state populations. Here we take the view that simulations using a sufficiently good force field sample conformations that are valid but have inaccurate weights, yet these weights may be made accurate by incorporating experimental data a posteriori. To do so, we propose augmented Markov models (AMMs), an approach that combines concepts from probability theory and information theory to consistently treat systematic force-field error and statistical errors in simulation and experiment. Our results demonstrate that AMMs can reconcile conflicting results for protein mechanisms obtained by different force fields and correct for a wide range of stationary and dynamical observables even when only equilibrium measurements are incorporated into the estimation process. This approach constitutes a unique avenue to combine experiment and computation into integrative models of biomolecular structure and dynamics.


1913 ◽  
Vol 17 (2) ◽  
pp. 117-131
Author(s):  
Hans Zinsser

The experiments recorded in this paper confirm the observations of Friedberger that acutely toxic bodies can be produced from typhoid bacilli by the action of sensitizer and complement and that, when small quantities of bacteria are used, an excess of sensitization either interferes with the formation of the poisons or leads to a cleavage of the bacterial proteid beyond the poisonous intermediate products spoken of as anaphylatoxins. Unlike the experience of other workers with poisons of this nature, however, our experiments have shown that the action of complement upon typhoid bacilli strongly sensitized or not at all sensitized may be carried on, at body temperature, for considerably longer than twelve hours without leading to a destruction of the poisons, and that this is true when the quantities of the bacteria used vary within the wide range of from one to twelve agar slants. It has been found, in fact, that in the case of this microorganism prolonged exposure at the higher temperature of considerable quantities of bacteria constitutes an unfailing method of regularly obtaining powerful poisons. The results obtained by the use of smaller quantities and the less vigorous complement action at low temperatures are far less regular or satisfactory. It would appear from this that complement action of considerable vigor is required to obtain from this bacillus any appreciable yield of anaphylatoxin, and that the poison, once formed, is not as unstable as that found in other microorganisms by Neufeld and Dold and others. In fact, although we have never observed complete lysis in vitro of the typhoid bacilli treated with antibody and complement, the sensitized bacteria exposed to the action of complement for as long as fifteen hours at 37.5° C. showed, in our experiments, much disintegration, and yet powerful poisons were present. 
Were the influence of lysis or of the too vigorous action of the serum bodies as rapidly poison-destroying in the case of this bacillus as it has been shown to be in the case of some other bacteria, it would be hard to understand how anaphylatoxins could play any part in the toxemia of typhoid fever. This phase of our experiments, however, seems to indicate that the conditions prevailing in the infected body at the height of this disease would furnish ideal criteria for anaphylatoxin production, since, in such cases, vigorously sensitized bacilli, in large numbers, are under the prolonged influence of considerable quantities of complement, conditions exactly comparable to those prevailing in our experiments. Granted that this state of affairs is actually the case, then the subsidence of the disease might depend merely upon limitation of the supply of antigen, as the increasing bactericidal action of the blood constituents comes into play, and upon the consequent diminution of the anaphylatoxin. For as the bacteria diminish and the sensitizer increases, a changed proportion between them is established which, finally, as experiment has shown, results in a failure of anaphylatoxin production. For although our experiments have shown that, within a wide latitude of relative proportions of bacteria and antibody, anaphylatoxin can be formed, beyond this range an excess of one or the other element eventually will prevent its formation. It is not, however, the purpose of this paper to discuss the mechanism of the subsidence of the disease since this phase of the work will necessitate further experimental study. In regard to the experiments with kaolin, we were unable to confirm the contention of Keysser and Wassermann, though it is more than likely that toxic bodies could be formed by the action of complement upon any foreign proteid rendered amenable to its action. 
We are not inclined to attribute too much importance to these negative results, recording them merely as they occurred. However, should it be found subsequently that anaphylatoxins can be formed in this way, it seems unlikely that they are formed from the sensitizer or amboceptor as matrix, since this was not specifically adsorbed out of concentrated serum by the kaolin in our experiments. On the basis of experiments with so called endotoxins, we feel that the existence of such preformed intracellular poisons as an element in typhoid toxemia has not been proved, and is not absolutely necessary for the explanation of the phenomena occurring in this disease. However, the diarrhea, the hemorrhagic lesions, and the protracted symptoms following the injection of extracts and filtrates of the bacillus, differing so strikingly from the acute illness with rapid death or equally rapid recovery resulting from anaphylatoxin poisoning, would justify the assumption that poisons of this nature may still play a part in the disease, adding an additional specific characteristic to the clinical picture. As stated before, however, it is not improbable that all these characteristics may represent merely a more protracted or subacute state of anaphylatoxin toxemia. The experiments with autolysates, although none of them were fatal in their results upon guinea pigs, have sufficiently indicated that poisons comparable to anaphylatoxins can be formed in this way. This would indicate that a reaction of proteolysis, which may take place slowly by autolysis, is hastened by the action of complement, and its velocity is still further augmented by the increase, within certain limits, of the sensitization, a conception which would attribute to the combined action of complement and sensitizer a function not incomparable to that of the bodies spoken of as catalytic agents.


2017 ◽  
Vol 119 (14) ◽  
pp. 1-7 ◽  
Author(s):  
Gary Natriello ◽  
Karen Zumwalt

The need for large numbers of individuals who can serve as effective teachers for the nation's young people has generated continuing interest in the recruitment, preparation, and retention of talented teachers for the past half-century, particularly since the civil rights and women's rights revolutions opened a wide range of career opportunities to many for whom teaching was historically one of the few fields available. Among the policy options under development in recent decades have been alternative routes into teaching, typically preparation experiences that differ in form and/or format from the established college-based certification programs. In this Teachers College Record Yearbook, we present the results of a longitudinal examination of one early alternative route program developed by the state of New Jersey. The New Jersey Provisional Teacher Program (or Alternate Route) is of particular interest both because it was the first of a generation of such programs created by various states in the final years of the 20th century and because its creation surfaced a range of issues and tensions that all the programs following in its wake have experienced.

