sequence comparisons
Recently Published Documents


TOTAL DOCUMENTS

1095
(FIVE YEARS 146)

H-INDEX

84
(FIVE YEARS 6)

PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0260413
Author(s):  
Francesco Saverio Tarantini ◽  
Mara Brunati ◽  
Anna Taravella ◽  
Lucia Carrano ◽  
Francesco Parenti ◽  
...  

As part of a screening programme for antibiotic-producing bacteria, a novel Actinomadura species was discovered from a soil sample collected in Santorini, Greece. Preliminary 16S rRNA gene sequence comparisons highlighted Actinomadura macra as the most similar characterised species. However, whole-genome sequencing revealed an average nucleotide identity (ANI) value of 89% with A. macra, the highest among related species. Further phenotypic and chemotaxonomic analyses confirmed that the isolate represents a previously uncharacterised species in the genus Actinomadura, for which the name Actinomadura graeca sp. nov. is proposed (type strain 32-07T). The G+C content of A. graeca 32–07 is 72.36%. The cell wall contains DL-diaminopimelic acid, intracellular sugars are glucose, ribose and galactose, the predominant menaquinone is MK-9(H6), the major cellular lipid is phosphatidylinositol and fatty acids consist mainly of hexadecanoic acid. No mycolic acid was detected. Furthermore, A. graeca 32–07 has been confirmed as a novel producer of the non-ribosomal peptide antibiotic zelkovamycin and we report herein a provisional description of the unique biosynthetic gene cluster.
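The G+C content quoted above is the percentage of guanine and cytosine bases in the genome. A minimal sketch of that calculation follows; the function name and the toy sequence are illustrative assumptions, not taken from the paper, and ambiguous bases are simply skipped.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among the unambiguous A/C/G/T bases."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    toy = "GCGCGGCCATGCGC"  # made-up sequence, not from A. graeca
    print(f"G+C content: {gc_content(toy):.2f}%")
```

On a real assembly the same ratio would be computed over all contigs together.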


2021 ◽  
Author(s):  
Ari Löytynoja

Variation within human genomes is distributed unevenly and variants show spatial clustering. DNA-replication related template switching is a poorly known mutational mechanism capable of causing major chromosomal rearrangements as well as creating short inverted sequence copies that appear as local mutation clusters in sequence comparisons. We reanalyzed haplotype-resolved genome assemblies representing 25 human populations and multinucleotide variants aggregated from 140,000 human sequencing experiments. We found local template switching to explain thousands of complex mutation clusters across the human genome, the loci segregating within and between populations with a small number appearing as de novo mutations. We developed computational tools for genotyping candidate template switch loci using short-read sequencing data and for identification of template switch events using both short-read data and genotype data. These tools will enable building a catalogue of affected loci and studying the cellular mechanisms behind template switching both in healthy organisms and in disease. Strikingly, we noticed that widely-used analysis pipelines for short-read sequencing data - capable of identifying single nucleotide changes - may miss TSM-origin inversions of tens of base pairs, potentially invalidating medical genetic studies searching for causative alleles behind genetic diseases.


Author(s):  
Pieter-Jan Kerkhof ◽  
Stephen L. W. On ◽  
Kurt Houf

A study on the polyphasic taxonomic classification of an Arcobacter strain, R-73987T, isolated from the rectal mucus of a porcine intestinal tract, was performed. Phylogenetic analysis based on the 16S rRNA gene sequence revealed that the strain could be assigned to the genus Arcobacter and suggested that strain R-73987T belongs to a novel undescribed species. Comparative analysis of the rpoB gene sequence confirmed the findings. Arcobacter faecis LMG 28519T was identified as its closest neighbour in a multigene analysis based on 107 protein-encoding genes. Further, whole-genome sequence comparisons by means of average nucleotide identity and in silico DNA–DNA hybridization between the genome of strain R-73987T and the genomes of validly named Arcobacter species resulted in values below 95–96 % and 70 %, respectively. In addition, a phenotypic analysis further corroborated the conclusion that strain R-73987T represents a novel Arcobacter species, for which the name Arcobacter vandammei sp. nov. is proposed. The type strain is R-73987T (=LMG 31429T=CCUG 75005T). This appears to be the first Arcobacter species recovered from porcine intestinal mucus.
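The species-delineation logic in this abstract (ANI below 95–96 % and in silico DNA–DNA hybridization below 70 % against every validly named relative) can be sketched as a simple threshold check. The function names and the example values are hypothetical. The cutoff constants come from the abstract, with the lower bound of the ANI range used as a conservative assumption.

```python
ANI_SPECIES_CUTOFF = 95.0   # percent; lower bound of the 95-96 % range quoted above
DDDH_SPECIES_CUTOFF = 70.0  # percent; in silico DNA-DNA hybridization cutoff


def below_species_thresholds(ani_percent: float, dddh_percent: float) -> bool:
    """True if both values fall below the species-level cutoffs."""
    return ani_percent < ANI_SPECIES_CUTOFF and dddh_percent < DDDH_SPECIES_CUTOFF


def is_candidate_novel_species(comparisons) -> bool:
    """comparisons: iterable of (reference_name, ani_percent, dddh_percent)."""
    comparisons = list(comparisons)
    if not comparisons:
        raise ValueError("at least one reference genome comparison is required")
    return all(below_species_thresholds(ani, dddh) for _, ani, dddh in comparisons)


if __name__ == "__main__":
    # Hypothetical values for illustration only; not results from the study.
    toy = [("reference A", 88.1, 35.0), ("reference B", 80.4, 24.2)]
    print(is_candidate_novel_species(toy))
```

Such a check only flags a candidate; the abstract also relies on phylogenetic and phenotypic evidence.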


2021 ◽  
Author(s):  
Michael Heinzinger ◽  
Maria Littmann ◽  
Ian Sillitoe ◽  
Nicola Bordin ◽  
Christine Orengo ◽  
...  

Thanks to the recent advances in protein three-dimensional (3D) structure prediction, in particular through AlphaFold 2 and RoseTTAFold, the abundance of protein 3D information will explode over the next year(s). Expert resources based on 3D structures such as SCOP and CATH have been organizing the complex sequence-structure-function relations into a hierarchical classification schema. Experimental structures are leveraged through multiple sequence alignments, or more generally through homology-based inference (HBI) transferring annotations from a protein with experimentally known annotation to a query without annotation. Here, we presented a novel approach that expands the concept of HBI from a low-dimensional sequence-distance lookup to the level of a high-dimensional embedding-based annotation transfer (EAT). Secondly, we introduced a novel solution using single protein sequence representations from protein Language Models (pLMs), so-called embeddings (Prose, ESM-1b, ProtBERT, and ProtT5), as input to contrastive learning, by which a new set of embeddings was created that optimized constraints captured by hierarchical classifications of protein 3D structures. These new embeddings (dubbed ProtTucker) clearly improved what was historically referred to as threading or fold recognition. Thereby, the new embeddings enabled the intrusion into the midnight zone of protein comparisons, i.e., the region in which the level of pairwise sequence similarity is akin to random relations and therefore is hard to navigate by HBI methods. Cautious benchmarking showed that ProtTucker reached much further than advanced sequence comparisons without the need to compute alignments, allowing it to be orders of magnitude faster. Code is available at https://github.com/Rostlab/EAT .
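The core idea of embedding-based annotation transfer, as described above, is to copy the annotation of the nearest labelled protein in embedding space instead of the best alignment hit. The sketch below illustrates only that lookup step. It is not the code from the linked repository; the Euclidean metric, the function names, the labels, and the three-dimensional toy vectors are all assumptions (real pLM embeddings have hundreds to thousands of dimensions).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def transfer_annotation(query, lookup):
    """Return (label, distance) of the labelled embedding closest to the query.

    lookup: list of (label, embedding) pairs with known annotations.
    """
    if not lookup:
        raise ValueError("lookup set is empty")
    label, vector = min(lookup, key=lambda item: euclidean(query, item[1]))
    return label, euclidean(query, vector)


if __name__ == "__main__":
    # Toy, hand-made embeddings with hypothetical class labels.
    lookup = [("class_A", [0.0, 0.0, 0.0]), ("class_B", [1.0, 1.0, 1.0])]
    print(transfer_annotation([0.9, 1.0, 1.1], lookup))
```

The returned distance can serve as a rough confidence proxy: a hit that lies far from the query is a weaker basis for transferring its annotation.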


Agronomy ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2297
Author(s):  
Mark J. Quinton-Tulloch ◽  
Katherine A. Steele

Plant resistance genes (R-genes) drive the immune responses of crops against specific pathotypes of disease-causing organisms. Over time, genetic diversity in R-genes and R-pseudogenes has arisen among different rice varieties. This bioinformatics study was carried out to (i) predict the full sets of candidate nucleotide-binding site leucine-rich repeat (NLR) R-genes present in six rice genomes; (ii) detect variation within candidate R-genes; (iii) identify potential selectable markers within and near to LRR genes among 75 diverse indica rice genomes. Four high quality indica genomes, plus the standard japonica and indica reference genomes, were analysed with widely available bioinformatic tools to identify candidate R-genes and R-pseudogenes. They were detected in clusters, consistent with previous studies. BLAST analysis of cloned protein sequences of 31 R-gene loci gave confidence in this approach for detection of cloned NLR R-genes. Approximately 10% of candidate R-genes were located within 1 kb of a microsatellite (SSR) marker. Sequence comparisons among indica rice genomes detected SNPs or InDels in 334 candidate rice R-genes. There were significantly more SNPs and InDels within the identified NLR R-gene candidates than in other types of gene. The genome-wide locations of candidate R-genes and their associated markers are presented here for the potential future development of improved disease-resistant varieties. Limitations of in silico approaches used for R-gene discovery are discussed.


2021 ◽  
Author(s):  
Evan R Stark-Dykema ◽  
Eden A. Dulka ◽  
Emma R Gerlinger ◽  
Jacob L Mueller

Mammalian sex chromosomes are enriched for large, nearly-identical, palindromic sequences harboring genes expressed predominately in testicular germ cells. Discerning if individual palindrome-associated gene families are essential for male reproduction is difficult due to challenges in disrupting all copies within a gene family. Here we generate precise, independent deletions to assess the reproductive roles of two X-linked palindromic gene families with spermatid-predominant expression, 4930567H17Rik and Mageb5. Via sequence comparisons, we find mouse 4930567H17Rik and Mageb5 have human orthologs, 4930567H17Rik is rapidly diverging in rodents and primates, and 4930567H17Rik is harbored in a palindrome in humans and mice, while Mageb5 is not. Mice lacking either the 4930567H17Rik or the Mageb5 gene family do not have detectable defects in male fertility, fecundity, spermatogenesis, or in gene regulation, but do show differences in sperm head morphology, suggesting a potential role in sperm function. We conclude that while not all palindrome-associated gene families are essential for male fertility, large palindromes influence the evolution of their associated gene families.


2021 ◽  
Vol 118 (46) ◽  
pp. e2107335118
Author(s):  
Jiangfeng Zhao ◽  
Hao Xie ◽  
Ahmad Reza Mehdipour ◽  
Schara Safarian ◽  
Ulrich Ermler ◽  
...  

Multidrug and toxic compound extrusion (MATE) transporters are widespread in all domains of life. Bacterial MATE transporters confer multidrug resistance by utilizing an electrochemical gradient of H+ or Na+ to export xenobiotics across the membrane. Despite the availability of X-ray structures of several MATE transporters, a detailed understanding of the transport mechanism has remained elusive. Here we report the crystal structure of a MATE transporter from Aquifex aeolicus at 2.0-Å resolution. In light of its phylogenetic placement outside of the diversity of hitherto-described MATE transporters and the lack of conserved acidic residues, this protein may represent a subfamily of prokaryotic MATE transporters, which was proven by phylogenetic analysis. Furthermore, the crystal structure and substrate docking results indicate that the substrate binding site is located in the N bundle. The importance of residues surrounding this binding site was demonstrated by structure-based site-directed mutagenesis. We suggest that Aq_128 is functionally similar to but structurally distinct from DinF subfamily transporters. Our results provide structural insights into the MATE transporter, which further advance our global understanding of this important transporter family.


Author(s):  
Zilong Zhang ◽  
Danlei Liu ◽  
Zilei Zhang ◽  
Peng Tian ◽  
Shenwei Li ◽  
...  

Norovirus is recognized as one of the leading causes of acute gastroenteritis outbreaks. Genotype GII.9 was first detected in Norfolk, VA, USA, in 1997. However, the complete genome sequence of this genotype has not yet been determined. In this study, a complete genome sequence of GII.9[P7] norovirus, SCD1878_GII.9[P7], from a patient was determined using high-throughput sequencing and rapid amplification of cDNA ends (RACE) technology. The complete genome sequence of SCD1878_GII.9[P7] is 7544 nucleotides (nt) in length with a 3’ poly(A) tail and contains three open reading frames. Sequence comparisons indicated that SCD1878_GII.9[P7] shares 92.1%-92.3% nucleotide sequence identity with GII.P7 (AB258331 and AB039777) and 96.7%-97.4% identity with GII.9 (AY038599 and DQ379715). The results suggested that SCD1878_GII.9[P7] is a member of P genotype GII.P7 and G genotype GII.9. This viral sequence fills a gap at the whole-genome level for the GII.9 genotype.
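Nucleotide sequence identity figures like those above are typically derived from a pairwise alignment. A minimal sketch of one common definition follows, counting matches over alignment columns where neither sequence has a gap. Tools differ in how they treat gaps, so this convention, the function name, and the toy sequences are assumptions rather than the study's method.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Made-up aligned fragments, not norovirus sequences.
    print(f"{percent_identity('ACGTAC-TTG', 'ACGAACGTTG'):.1f}%")
```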


2021 ◽  
Author(s):  
Kristoffer Sahlin

k-mer-based methods are widely used in bioinformatics for various types of sequence comparisons. However, a single mutation will mutate k consecutive k-mers and make most k-mer-based applications for sequence comparison sensitive to variable mutation rates. Many techniques have been studied to overcome this sensitivity, for example, spaced k-mers and k-mer permutation techniques, but these techniques do not handle indels well. For indels, pairs or groups of small k-mers are commonly used, but these methods first produce k-mer matches, and only in a second step, a pairing or grouping of k-mers is performed. Such techniques produce many redundant k-mer matches owing to the size of k. Here, we propose strobemers as an alternative to k-mers for sequence comparison. Intuitively, strobemers consist of two or more linked shorter k-mers, where the combination of linked k-mers is decided by a hash function. We use simulated data to show that strobemers provide more evenly distributed sequence matches and are less sensitive to different mutation rates than k-mers and spaced k-mers. Strobemers also produce higher match coverage across sequences. We further implement a proof-of-concept sequence-matching tool StrobeMap and use synthetic and biological Oxford Nanopore sequencing data to show the utility of using strobemers for sequence comparison in different contexts such as sequence clustering and alignment scenarios.
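The sensitivity described above can be seen directly: one substitution alters every k-mer that overlaps it, up to k of them. The sketch below demonstrates this and then builds a simplified two-strobe seed in which a hash picks the second k-mer from a downstream window. It is not the StrobeMap implementation; the hash-of-concatenation selection rule, the window parameters, and all names are assumptions made for illustration.

```python
import hashlib


def kmers(seq: str, k: int):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def changed_kmer_positions(seq: str, pos: int, new_base: str, k: int):
    """Start positions of k-mers altered by substituting new_base at pos."""
    mutated = seq[:pos] + new_base + seq[pos + 1:]
    return [i for i, (a, b) in enumerate(zip(kmers(seq, k), kmers(mutated, k))) if a != b]


def stable_hash(text: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


def two_strobe_seeds(seq: str, k: int, w_min: int, w_max: int):
    """Simplified order-2 seeds: (first_start, second_start, linked_string).

    The second k-mer is the one in [i + w_min, i + w_max] that minimises a hash
    of the concatenated pair (an assumed selection rule). Use w_min >= k so the
    two strobes do not overlap.
    """
    seeds = []
    last_start = len(seq) - k
    for i in range(len(seq) - k + 1):
        lo, hi = i + w_min, min(i + w_max, last_start)
        if lo > hi:
            break  # no room left for a second strobe
        first = seq[i:i + k]
        j = min(range(lo, hi + 1), key=lambda p: stable_hash(first + seq[p:p + k]))
        seeds.append((i, j, first + seq[j:j + k]))
    return seeds


if __name__ == "__main__":
    seq = "ACGTACGTACGTACGT"  # toy sequence
    print(changed_kmer_positions(seq, 8, "T", 5))  # k-mers overlapping position 8
    print(two_strobe_seeds(seq, 3, 3, 6)[:3])
```

Because the two strobes are separated by a variable gap, a mutation falling between them leaves the seed intact, which is the intuition behind the more even match distribution reported in the abstract.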


Plant Disease ◽  
2021 ◽  
Author(s):  
Xing Ma ◽  
Jessie Brazil ◽  
Hannah M Rivedal ◽  
Keith L. Perry ◽  
Kenneth Frost ◽  
...  

Potato (Solanum tuberosum cv. Norkotah) tubers with symptoms of soft rot were submitted to Oregon State University, Hermiston Agricultural Research and Extension Center Plant Clinic in 2019. One submission, in May, originated from a field with poor emergence and seed piece decay (~20% affected) in Umatilla County, Oregon. The second submission, in September, originated from a field in Washington. From each submission, ~100 mg tissue at the margin of infection was washed with distilled water, excised, and macerated in 500 μl sterile distilled water for 5 minutes. The resulting solution was streaked on crystal violet pectate (CVP) medium and incubated at 28°C for 24 hours. One colony, representative of the many white colonies that formed depressions on CVP plates, was isolated from each submission. Bacterial isolates from Oregon and Washington were named JB56A and JB133A, respectively, and preserved in Luria-Bertani (LB) broth with 15% glycerol at -80°C for long-term storage. Genomic DNA was extracted from JB56A and JB133A cultures grown in LB broth overnight at 30°C using the Wizard SV Genomic DNA kit. The partial dnaX gene (537 bp) was amplified from genomic DNA of each isolate using dnaXf/dnaXr primers (Slawiak et al. 2009) and sequenced. These sequences were deposited in the NCBI GenBank Database under accession numbers MW930747 (JB56A) and MW930748 (JB133A). BLAST analyses (Altschul et al. 1990) using default parameters indicated that the dnaX sequences of JB56A and JB133A were 99.2% (533/537) and 98.7% (530/537) identical, respectively, to that of Pectobacterium versatile SCC1 (CP021894). A condensed maximum likelihood tree was built using the partial dnaX sequence of the two query strains, twelve Pectobacterium reference strains to include all known species of Pectobacterium, and four Dickeya species as an outgroup (Fig. S1). JB56A and JB133A formed a monophyletic clade with P. versatile SCC1. Potato (cv. Upstate Abundance) tuber and stem bioassays (Ma et al. 2018) were conducted twice to assess the pathogenicity of these isolates. Tubers were wounded with a sterile 2 mm wide wooden applicator stick and 5 μl culture grown in LB broth overnight (~10^9 CFU) was pipetted into the wound. Tubers were incubated at 29°C for 24 hours and cut through puncture sites to observe symptoms. Stems of four- or five-week-old plants were wounded with a sterile toothpick about 10 cm above the soil line, and a smear of JB56A or JB133A grown on LB agar was inserted into the wound using a toothpick; plants were then incubated in a greenhouse for 72 hours. Positive controls (D. dianthicola ME23) and negative controls (no bacteria) were included in both assays. Tubers and stems exhibited disease symptoms after 24 and 72 hours, respectively, following inoculation with JB56A, JB133A, and D. dianthicola ME23. No symptoms were observed for negative controls. The identity of bacteria re-isolated from the margin of stem lesions was confirmed by partial dnaX sequence analyses. P. versatile was recently described as a distinct species based on whole genome sequence comparisons (Portier et al. 2019). In 2018, we isolated P. versatile from potato stems with blackleg disease in New York, and a recent study found that it was isolated in the US from an iris in 1946 (Ma et al. 2021; Portier et al. 2019). However, the geographic distribution and importance of this pathogen in the US remain largely unknown. To our knowledge, this is the first report of potato soft rot caused by P. versatile in Oregon and Washington, two important potato producing states.

