New Method for Sequence Similarity Analysis Based on the Position and Frequency of Statistically Significant Repeats

Current Bioinformatics ◽

10.2174/1574893616999210805165628 ◽

2021 ◽

Vol 16 ◽

Author(s):

Jasmina T. Jovanovic

Keyword(s):

Nucleotide Sequence ◽

Clustering Algorithm ◽

Sequence Similarity ◽

Random Sequence ◽

Nucleotide Sequences ◽

Similarity Analysis ◽

Frequency Method ◽

Nucleotide Sequence Similarity ◽

Alignment Free ◽

Multiple Data Sets

Background: The analysis of DNA nucleotide sequence similarity among different species is crucial in identifying their functional, structural or evolutionary relationships. The number of bioinformatics tools designed to perform the similarity analysis of nucleotide sequences has been growing rapidly. According to the current literature, alignment-free methods have not been applied to nucleotide sequence repeats of different lengths. Objective: To develop a new algorithm for determining sequence characteristics and similarity based on statistically significant repetitive elements of different lengths, which are located in the analyzed sequences. Method: This paper presents the Repeats-Position/Frequency method (R-P/F method) for determining nucleotide sequence similarity, which takes into consideration statistically significant repetitive parts of the analyzed sequences. It is based on information theory and the fact that both the position and the frequency of repeated sequences are not expected to occur with the identical presence in a random sequence of the same length. Nucleotide sequences are represented in rn-dimensional vector space and their hierarchy is constructed by applying a hierarchical clustering algorithm. Results: The R-P/F method has been validated on multiple data sets of nucleotide sequences and compared with results obtained from the alignment-based algorithms BLAST and Clustal Omega, and with multiple well-established alignment-free dissimilarity measures. The presented method provides results comparable with other commonly used methods focused on resolving the same problem, with a new view on the repetitive parts of sequences used in these calculations. Conclusion: The presented novel algorithm for calculating a sequence similarity measure is effective in discovering relationships among the sequences and makes a powerful and complementary addition to existing sequence similarity methods.
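The general alignment-free workflow described in the abstract (sequences mapped to vectors, then clustered hierarchically) can be illustrated with a minimal sketch. This is not the R-P/F method: it swaps in plain k-mer frequencies instead of statistically significant repeats, uses Euclidean distance and single linkage as assumed choices, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_vector(seq, k=2):
    """Return normalized k-mer frequencies in a fixed lexicographic order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Only k-mers made of A/C/G/T are counted; "or 1" avoids division by zero.
    total = sum(counts[m] for m in kmers) or 1
    return [counts[m] / total for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def single_linkage(vectors):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = {i: [i] for i in range(len(vectors))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for x in range(len(keys)):
            for y in range(x + 1, len(keys)):
                a, b = keys[x], keys[y]
                # Single linkage: distance between the closest pair of members.
                d = min(euclidean(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


# Toy sequences, invented for illustration only.
seqs = ["ACACACACAC", "ACACACACAT", "GGGGGTTTTT"]
merges = single_linkage([kmer_vector(s) for s in seqs])
```

In this toy input the two AC-rich sequences are merged first, and the GT-rich sequence joins last. A method such as the one in the abstract would differ mainly in how the vector is built, not in the clustering step.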


Functional promoter modules can be detected by formal models independent of overall nucleotide sequence similarity

Bioinformatics ◽

10.1093/bioinformatics/15.3.180 ◽

1999 ◽

Vol 15 (3) ◽

pp. 180-186 ◽

Cited By ~ 81

Author(s):

A Klingenhoff ◽

K Frech ◽

K Quandt ◽

T Werner

Keyword(s):

Nucleotide Sequence ◽

Sequence Similarity ◽

Formal Models ◽

Nucleotide Sequence Similarity


Isolation of a RT-PCR fragment from human colon and sheep rumen RNA with nucleotide sequence similarity to human and rat urea transporter isoforms

Biochemical Society Transactions ◽

10.1042/bst026s122 ◽

1998 ◽

Vol 26 (2) ◽

pp. S122-S122 ◽

Cited By ~ 30

Author(s):

ARMIN RITZHAUPT ◽

I. STUART WOOD ◽

ALLAN A. JACKSON ◽

BRENDAN J. MORAN ◽

SORAYA P. SHIRAZI-BEECHEY

Keyword(s):

Nucleotide Sequence ◽

Sequence Similarity ◽

Human Colon ◽

Urea Transporter ◽

Rt Pcr ◽

Nucleotide Sequence Similarity ◽

Sheep Rumen


An Open Reading Frame Downstream of Rhizobium meliloti nodQ1 Shows Nucleotide Sequence Similarity to an Agrobacterium tumefaciens Insertion Sequence

Molecular Plant-Microbe Interactions ◽

10.1094/mpmi-7-0151 ◽

1994 ◽

Vol 7 (1) ◽

pp. 151 ◽

Cited By ~ 12

Author(s):

Julie Schwedock

Keyword(s):

Nucleotide Sequence ◽

Sequence Similarity ◽

Open Reading Frame ◽

Reading Frame ◽

Nucleotide Sequence Similarity ◽

Rhizobium Meliloti


A novel alignment-free DNA sequence similarity analysis approach based on top-k n-gram match-up

Journal of Molecular Graphics and Modelling ◽

10.1016/j.jmgm.2020.107693 ◽

2020 ◽

Vol 100 ◽

pp. 107693

Author(s):

Emre Delibaş ◽

Ahmet Arslan ◽

Abdulkadir Şeker ◽

Banu Diri

Keyword(s):

Dna Sequence ◽

Sequence Similarity ◽

Analysis Approach ◽

Similarity Analysis ◽

Alignment Free ◽

Free Dna ◽

N Gram ◽

Sequence Similarity Analysis


Partial Nucleotide Sequence Similarity within Species of Mycoplasma and Acholeplasma

Microbiology ◽

10.1099/00221287-121-2-333 ◽

1980 ◽

Vol 121 (2) ◽

pp. 333-338

Author(s):

W. M. SUGINO ◽

R. C. WEK ◽

D. T. KINGSBURY

Keyword(s):

Nucleotide Sequence ◽

Sequence Similarity ◽

Partial Nucleotide Sequence ◽

Nucleotide Sequence Similarity


A Novel Method for Alignment-free DNA Sequence Similarity Analysis Based on the Characterization of Complex Networks

Evolutionary Bioinformatics ◽

10.4137/ebo.s40474 ◽

2016 ◽

Vol 12 ◽

pp. EBO.S40474 ◽

Cited By ~ 4

Author(s):

Jie Zhou ◽

Pianyu Zhong ◽

Tinghui Zhang

Keyword(s):

Complex Networks ◽

Dna Sequence ◽

Sequence Similarity ◽

Similarity Analysis ◽

Alignment Free ◽

Free Dna ◽

Novel Method ◽

Sequence Similarity Analysis


Trichothecene Nonproducer Gibberella Species Have Both Functional and Nonfunctional 3-O-Acetyltransferase Genes

Genetics ◽

10.1093/genetics/163.2.677 ◽

2003 ◽

Vol 163 (2) ◽

pp. 677-684

Author(s):

Makoto Kimura ◽

Takeshi Tokai ◽

Gentaro Matsumoto ◽

Makoto Fujimura ◽

Hiroshi Hamamoto ◽

...

Keyword(s):

Nucleotide Sequence ◽

Fission Yeast ◽

Fusarium Graminearum ◽

Sequence Similarity ◽

Fusarium Species ◽

Fungal Genome ◽

Expression Cloning ◽

Nucleotide Sequence Similarity ◽

Cdna Expression ◽

Pcr Techniques

Abstract The trichothecene 3-O-acetyltransferase gene (FgTri101) required for trichothecene production by Fusarium graminearum is located between the phosphate permease gene (pho5) and the UTP-ammonia ligase gene (ura7). We have cloned and sequenced the pho5-to-ura7 regions from three trichothecene nonproducing Fusarium (i.e., F. oxysporum, F. moniliforme, and Fusarium species IFO 7772) that belong to the teleomorph genus Gibberella. BLASTX analysis of these sequences revealed portions of predicted polypeptides with high similarities to the TRI101 polypeptide. While FspTri101 (Fusarium species Tri101) coded for a functional 3-O-acetyltransferase, FoTri101 (F. oxysporum Tri101) and FmTri101 (F. moniliforme Tri101) were pseudogenes. Nevertheless, F. oxysporum and F. moniliforme were able to acetylate C-3 of trichothecenes, indicating that these nonproducers possess another as yet unidentified 3-O-acetyltransferase gene. By means of cDNA expression cloning using fission yeast, we isolated the responsible FoTri201 gene from F. oxysporum; on the basis of this sequence, FmTri201 has been cloned from F. moniliforme by PCR techniques. Both Tri201 showed only a limited level of nucleotide sequence similarity to FgTri101 and FspTri101. The existence of Tri101 in a trichothecene nonproducer suggests that this gene existed in the fungal genome before the divergence of producers from nonproducers in the evolution of Fusarium species.


Genome Sequences of 19 Rhodococcus erythropolis Cluster CA Phages

Genome Announcements ◽

10.1128/genomea.01201-17 ◽

2017 ◽

Vol 5 (49) ◽

Cited By ~ 2

Author(s):

J. Alfred Bonilla ◽

Sharon Isern ◽

Ann M. Findley ◽

Karen K. Klyczek ◽

Scott F. Michael ◽

...

Keyword(s):

Nucleotide Sequence ◽

Complete Genome ◽

Environmental Samples ◽

Sequence Similarity ◽

Rhodococcus Erythropolis ◽

Genome Sequences ◽

Content Type ◽

Nucleotide Sequence Similarity

ABSTRACT We report the complete genome sequences of 19 cluster CA bacteriophages isolated from environmental samples using Rhodococcus erythropolis as a host. All of the phages are Siphoviridae, have similar genome lengths (46,314 to 46,985 bp) and G+C contents (58.5 to 58.8%), and share nucleotide sequence similarity.
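The G+C content quoted in the abstract is the share of G and C bases in a genome. A minimal sketch of that calculation follows; the function name is hypothetical, and skipping ambiguous bases such as N is an assumption, not something the abstract specifies.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    acgt = gc + sum(seq.count(base) for base in "AT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / acgt


# Toy input, invented for illustration: 4 of 8 bases are G or C.
example = gc_content("GGCCAATT")
```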


Molecular characterization of kudoid parasites (Myxozoa: Multivalvulida) from somatic muscles of Pacific bluefin (Thunnus orientalis) and yellowfin (T. albacores) tuna

Acta Parasitologica ◽

10.2478/s11686-013-0130-1 ◽

2013 ◽

Vol 58 (2) ◽

Cited By ~ 8

Author(s):

Niichiro Abe ◽

Tomofumi Maehara

Keyword(s):

Nucleotide Sequence ◽

Sequence Similarity ◽

Food Poisoning ◽

Yellowfin Tuna ◽

Thunnus Orientalis ◽

Public Health Importance ◽

Nucleotide Sequence Similarity ◽

Marine Fishery ◽

Raw Foods ◽

28S Rdna Sequence

Abstract The public health importance of Kudoa infection in fish remains unclear. Recently in Japan a Kudoa species, K. septempunctata, was newly implicated as a causative agent of unidentified food poisoning related to the consumption of raw olive flounder. Other marine fishery products are also suspected as causative raw foods of unidentified food poisoning. For this study, we detected kudoid parasites from sliced raw muscle tissues of a young Pacific bluefin and an adult yellowfin tuna. No cyst or pseudocyst was evident in muscles macroscopically, but pseudocysts were detected in both samples histologically. One substitution (within 1100 bp overlap) and ten substitutions (within 753 bp overlap) were found respectively between the partial sequences of 18S and 28S rDNAs from both isolates. Nucleotide sequence similarity searching of 18S and 28S rDNAs from both isolates showed the highest identity with those of K. neothunni from tuna. Based on the spore morphology, the mode of parasitism, and the nucleotide sequence similarity, these isolates from a Pacific bluefin and a yellowfin tuna were identified as K. neothunni. Phylogenetic analysis of the 28S rDNA sequence revealed that K. neothunni is classifiable into two genotypes: one from Pacific bluefin and the other from yellowfin tuna. Recently, an unidentified kudoid parasite morphologically and genetically similar to K. neothunni was detected from stocked tuna samples in unidentified food poisoning cases in Japan. The possibility exists that K. neothunni, especially from the Pacific bluefin tuna, causes food poisoning, as does K. septempunctata.


Identification of Rhizoctonia solani associated with soybean in Brazil by rDNA-ITS sequences

Fitopatologia Brasileira ◽

10.1590/s0100-41582003000400011 ◽

2003 ◽

Vol 28 (4) ◽

pp. 413-419 ◽

Cited By ~ 15

Author(s):

Roseli C. Fenille ◽

Maísa B. Ciampi ◽

Eiko E. Kuramae ◽

Nilton L. Souza

Keyword(s):

Nucleotide Sequence ◽

Rhizoctonia Solani ◽

Gene Sequence ◽

Sequence Similarity ◽

5.8S Rdna ◽

Nucleotide Sequence Similarity ◽

Rdna Its ◽

Foliar Blight ◽

Rdna Its Sequences ◽

Its1 And Its2

The aim of this study was to identify isolates of Rhizoctonia solani causing hypocotyl rot and foliar blight in soybean (Glycine max) in Brazil by the nucleotide sequences of ITS-5.8S regions of rDNA. The 5.8S rDNA gene sequence (155 bp) was highly conserved among all isolates but differences in length and nucleotide sequence of the ITS1 and ITS2 regions were observed between soybean isolates and AG testers. The similarity of the nucleotide sequence among AG-1 IA isolates, causing foliar blight, was 95.1-100% and 98.5-100% in the ITS1 and ITS2 regions, respectively. The nucleotide sequence similarity among subgroups IA, IB and IC ranged from 84.3 to 89% in ITS1 and from 93.3 to 95.6% in ITS2. Nucleotide sequence similarity of 99.1% and 99.3-100% for ITS1 and ITS2, respectively, was observed between AG-4 soybean isolates causing hypocotyl rots and the AG-4 HGI tester. The similarity of the nucleotide sequence of the ITS-5.8S rDNA region confirmed that the R. solani Brazilian isolates causing foliar blight are AG-1 IA and isolates causing hypocotyl rot symptoms are AG-4 HGI. The ITS-5.8S rDNA sequence was not determinant for the identification of the AG-2-2 IIIB R. solani soybean isolate.
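Similarity percentages like those reported above are typically derived from pairwise alignments. The sketch below shows one common way to compute percent identity from two already-aligned sequences. The function name is hypothetical, and ignoring columns that contain a gap is an assumed convention; the abstract does not say how gaps were treated, and other conventions give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of matching bases over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumed convention: gapped columns are not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment, invented for illustration: 5 matches over 6 ungapped columns.
identity = percent_identity("ACGTAC", "ACGTAA")
```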
