Mining Association Rules in Dengue Gene Sequence with Latent Periodicity

2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Marimuthu Thangam ◽  
Balamurugan Vanniappan

Mining periodic patterns in a dengue database is an interesting research problem whose results can be used to predict the future evolution of dengue viruses. In this paper, we propose an algorithm called Recurrence Finder (RECFIN) that uses a suffix tree to detect periodic patterns in dengue gene sequences. RECFIN also detects palindromes, which indicate the possible formation of proteins. Further, this paper computes the periodicity of nucleic acid and amino acid sequences of any length. The periodicity-based association rules are used to diagnose the type of dengue. The time complexity of the proposed algorithm is O(n²). We demonstrate the effectiveness of the proposed approach by comparing experimental results on a dengue virus serotypes dataset with those of the NCBI-BLAST algorithm.
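As a rough illustration of what "periodic pattern" and "palindrome" can mean for a nucleotide string, the sketch below uses a plain dictionary of k-mer positions instead of a suffix tree. It is not RECFIN: the function names, the fixed k-mer length, the `min_repeats` threshold, and the choice of the reverse-complement definition of a palindrome are all assumptions made for this example.

```python
from collections import defaultdict


def kmer_positions(seq, k):
    """Map every k-mer in seq to the sorted list of its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return dict(positions)


def periodic_kmers(seq, k, min_repeats=3):
    """Return {kmer: period} for k-mers whose occurrences are evenly spaced.

    A k-mer counts as periodic here only if it occurs at least
    min_repeats times and every gap between consecutive occurrences
    is identical (a strict, noise-free notion of periodicity).
    """
    result = {}
    for kmer, pos in kmer_positions(seq, k).items():
        if len(pos) < min_repeats:
            continue
        gaps = {b - a for a, b in zip(pos, pos[1:])}
        if len(gaps) == 1:
            result[kmer] = gaps.pop()
    return result


def is_reverse_palindrome(dna):
    """True if dna equals its own reverse complement (assumed definition)."""
    complement = str.maketrans("ACGT", "TGCA")
    return dna == dna.translate(complement)[::-1]


if __name__ == "__main__":
    sequence = "ACGTTACGTTACGTT"  # toy input, not real dengue data
    print(periodic_kmers(sequence, 3))
    print(is_reverse_palindrome("GAATTC"))
```

The strict equal-gap test is the simplest possible criterion; real sequences would need tolerance for mismatches and missing occurrences.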

Author(s):  
Wen-Chi Hou

Mining market basket data (Agrawal et al. 1993, Agrawal et al. 1994) has received a great deal of attention in the recent past, partly due to its utility and partly due to the research challenges it presents. Market basket data typically consists of store items purchased on a per-transaction basis, but it may also consist of items bought by a customer over a period of time. The goal is to discover buying patterns, such as two or more items that are often bought together. Such findings could aid in marketing promotions and customer relationship management. Association rules reflect a fundamental class of patterns that exist in the data. Consequently, mining association rules in market basket data has become one of the most important problems in data mining. Agrawal et al. (Agrawal et al. 1993, Agrawal et al. 1994) provided the initial foundation for this research problem. Since then, there has been a considerable amount of work (Bayardo et al. 1999, Bayardo et al. 1999, Brin et al. 1997, Han et al. 2000, Park et al. 1995, Srikant et al. 1995, Srikant et al. 1997, Zaki et al. 1997, etc.) on developing faster algorithms to find association rules. While these algorithms differ in efficiency, they all use minsup (minimum support) and minconf (minimum confidence) as the criteria for determining the validity of rules, owing to the simplicity and natural appeal of these measures. A few researchers (Brin et al. 1997, Aumann et al. 1999, Elder, 1999, Tan et al. 2002) have questioned the sufficiency of these criteria. The Chi-squared test, on the other hand, has been widely used in statistics-related fields to test for independence. In this research, we statistically examine the rules derived under the support-confidence framework (Agrawal et al. 1993, Agrawal et al. 1994) by conducting Chi-squared tests. Our experimental results show that a surprising 30% of the rules fulfilling the minsup and minconf criteria are in fact statistically insignificant.
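To make the contrast between the support-confidence framework and a Chi-squared independence test concrete, the following sketch scores a single rule A → B from transaction counts. The function name, its arguments, and the example counts are hypothetical; the threshold 3.841 is the standard critical value for one degree of freedom at the 0.05 level, and the sketch assumes none of the marginal counts is zero.

```python
def rule_stats(n, n_a, n_b, n_ab):
    """Support, confidence and Chi-squared statistic for the rule A -> B.

    n    : total number of transactions
    n_a  : transactions containing A
    n_b  : transactions containing B
    n_ab : transactions containing both A and B
    Assumes 0 < n_a < n and 0 < n_b < n so expected counts are nonzero.
    """
    support = n_ab / n
    confidence = n_ab / n_a

    # 2x2 contingency table: rows are A / not A, columns are B / not B.
    observed = [
        [n_ab, n_a - n_ab],
        [n_b - n_ab, n - n_a - n_b + n_ab],
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_b, n - n_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return support, confidence, chi2


CRITICAL_0_05_DF1 = 3.841  # Chi-squared critical value, 1 d.f., alpha = 0.05

if __name__ == "__main__":
    # Hypothetical counts: B is in 80% of all baskets, with or without A.
    support, confidence, chi2 = rule_stats(n=1000, n_a=500, n_b=800, n_ab=400)
    print(support, confidence, chi2, chi2 > CRITICAL_0_05_DF1)
```

With these made-up counts the rule would pass, say, minsup = 0.3 and minconf = 0.7, yet A and B are exactly independent, which is the kind of rule the abstract describes as statistically insignificant.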


2004 ◽  
Vol 54 (3) ◽  
pp. 871-875 ◽  
Author(s):  
Matthias Wolf ◽  
Tobias Müller ◽  
Thomas Dandekar ◽  
J. Dennis Pollack

The phylogenetic position of the Mollicutes has been re-examined by using phosphoglycerate kinase (Pgk) amino acid sequences. Hitherto unpublished sequences from Mycoplasma mycoides subsp. mycoides, Mycoplasma hyopneumoniae and Spiroplasma citri were included in the analysis. Phylogenetic trees based on Pgk data indicated a monophyletic origin for the Mollicutes within the Firmicutes, whereas Bacilli (Firmicutes) and Clostridia (Firmicutes) appeared to be paraphyletic. With two exceptions, i.e. Thermotoga (Thermotogae) and Fusobacterium (Fusobacteria), which clustered within the Firmicutes, comparative analyses show that at a low taxonomic level, the resolved phylogenetic relationships that were inferred from both the Pgk protein and 16S rRNA gene sequence data are congruent.


Author(s):  
Valentina Pugacheva ◽  
Alexander Korotkov ◽  
Eugene Korotkov

The aim of this study was to show that amino acid sequences have a latent periodicity with insertions and deletions of amino acids in unknown positions of the analyzed sequence. A genetic algorithm, dynamic programming and random weight matrices were used to develop a new mathematical algorithm for latent periodicity search. A multiple alignment of periods was calculated with the help of direct optimization of the position-weight matrix, without using pairwise alignments. The developed algorithm was applied to analyze the amino acid sequences of a small number of proteins. This study showed the presence of latent periodicity with insertions and deletions in the amino acid sequences of proteins for which latent periodicity was not previously known. The origin of latent periodicity with insertions and deletions is discussed.
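A much simpler, indel-free notion of latent periodicity can be sketched by counting which residues fall at each position of a candidate period and measuring how far that count matrix departs from independence. The code below is only such a sketch: it uses mutual information between residue and position-in-period as the score, ignores insertions and deletions entirely, and does not reproduce the genetic algorithm, dynamic programming, or weight-matrix optimization described in the abstract. The function name and toy sequence are assumptions.

```python
import math
from collections import Counter


def period_information(seq, period):
    """Mutual information (bits) between residue and position modulo period.

    A score near 0 means residues are spread evenly over the period
    positions; a larger score means some positions prefer some residues,
    which is one simple reading of "latent periodicity".
    """
    n = len(seq)
    joint = Counter((i % period, ch) for i, ch in enumerate(seq))
    position_totals = Counter(i % period for i in range(n))
    residue_totals = Counter(seq)

    info = 0.0
    for (pos, ch), count in joint.items():
        # count/n is the joint probability; the log argument is the ratio
        # of the joint probability to the product of the marginals.
        ratio = count * n / (position_totals[pos] * residue_totals[ch])
        info += count / n * math.log2(ratio)
    return info


if __name__ == "__main__":
    toy = "ABC" * 4  # toy string, not a real protein sequence
    for p in (2, 3):
        print(p, period_information(toy, p))
```

Scanning a range of candidate periods and comparing each score against scores from shuffled copies of the sequence would be one way to judge significance, though that step is left out here.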


2006 ◽  
Vol 71 (1) ◽  
pp. 18-31 ◽  
Author(s):  
V. P. Turutina ◽  
A. A. Laskin ◽  
N. A. Kudryashov ◽  
K. G. Skryabin ◽  
E. V. Korotkov

2014 ◽  
Vol 63 (11) ◽  
pp. 1411-1418 ◽  
Author(s):  
Bingqing Zhu ◽  
Yaochun Fan ◽  
Zheng Xu ◽  
Li Xu ◽  
Pengcheng Du ◽  
...  

The purpose of the present study was to identify the clonal characteristics and gyrA gene diversity of ciprofloxacin-resistant meningococcal strains in China. One hundred and forty-one ciprofloxacin-resistant and 103 ciprofloxacin-susceptible meningococcal strains were selected for multilocus sequence typing. Of these, 54 ciprofloxacin-resistant and 42 ciprofloxacin-susceptible strains were selected for gyrA gene sequencing. Of the three clonal complexes prevalent in China, serogroup A of ST-5 complex (CC5) and serogroup C/B strains of CC4821 had a high proportion of ciprofloxacin resistance, whereas CC11 serogroup W strains were all susceptible. Nucleotide and amino acid sequences of the gyrA gene among ciprofloxacin-resistant strains showed more diversity than those among ciprofloxacin-susceptible strains. All ciprofloxacin-resistant strains had a T91I mutation and the ciprofloxacin-susceptible strains had no T91I mutation. Phylogenetic analysis showed that the gyrA gene sequences of CC4821 serogroup B/C strains, CC11 serogroup W, CC1 serogroup A, ciprofloxacin-susceptible CC5 serogroup A and reference strains had high similarity. By contrast, the ciprofloxacin-resistant CC5 serogroup A strains had a highly conserved gyrA gene sequence which was different (94.8 % similarity) from that in the above strains. The results of our investigation showed that the high proportion of ciprofloxacin resistance in Neisseria meningitidis is associated with certain sequence types (STs) or clonal complexes (CCs). The prevalence of certain CCs with a high proportion of ciprofloxacin resistance can facilitate the spread of ciprofloxacin resistance.


Plant Disease ◽  
2008 ◽  
Vol 92 (6) ◽  
pp. 975-975 ◽  
Author(s):  
C. A. Baker ◽  
E. N. Rosskopf ◽  
M. S. Irey ◽  
L. Jones ◽  
S. Adkins

Ammi majus (bishop's weed), a member of the Apiaceae, is grown from seed for cut flowers in South Florida. In March 2005, plants were found to be showing virus-like symptoms including mosaic, vein clearing, and leaf rugosity (3) that rendered their flowers unmarketable. Inclusion morphology in epidermal strips from these infected plants indicated the presence of one or more potyviruses. This was confirmed by ELISA with commercially available antiserum for potyvirus identification (Agdia, Elkhart, IN). Clover yellow vein virus (ClYVV) was identified by sequencing and confirmed with specific antiserum (4). However, ClYVV was not identified in all potyvirus-infected samples from 2005, indicating the presence of one or more additional potyviruses. Bidens mottle virus (BiMoV) was subsequently identified in one of three potyvirus-infected samples by immunodiffusion tests using specific antiserum for BiMoV (Department of Plant Pathology, University of Florida), cylindrical inclusion morphology in epidermal strips, host range data, and sequencing of cloned reverse transcription (RT)-PCR products from degenerate potyvirus primers (2). Nucleotide and deduced amino acid sequences of a partial polyprotein gene sequence (GenBank Accession No. EU255631) were 95 and 98% identical, respectively, to a Florida isolate of BiMoV recently reported from tropical soda apple (1). Similar virus-like symptoms were again observed in A. majus in January 2007 and persisted through March. ELISA testing again indicated the presence of a potyvirus. However, neither ClYVV nor BiMoV was identified in the initial 2007 samples. Instead, sequence analysis of the cloned RT-PCR products amplified with degenerate potyvirus primers (2) from seven potyvirus-infected samples collected on two dates in January and one each in February and March revealed the presence of Apium virus Y (ApVY). The 3′ terminal portion of the genome (GenBank Accession No. EU255632) was found to be 90 to 91% identical to ApVY sequences in GenBank at the nucleotide level. Deduced amino acid sequences of the NIb and CP regions of these RT-PCR products were 96 and 95% identical, respectively, to ApVY sequences in GenBank. One of these seven ApVY-infected samples (collected in March 2007) was determined to be coinfected with BiMoV by sequence analysis of the cloned RT-PCR products. Six clones were sequenced. Three were determined to be ApVY as indicated above. Nucleotide and deduced amino acid sequences of a partial polyprotein gene sequence from the other three clones were 95 and 97% identical, respectively, to the 2005 A. majus BiMoV isolate. Although ClYVV and BiMoV have previously been reported in other hosts in Florida, to the best of our knowledge, this is the first report of BiMoV and ApVY in A. majus anywhere and the first report of ApVY in North America. References: (1.) C. A. Baker et al. Plant Dis. 91:905, 2007. (2.) A. Gibbs and A. J. Mackenzie. J. Virol. Methods 63:9, 1997. (3.) M. S. Irey et al. (Abstr.) Phytopathology (suppl.) 95:S46, 2005. (4.) M. S. Irey et al. Plant Dis. 90:380, 2006.

