iterative scaling
Recently Published Documents


TOTAL DOCUMENTS: 31 (FIVE YEARS: 1)

H-INDEX: 6 (FIVE YEARS: 0)

2015 ◽  
Vol 42 (3) ◽  
pp. 832-847 ◽  
Author(s):  
Anna Klimova ◽  
Tamás Rudas


2013 ◽  
Vol 215 (1) ◽  
pp. 15-23 ◽  
Author(s):  
Erik Aas


Data Mining ◽  
2013 ◽  
pp. 1019-1042
Author(s):  
Pratibha Rani ◽  
Vikram Pudi

The rapid progress of computational biology, biotechnology, and bioinformatics over the last two decades has led to the accumulation of tremendous amounts of biological data that demand in-depth analysis. Data mining methods have been applied successfully to analyze these data. An important problem in biological data analysis is to classify a newly discovered sequence, such as a protein or DNA sequence, based on its important features and functions, using the collection of available sequences. In this chapter, we study this problem and present two Bayesian classifiers: RBNBC (Rani & Pudi, 2008a) and REBMEC (Rani & Pudi, 2008c). The algorithms used in these classifiers incorporate repeated occurrences of subsequences within each sequence (Rani, 2008). Specifically, the Repeat Based Naive Bayes Classifier (RBNBC) uses a novel formulation of Naive Bayes, and the second classifier, the Repeat Based Maximum Entropy Classifier (REBMEC), uses a novel framework based on the classical Generalized Iterative Scaling (GIS) algorithm.
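
For readers unfamiliar with the classical GIS algorithm named in this abstract, the following is a minimal sketch of textbook GIS for a maximum entropy model over a small finite outcome space. It is not the REBMEC framework itself. The function names (`model_probs`, `expectations`, `gis`), the toy outcomes, and the slack feature are all assumptions made for illustration.

```python
import math

def model_probs(features, lam):
    # p(x) proportional to exp(sum_i lam[i] * f_i(x)) over a finite outcome set.
    scores = [math.exp(sum(l * f for l, f in zip(lam, row))) for row in features]
    z = sum(scores)
    return [s / z for s in scores]

def expectations(features, probs):
    # Expected value of each feature under the given distribution.
    n_feats = len(features[0])
    return [sum(p * row[i] for p, row in zip(probs, features)) for i in range(n_feats)]

def gis(features, empirical, iterations=200):
    # GIS requires non-negative features whose values sum to the same
    # constant C for every outcome (a slack feature can enforce this).
    C = sum(features[0])
    if any(abs(sum(row) - C) > 1e-12 for row in features):
        raise ValueError("feature rows must all sum to the same constant")
    target = expectations(features, empirical)
    lam = [0.0] * len(target)
    for _ in range(iterations):
        expected = expectations(features, model_probs(features, lam))
        # Scale each weight by the log-ratio of observed to modelled expectation.
        lam = [l + math.log(t / e) / C for l, t, e in zip(lam, target, expected)]
    return lam, model_probs(features, lam)

# Hypothetical toy problem: two binary features plus a slack feature so C = 2.
outcomes = [(0, 0), (0, 1), (1, 0), (1, 1)]
features = [[a, b, 2 - a - b] for a, b in outcomes]
empirical = [0.1, 0.2, 0.3, 0.4]
lam, fitted = gis(features, empirical)
print([round(p, 4) for p in fitted])
```

In this toy setup the fitted distribution matches the empirical feature means (0.7 and 0.6) while treating the two features as independent, so it should approach roughly 0.12, 0.18, 0.28, 0.42.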



Author(s):  
HEE-DEOK YANG ◽  
HEUNG-IL SUK ◽  
SEONG-WHAN LEE

In this paper, a convergent method based on Generalized Iterative Scaling (GIS) with staggered Aitken acceleration is proposed to estimate the parameters for an on-line Conditional Random Field (CRF). The staggered Aitken acceleration method, which alternates between the acceleration and non-acceleration steps, ensures computational simplicity when analyzing incomplete data. The proposed method has the following advantages: (1) It can approximate parameters close to the empirical optimum in a single pass through the training examples; (2) It can reduce the computing time by approximating the Jacobian matrix of the mapping function and estimating the relation between the Jacobian and Hessian in order to replace the inverse of the objective function's Hessian matrix. We show the convergence of the penalized GIS based on the staggered Aitken acceleration method, compare its speed of convergence with those of other stochastic optimization methods, and illustrate experimental results with two public datasets.
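
The idea of alternating accelerated and non-accelerated steps can be illustrated on a scalar fixed-point iteration using Aitken's delta-squared formula. This is only a sketch of the general technique under simplifying assumptions, not the authors' penalized GIS for CRFs, which works with a multivariate mapping and an approximated Jacobian. The names `aitken_step` and `staggered_fixed_point`, and the choice of `math.cos` as the mapping, are hypothetical.

```python
import math

def aitken_step(g, x0):
    # Aitken's delta-squared extrapolation from three successive iterates.
    x1 = g(x0)
    x2 = g(x1)
    denom = x2 - 2.0 * x1 + x0
    if abs(denom) < 1e-15:
        # Iterates have effectively converged; skip the extrapolation.
        return x2
    return x0 - (x1 - x0) ** 2 / denom

def staggered_fixed_point(g, x, steps=20):
    # Alternate plain fixed-point steps (even k) with accelerated ones (odd k).
    for k in range(steps):
        x = g(x) if k % 2 == 0 else aitken_step(g, x)
    return x

# Illustrative mapping: the fixed point of cos(x) = x.
root = staggered_fixed_point(math.cos, 1.0)
print(abs(math.cos(root) - root) < 1e-12)
```

The plain steps keep the iteration anchored to the original mapping, while the accelerated steps extrapolate toward the limit; in the scalar case this typically needs far fewer evaluations of `g` than the unaccelerated iteration.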





