Applying Gaussian Distribution-Dependent Criteria to Decision Trees for High-Dimensional Microarray Data

Author(s):  
Raymond Wan ◽  
Ichigaku Takigawa ◽  
Hiroshi Mamitsuka

Author(s):  
Miguel Garcia-Torres ◽  
Francisco Gomez-Vela ◽  
David Becerra-Alonso ◽  
Belen Melian-Batista ◽  
J. Marcos Moreno-Vega

Author(s):  
Zhenqiu Liu ◽  
Feng Jiang ◽  
Guoliang Tian ◽  
Suna Wang ◽  
Fumiaki Sato ◽  
...  

In this paper, we propose a novel method for sparse logistic regression with non-convex Lp regularization (p < 1). Based on a smooth approximation, we develop several fast algorithms for learning a classifier that is applicable to high-dimensional datasets such as gene expression data. To the best of our knowledge, these are the first algorithms to perform sparse logistic regression with an Lp and elastic net (Le) penalty. The regularization parameters are chosen by maximizing the area under the ROC curve (AUC) on the test data. Experimental results on methylation and microarray data attest to the accuracy, sparsity, and efficiency of the proposed algorithms. Biomarkers identified with our methods are compared with those in the literature. Our computational results show that Lp logistic regression (p < 1) outperforms L1 logistic regression and SCAD SVM. Software is available upon request from the first author.
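The idea of replacing a non-smooth Lp penalty with a smooth surrogate can be illustrated with a minimal sketch. This is not the authors' algorithm: the surrogate (w_j^2 + eps)^(p/2), the plain gradient-descent loop, the toy data, and every function name and parameter value below are assumptions chosen only for illustration.

```python
import math
import random


def smoothed_lp(w, p=0.5, eps=1e-3):
    # Assumed smooth surrogate for sum_j |w_j|^p: (w_j^2 + eps)^(p/2).
    return sum((wj * wj + eps) ** (p / 2.0) for wj in w)


def objective(w, X, y, lam, p=0.5, eps=1e-3):
    # Mean logistic loss (labels in {0, 1}) plus the smoothed Lp penalty.
    loss = 0.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        # Stable form of log(1 + exp(z)) - y * z.
        loss += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
    return loss / len(X) + lam * smoothed_lp(w, p, eps)


def gradient(w, X, y, lam, p=0.5, eps=1e-3):
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        # Numerically stable sigmoid.
        if z >= 0:
            s = 1.0 / (1.0 + math.exp(-z))
        else:
            ez = math.exp(z)
            s = ez / (1.0 + ez)
        for j, xj in enumerate(xi):
            grad[j] += (s - yi) * xj / len(X)
    for j, wj in enumerate(w):
        # Derivative of (w^2 + eps)^(p/2) is p * w * (w^2 + eps)^(p/2 - 1).
        grad[j] += lam * p * wj * (wj * wj + eps) ** (p / 2.0 - 1.0)
    return grad


def fit(X, y, lam=0.01, p=0.5, eps=1e-3, lr=0.1, iters=300):
    # Fixed-step gradient descent starting from the zero vector.
    w = [0.0] * len(X[0])
    for _ in range(iters):
        g = gradient(w, X, y, lam, p, eps)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w


def make_toy_data(n, d, seed=0):
    # Hypothetical data: only feature 0 determines the label.
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1 if xi[0] > 0 else 0 for xi in X]
    return X, y


X, y = make_toy_data(200, 5, seed=0)
w = fit(X, y)
```

On this toy problem the penalty should shrink the weights of the four irrelevant features toward zero while leaving the informative one large. Because the penalty is non-convex for p < 1, gradient descent only finds a local minimum, and the step size has to stay small relative to the curvature the surrogate has near zero.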


2021 ◽  
Vol 8 (2) ◽  
pp. 257-272
Author(s):  
Yunai Yi ◽  
Diya Sun ◽  
Peixin Li ◽  
Tae-Kyun Kim ◽  
Tianmin Xu ◽  
...  

This paper presents an unsupervised clustering random-forest-based metric for affinity estimation in large and high-dimensional data. The criterion used for node splitting during forest construction can handle rank-deficiency when measuring cluster compactness. The binary forest-based metric is extended to continuous metrics by exploiting both the common traversal path and the smallest shared parent node. The proposed forest-based metric efficiently estimates affinity by passing down data pairs in the forest using a limited number of decision trees. A pseudo-leaf-splitting (PLS) algorithm is introduced to account for spatial relationships, which regularizes affinity measures and overcomes inconsistent leaf assignments. The random-forest-based metric with PLS facilitates the establishment of consistent and point-wise correspondences. The proposed method has been applied to automatic phrase recognition using color and depth videos and point-wise correspondence. Extensive experiments demonstrate the effectiveness of the proposed method in affinity estimation in comparison with state-of-the-art methods.
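The difference between a binary and a continuous forest-based affinity can be illustrated with a minimal sketch. It does not reproduce the paper's method: the trees here use purely random splits instead of the compactness criterion, the continuous variant is one assumed reading of "common traversal path" (shared path prefix divided by the longer path), and PLS is not implemented. All names and parameter values are hypothetical.

```python
import random


def build_tree(points, depth, max_depth, rng, min_size=2):
    # A leaf is represented by None; an inner node by (feature, threshold, left, right).
    if depth >= max_depth or len(points) <= min_size:
        return None
    f = rng.randrange(len(points[0]))
    lo = min(p[f] for p in points)
    hi = max(p[f] for p in points)
    if lo == hi:
        return None
    t = rng.uniform(lo, hi)  # random split, an assumption of this sketch
    left = [p for p in points if p[f] <= t]
    right = [p for p in points if p[f] > t]
    if not left or not right:
        return None
    return (f, t,
            build_tree(left, depth + 1, max_depth, rng, min_size),
            build_tree(right, depth + 1, max_depth, rng, min_size))


def build_forest(points, n_trees=50, max_depth=5, seed=0):
    rng = random.Random(seed)
    return [build_tree(points, 0, max_depth, rng) for _ in range(n_trees)]


def leaf_path(tree, x):
    # Sequence of left (0) / right (1) decisions taken by x down to its leaf.
    path = []
    node = tree
    while node is not None:
        f, t, left, right = node
        if x[f] <= t:
            path.append(0)
            node = left
        else:
            path.append(1)
            node = right
    return tuple(path)


def binary_affinity(forest, a, b):
    # Fraction of trees in which a and b end in the same leaf.
    same = sum(leaf_path(t, a) == leaf_path(t, b) for t in forest)
    return same / len(forest)


def path_affinity(forest, a, b):
    # Continuous variant: shared path prefix relative to the longer path.
    total = 0.0
    for tree in forest:
        pa, pb = leaf_path(tree, a), leaf_path(tree, b)
        common = 0
        for u, v in zip(pa, pb):
            if u != v:
                break
            common += 1
        longest = max(len(pa), len(pb))
        total += 1.0 if longest == 0 else common / longest
    return total / len(forest)


def make_two_clusters(n_per_cluster=30, seed=1):
    # Hypothetical 2-D data: one cluster near (0, 0), one near (5, 5).
    rng = random.Random(seed)
    pts = []
    for cx in (0.0, 5.0):
        for _ in range(n_per_cluster):
            pts.append((cx + rng.uniform(-0.3, 0.3), cx + rng.uniform(-0.3, 0.3)))
    return pts


forest = build_forest(make_two_clusters())
```

In this sketch the path-based score is never lower than the binary one, since a pair that shares a leaf scores 1 in both and a pair that separates late still earns partial credit for the path it shares.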

