component classifier
Recently Published Documents

TOTAL DOCUMENTS: 20 (FIVE YEARS: 2)
H-INDEX: 4 (FIVE YEARS: 1)

2021 ◽ Vol 12
Author(s): Samuel Anyaso-Samuel ◽ Archie Sachdeva ◽ Subharup Guha ◽ Somnath Datta

Microbiome samples harvested from urban environments can be informative in predicting the geographic location of unknown samples. The idea that different cities may have geographically disparate microbial signatures can be used to predict a sample's source city from city-specific microbiome data. We implemented this idea by first applying standard bioinformatics procedures to pre-process the raw metagenomics samples provided by the CAMDA organizers. We then trained several component classifiers, as well as a robust ensemble classifier, on data generated from both taxonomy-dependent and taxonomy-free approaches. We also applied class weighting and an optimal oversampling technique to overcome the class imbalance in the primary data. In each instance, the component classifiers performed differently, whereas the ensemble classifier consistently yielded optimal performance. Finally, we predicted the source cities of the mystery samples provided by the organizers. Our results highlight the unreliability of relying on a single classification algorithm to assign metagenomic samples to their source origins. By combining several component classifiers via the ensemble approach, we obtained classification results that were as good as those of the best-performing component classifier.
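The abstract does not specify how the ensemble combines its component classifiers; as a minimal illustration, a (optionally weighted) majority vote over per-sample predictions can be sketched as below. The city labels and classifier outputs are hypothetical, not data from the study.

```python
from collections import Counter

def ensemble_predict(component_predictions, weights=None):
    """Combine per-sample predictions from several component
    classifiers by (optionally weighted) majority vote."""
    if weights is None:
        weights = [1.0] * len(component_predictions)
    n_samples = len(component_predictions[0])
    combined = []
    for i in range(n_samples):
        votes = Counter()  # accumulate (possibly weighted) votes per class
        for preds, w in zip(component_predictions, weights):
            votes[preds[i]] += w
        combined.append(votes.most_common(1)[0][0])
    return combined

# Three hypothetical component classifiers predicting source cities
preds_a = ["NYC", "Tokyo", "Berlin"]
preds_b = ["NYC", "Berlin", "Berlin"]
preds_c = ["Tokyo", "Tokyo", "Berlin"]

print(ensemble_predict([preds_a, preds_b, preds_c]))
# -> ['NYC', 'Tokyo', 'Berlin']
```

Weights could, for example, reflect each component classifier's validation accuracy, so that stronger classifiers dominate close votes.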


NeuroImage ◽ 2019 ◽ Vol 198 ◽ pp. 181-197
Author(s): Luca Pion-Tonachini ◽ Ken Kreutz-Delgado ◽ Scott Makeig

Author(s): Qianying Wang ◽ Ming Lu ◽ Junhong Li

Semi-supervised boosting aims to improve the performance of a given classifier with a multitude of unlabeled data. In semi-supervised boosting, a similarity measure is needed to select unlabeled samples, which are then assigned pseudo labels; a good similarity measure helps assign more appropriate pseudo labels. The selected samples, with their pseudo labels, then serve as labeled samples to train each new component classifier, so the similarity measure is central to semi-supervised boosting. The Gaussian kernel similarity exp(-‖x_i − x_j‖² / (2σ²)) is commonly used, but it has two drawbacks: first, the Euclidean distance ‖x_i − x_j‖ cannot characterize complicated relationships between data samples; second, the parameter σ needs to be set carefully. This paper therefore proposes a novel adaptive similarity based on sparse representation for semi-supervised boosting. The sparse representation is learned from a "clean" dictionary, a low-rank matrix obtained from the sample matrix. We evaluate the proposed method on the COIL-20 database. Experimental results show that the semi-supervised boosting algorithm with sparse-representation similarity outperforms the algorithm with Gaussian kernel similarity.
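The σ-sensitivity drawback is easy to demonstrate: the same pair of points can look nearly dissimilar or nearly identical under the Gaussian kernel depending on σ. A minimal sketch (using the standard kernel form exp(-‖x−y‖²/(2σ²)); the paper's exact parameterization may differ):

```python
import math

def gaussian_similarity(x, y, sigma):
    """Gaussian kernel similarity exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))  # squared Euclidean distance
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

x, y = [0.0, 0.0], [3.0, 4.0]  # Euclidean distance 5
print(gaussian_similarity(x, y, sigma=1.0))   # ~3.7e-06: nearly dissimilar
print(gaussian_similarity(x, y, sigma=10.0))  # ~0.88: nearly identical
```

With a poorly chosen σ, the similarity either saturates near 0 or near 1 for all pairs, making the unlabeled-sample selection in the boosting loop uninformative; this is the motivation for an adaptive, data-driven similarity.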


2017 ◽ Vol 10 (26) ◽ pp. 1-9
Author(s): S. Janarthanam ◽ S. Sukumaran ◽ M. Shanthakumar ◽ ...

2015 ◽ Vol 2015 ◽ pp. 1-10
Author(s): Yanhuang Jiang ◽ Qiangli Zhao ◽ Yutong Lu

Combining several classifiers trained on sequential chunks of training instances is a popular strategy for data stream mining with concept drift. This paper introduces human recalling and forgetting mechanisms into a data stream mining system and proposes a Memorizing-Based Data Stream Mining (MDSM) model. In this model, each component classifier is regarded as a piece of knowledge that a human acquires by studying some material, and it carries a memory retention value reflecting its historical usefulness. Classifiers with high memory retention values are kept in a "knowledge repository." When a new data chunk arrives, the most useful classifiers are selected (recalled) from the repository to compose the current target ensemble. Based on MDSM, we put forward a new algorithm, MAE (Memorizing-Based Adaptive Ensemble), which uses the Ebbinghaus forgetting curve as its forgetting mechanism and ensemble pruning as its recalling mechanism. Experiments comparing MAE with four popular data stream mining approaches on datasets with different concept drifts show that MAE achieves high and stable predictive accuracy, especially for applications with recurring or complex concept drifts. The results also demonstrate the effectiveness of the MDSM model.
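The retain-and-recall cycle above can be sketched with the Ebbinghaus curve R = exp(-t/S), where t is the age of a classifier and S its memory strength. This is a simplified illustration with hypothetical classifier names and parameters; the paper's actual recalling mechanism is ensemble pruning, not a plain top-k cut as shown here.

```python
import math

def retention(age, strength):
    """Ebbinghaus forgetting curve: R = exp(-t / S)."""
    return math.exp(-age / strength)

def recall(repository, k, now):
    """Select (recall) the k classifiers with the highest memory retention."""
    scored = [(retention(now - born, strength), name)
              for name, born, strength in repository]
    scored.sort(reverse=True)  # highest retention first
    return [name for _, name in scored[:k]]

# Each entry: (name, chunk index when trained, memory strength)
repository = [("clf_old", 0, 2.0), ("clf_mid", 5, 2.0), ("clf_new", 9, 2.0)]
print(recall(repository, k=2, now=10))  # -> ['clf_new', 'clf_mid']
```

Raising a classifier's strength S whenever it proves useful on a new chunk would slow its decay, mirroring the "memory retention reflects historical usefulness" idea; a recurring concept can then bring an old, still-retained classifier back into the target ensemble.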

