instance hardness
Recently Published Documents


2021
Author(s): Gustavo H. Nunes, Gustavo O. Martins, Carlos H. Q. Forster, Ana C. Lorena

Curriculum learning consists of training strategies for machine learning techniques in which the easiest observations are presented first, progressing to more difficult cases as training proceeds. Assembling the curriculum requires ordering the observations in a dataset according to their difficulty. This work investigates how instance hardness measures, which assess the difficulty of each observation in a dataset from different perspectives, can be used to assemble a curriculum. Experiments with four CIFAR-100 sub-problems demonstrated the feasibility of using instance hardness measures: the main advantage is in convergence speed, and accuracy gains can also be verified on some datasets.
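The ordering step described in this abstract can be illustrated with a minimal sketch. It assumes k-Disagreeing Neighbors (kDN) as the hardness measure; the abstract does not say which measures the authors used, so this choice, the function names, and the toy data are all illustrative assumptions rather than the paper's method.

```python
import math

def kdn_hardness(X, y, k=3):
    """k-Disagreeing Neighbors: fraction of a point's k nearest
    neighbors that carry a different label (0 = easy, 1 = hard)."""
    scores = []
    for i, xi in enumerate(X):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        disagree = sum(1 for _, j in dists[:k] if y[j] != y[i])
        scores.append(disagree / k)
    return scores

def curriculum_order(X, y, k=3):
    """Indices sorted from easiest to hardest (ties keep input order)."""
    hardness = kdn_hardness(X, y, k)
    return sorted(range(len(X)), key=lambda i: hardness[i])

# Hypothetical toy data: two clusters, plus one class-1 point (index 8)
# placed in the middle of the class-0 cluster.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
     (0.05, 0.05)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

order = curriculum_order(X, y, k=3)
hardness = kdn_hardness(X, y, k=3)
```

A training loop would then present batches drawn in `order`, so the misplaced point at index 8 is seen last. How quickly harder observations are mixed in (the pacing schedule) is a separate design choice that this sketch does not cover.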


2021, Vol 11 (1)
Author(s): Sabit Ahmed, Afrida Rahman, Md. Al Mehedi Hasan, Shamim Ahmad, S. M. Shovan

Identification of post-translational modifications (PTMs) is significant in the study of computational proteomics, cell biology, pathogenesis, and drug development because of their role in many biomolecular mechanisms. Though there are several computational tools to identify individual PTMs, only three predictors have been established to predict multiple PTMs at the same lysine residue. Furthermore, detailed analysis and assessment of dataset balancing and of the significance of different feature encoding techniques for a suitable multi-PTM prediction model are still lacking. This study introduces a computational method named 'iMul-kSite' for predicting acetylation, crotonylation, methylation, succinylation, and glutarylation from an unrecognized peptide sample with one, multiple, or no modifications. After eliminating redundant data samples from the majority class by analyzing the hardness of the sequence-coupling information, the feature representation was optimized by combining the ANOVA F-test with an incremental feature selection approach. The proposed predictor predicts multi-label PTM sites with 92.83% accuracy using the top 100 features. It also achieved a 93.36% aiming rate and a 96.23% coverage rate, which are much better than those of the existing state-of-the-art predictors on the validation test. This performance indicates that 'iMul-kSite' can be used as a supportive tool for further K-PTM study. For the convenience of experimental scientists, 'iMul-kSite' has been deployed as a user-friendly web server at http://103.99.176.239/iMul-kSite.
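The feature-ranking half of the selection step mentioned in this abstract can be sketched as follows. This is a minimal one-way ANOVA F-statistic computed per feature for a single label; how the authors apply it in the multi-label setting is not stated in the abstract, and the function names and toy data here are hypothetical.

```python
import math

def anova_f(values, labels):
    """One-way ANOVA F-statistic of one feature across label groups."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {g: sum(vs) / len(vs) for g, vs in groups.items()}
    # Variation of group means around the grand mean.
    ss_between = sum(len(vs) * (means[g] - grand_mean) ** 2
                     for g, vs in groups.items())
    # Variation of samples around their own group mean.
    ss_within = sum((v - means[g]) ** 2
                    for g, vs in groups.items() for v in vs)
    if ss_within == 0:
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_features(X, y):
    """Feature indices sorted from most to least discriminative."""
    n_features = len(X[0])
    scores = [anova_f([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: -scores[j])

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [(1.0, 0.3), (1.2, 0.9), (0.9, 0.5),
     (3.0, 0.4), (3.1, 0.8), (2.9, 0.6)]
y = [0, 0, 0, 1, 1, 1]

ranking = rank_features(X, y)
```

An incremental feature selection pass would then train and evaluate a model on the top 1, top 2, and so on features from `ranking`, and keep the prefix that scores best. That evaluation loop depends on the classifier and is left out of this sketch.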


Author(s): Naufal Azmi Verdikha, Teguh Bharata Adji, Adhistya Erna Permanasari

A text classification system is needed to address the problem of hate speech on social media. However, hate speech texts are hard to find on social media, which leaves the training data with an imbalanced class distribution. Classification with imbalanced data leads to poor performance. Several methods exist to address this problem; one of them is undersampling with the Instance Hardness Threshold (IHT) method. IHT balances the dataset by eliminating instances that are frequently misclassified. To find those instances, IHT requires an estimator, which is a classifier. This research compares estimators for the IHT method on the imbalanced-data problem in hate speech classification using TF-IDF weighting. The class ratio of the dataset after undersampling, the duration of the undersampling process, and the Index of Balanced Accuracy (IBA) are used to determine the best IHT variant. The results show that IHT with Logistic Regression (IHT(LR)) has the fastest undersampling process (1.91 s), produces a perfectly balanced dataset with a class ratio of 1:1, and achieves the best IBA among the estimators compared. These results make IHT(LR) the best method for the imbalanced-data problem in hate speech classification.
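The idea behind IHT undersampling can be sketched in a few lines. This simplified version assumes a binary problem and swaps in a leave-one-out k-nearest-neighbor vote as the estimator, not the Logistic Regression or other classifiers compared in the paper; the function names and toy data are hypothetical.

```python
import math

def loo_true_class_prob(X, y, k=3):
    """Leave-one-out k-NN estimate of P(true label | x) per sample.
    Low values mark instances the estimator tends to misclassify."""
    probs = []
    for i, xi in enumerate(X):
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        agree = sum(1 for _, j in dists[:k] if y[j] == y[i])
        probs.append(agree / k)
    return probs

def iht_undersample(X, y, majority, k=3):
    """Drop the hardest majority-class samples until the ratio is 1:1."""
    probs = loo_true_class_prob(X, y, k)
    maj = [i for i, label in enumerate(y) if label == majority]
    rest = [i for i, label in enumerate(y) if label != majority]
    n_keep = min(len(maj), len(rest))
    # Keep the majority samples the estimator is most confident about.
    keep_maj = sorted(maj, key=lambda i: -probs[i])[:n_keep]
    kept = sorted(keep_maj + rest)
    return [X[i] for i in kept], [y[i] for i in kept]

# Hypothetical toy data: six majority (0) and four minority (1) samples;
# two majority samples sit inside the minority cluster.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (5.05, 5.05), (5.0, 4.9),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

X_bal, y_bal = iht_undersample(X, y, majority=0)
```

In this toy run the two majority samples that overlap the minority cluster are the ones removed, leaving four samples per class. Production implementations typically use cross-validated class probabilities from a real classifier, which is where the choice of estimator studied in the paper comes in.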


Author(s): Felipe N. Walmsley, George D. C. Cavalcanti, Dayvid V. R. Oliveira, Rafael M. O. Cruz, Robert Sabourin
