Adapting Supervised Classification Algorithms to Arbitrary Weak Label Scenarios

Author(s):  
Miquel Perelló-Nieto ◽  
Raúl Santos-Rodríguez ◽  
Jesús Cid-Sueiro
2021 ◽  
Author(s):  
Jorge Cabrera Alvargonzalez ◽  
Ana Larranaga Janeiro ◽  
Sonia Perez ◽  
Javier Martinez Torres ◽  
Lucia Martinez Lamas ◽  
...  

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been and remains one of the major challenges humanity has faced thus far. Over the past few months, large amounts of information have been collected that are only now beginning to be assimilated. The present work investigates whether residual information exists in the large number of rRT-PCR tests that returned positive among the almost half a million tests performed during the pandemic. This residual information is believed to be closely related to a pattern in the number of cycles needed to detect a sample as positive. A database of more than 20,000 positive samples was therefore collected, and two supervised classification algorithms (a support vector machine and a neural network) were trained to temporally locate each sample based solely on the number of cycles determined in the rRT-PCR of each individual. Finally, the classification results show that the appearance of each wave coincides with the surge of each of the variants present in the region of Galicia (Spain) during the SARS-CoV-2 pandemic, and that these waves are clearly identified by the classification algorithm.


Author(s):  
Tobias Scheffer

For many classification problems, unlabeled training data are inexpensive and readily available, whereas labeling training data imposes costs. Semi-supervised classification algorithms aim at utilizing information contained in unlabeled data in addition to the (few) labeled data.
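
One common way to use unlabeled data alongside a few labeled examples is self-training. The sketch below is only an illustration of that general idea, not the method of the entry above: it assumes a nearest-centroid base classifier, and the function names and toy points are hypothetical.

```python
from statistics import mean


def centroids(points, labels):
    """Return {label: centroid} for the labeled points."""
    result = {}
    for lab in set(labels):
        members = [p for p, l in zip(points, labels) if l == lab]
        result[lab] = tuple(mean(coord) for coord in zip(*members))
    return result


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(cents, point):
    """Label of the centroid closest to the point."""
    return min(cents, key=lambda lab: sq_dist(cents[lab], point))


def self_train(labeled, labels, unlabeled):
    """Self-training: repeatedly pseudo-label the most confident unlabeled point.

    Confidence is assumed to be "closest to any current centroid".
    """
    points, labs = list(labeled), list(labels)
    pool = list(unlabeled)
    while pool:
        cents = centroids(points, labs)
        best = min(pool, key=lambda p: min(sq_dist(c, p) for c in cents.values()))
        points.append(best)
        labs.append(predict(cents, best))  # pseudo-label from the current model
        pool.remove(best)
    return centroids(points, labs)


# Hypothetical toy data: two labeled points and four unlabeled ones.
model = self_train(
    labeled=[(0, 0), (10, 10)],
    labels=["a", "b"],
    unlabeled=[(1, 1), (2, 1), (9, 9), (8, 9)],
)
```

Pseudo-labels are added one at a time so that each new label is assigned by a model that already reflects the previous ones; real systems usually add batches and apply a confidence threshold.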


2020 ◽  
Vol 37 (4) ◽  
pp. 723-739 ◽  
Author(s):  
Anton Sokolov ◽  
Egor Dmitriev ◽  
Cyril Gengembre ◽  
Hervé Delbarre

The problem considered is the classification of atmospheric meteorological events, such as sea breezes, fogs, and high winds, in coastal areas. In situ meteorological measurements of wind, temperature, humidity, pressure, radiance, and turbulence are used as predictors. Local atmospheric events of 2013–14 were analyzed and classified manually using data from a measurement campaign in the coastal area of the English Channel in Dunkirk, France. The results of that categorization allowed several supervised classification algorithms to be trained using the data of an ultrasonic anemometer as predictors. The comparison covered the K-nearest-neighbors classifier, the support vector machine, and two Bayesian classifiers (quadratic discriminant analysis and the Parzen–Rosenblatt window). The analysis showed that the K-nearest-neighbors and quadratic discriminant analysis classifiers achieve the best classification accuracy (up to 80% correctly classified meteorological events). The latter classifier is faster to compute and is less sensitive to unbalanced data and to overtraining. The most informative atmospheric parameters for event recognition were identified for each algorithm. The results show that supervised classification algorithms can help automate the processing and analysis of local meteorological measurements.
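
As an illustration of one of the classifiers compared above, here is a minimal K-nearest-neighbors sketch. The two features (wind speed and temperature), the labels, and every number are made up for the example and are not taken from the Dunkirk campaign; `knn_predict` and `accuracy` are hypothetical helper names.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test points whose predicted label matches the true one."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


# Made-up measurements: (wind speed in m/s, temperature in deg C).
train_x = [(5.0, 18.0), (6.0, 19.0), (5.5, 17.5), (1.0, 8.0), (0.5, 9.0), (1.5, 7.5)]
train_y = ["breeze", "breeze", "breeze", "fog", "fog", "fog"]
test_x = [(5.2, 18.5), (1.2, 8.2)]
test_y = ["breeze", "fog"]

score = accuracy(train_x, train_y, test_x, test_y)
```

With real measurements the features would normally be rescaled first, since a plain Euclidean distance lets the quantity with the largest numeric range dominate the vote.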


2018 ◽  
Vol 180 (13) ◽  
pp. 42-48
Author(s):  
Michael Sam ◽  
Mark Herol ◽  
Chrizel Marie ◽  
Roldan B. ◽  
Romulo L.

Author(s):  
Rogers Prates De Pelle ◽  
Viviane P. Moreira

Brazilian Web users are among the most active in social networks and very keen on interacting with others. Offensive comments, known as hate speech, have been plaguing online media and giving rise to a number of lawsuits against companies that publish Web content. Given the massive volume of user-generated text published on a daily basis, manually filtering offensive comments becomes infeasible. The identification of offensive comments can be treated as a supervised classification task. To obtain a model that classifies comments, an annotated dataset containing positive and negative examples is necessary. The lack of such a dataset in Portuguese limits the development of detection approaches for this language. In this paper, we describe how we created annotated datasets of offensive comments for Portuguese by collecting news comments on the Brazilian Web. In addition, we provide classification results achieved by standard classification algorithms on these datasets, which can serve as a baseline for future work on this topic.
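
To make the supervised framing concrete, the sketch below trains a small multinomial naive Bayes classifier on annotated comments. It is an assumption-laden illustration: the abstract does not say which standard algorithms were used, the four English comments and their labels are invented, and `train_nb` and `predict_nb` are hypothetical names.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for doc, lab in zip(docs, labels):
        word_counts[lab].update(doc.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for lab in class_counts:
        total_words = sum(word_counts[lab].values())
        score = math.log(class_counts[lab] / total_docs)
        for word in doc.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[lab][word] + 1) / (total_words + len(vocab)))
        scores[lab] = score
    return max(scores, key=scores.get)


# Invented annotated examples (positive = offensive, negative = clean).
docs = [
    "you are an idiot",
    "what a stupid idiot",
    "great article thanks",
    "thanks for the nice article",
]
labels = ["offensive", "offensive", "clean", "clean"]
model = train_nb(docs, labels)
```

For example, `predict_nb(model, "stupid idiot")` yields `"offensive"` on this toy set, while `predict_nb(model, "nice article")` yields `"clean"`.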

