statistical classifiers
Recently Published Documents


TOTAL DOCUMENTS

101
(FIVE YEARS 17)

H-INDEX

19
(FIVE YEARS 2)

Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 483
Author(s):  
Tomasz Czarnecki ◽  
Kacper Bloch

The subject of this work is the analysis of methods for detecting soiling of photovoltaic panels. Environmental and weather conditions affect the efficiency of renewable energy sources. Accumulation of soil, dust, and dirt on the surface of solar panels reduces the power they generate. This paper presents several variants of an algorithm that uses various statistical classifiers to classify photovoltaic panels in terms of soiling. The base material was high-resolution photos and videos of solar panels and sets dedicated to solar farms. The classifiers were tested and their effectiveness in detecting soiling was analyzed. Based on the study results, a group of optimal classifiers was defined, and the classifier that gives the best results for the given problem was selected. The results obtained in this study showed experimentally that the proposed solution provides a high rate of correct detections. The proposed innovative method is cheap and straightforward to implement, and can be used in most photovoltaic installations.


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7718
Author(s):  
Olaf Bar ◽  
Łukasz Bibrzycki ◽  
Michał Niedźwiecki ◽  
Marcin Piekarczyk ◽  
Krzysztof Rzecki ◽  
...  

Reliable tools for artefact rejection and signal classification are a must for cosmic ray detection experiments based on CMOS technology. In this paper, we analyse the fitness of several feature-based statistical classifiers for the classification of particle candidate hits into four categories: spots, tracks, worms and artefacts. We use Zernike moments of the image function as feature carriers and propose a preprocessing and denoising scheme to make the feature extraction more efficient. As opposed to convolutional neural network classifiers, the feature-based classifiers allow for establishing a connection between features and geometrical properties of candidate hits. Apart from basic classifiers, we also consider their ensemble extensions and find that these extensions generally perform better than the basic versions, with an average recognition accuracy of 88%.
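The idea of an ensemble extension can be illustrated with majority voting, which is one common ensemble scheme. The abstract does not say which ensemble methods were used, so this is only a minimal sketch under that assumption. The three rule-based classifiers, the feature names (`elongation`, `curvature`, `area`), and the thresholds are hypothetical stand-ins, not the paper's Zernike-moment features.

```python
from collections import Counter
from typing import Callable

Hit = dict[str, float]
Classifier = Callable[[Hit], str]


def majority_vote(predictions: list[str]) -> str:
    """Return the most common label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers: list[Classifier], hit: Hit) -> str:
    """Ask every base classifier for a label and combine them by majority vote."""
    return majority_vote([clf(hit) for clf in classifiers])


# Hypothetical base classifiers: each looks at a single made-up feature.
def by_elongation(hit: Hit) -> str:
    return "track" if hit["elongation"] > 3.0 else "spot"


def by_curvature(hit: Hit) -> str:
    return "worm" if hit["curvature"] > 0.5 else "track"


def by_area(hit: Hit) -> str:
    return "spot" if hit["area"] < 10 else "track"


# A made-up candidate hit: long, nearly straight, fairly large.
hit = {"elongation": 5.0, "curvature": 0.1, "area": 40.0}
label = ensemble_predict([by_elongation, by_curvature, by_area], hit)
print(label)
```

Because the vote only needs labels, the base classifiers can be of any kind, which is one reason voting is a frequent first choice when extending simple classifiers.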


Author(s):  
Surangika Ranathunga ◽  
Isuru Udara Liyanage

Sinhala is a low-resource language, for which basic language and linguistic tools have not been properly defined. This affects the development of NLP-based end-user applications for Sinhala. Thus, when implementing NLP tools such as sentiment analyzers, we have to rely only on language-independent techniques. This article presents the use of such language-independent techniques in implementing a sentiment analysis system for Sinhala news comments. We demonstrate that for low-resource languages such as Sinhala, the use of recently introduced word embedding models as semantic features can compensate for the lack of well-developed language-specific linguistic or language resources, and text classification with acceptable accuracy is indeed possible using both traditional statistical classifiers and Deep Learning models. The developed classification models, a corpus of 8.9 million tokens extracted from Sinhala news articles and user comments, and Sinhala Word2Vec and fastText word embedding models are now available for public use; 9,048 news comments annotated with POSITIVE/NEGATIVE/NEUTRAL polarities have also been released.


2021 ◽  
Author(s):  
Kyana Y Garza ◽  
Alex Ap Rosini Silva ◽  
Jonas R Rosa ◽  
Michael F Keating ◽  
Sydney C Povilaitis ◽  
...  

The outbreak of COVID-19 has created an unprecedented global crisis. While PCR is the gold standard method for detecting active SARS-CoV-2 infection, alternative high-throughput diagnostic tests are of significant value to meet universal testing demands. Here, we describe a new design of the MasSpec Pen technology integrated with electrospray ionization (ESI) for direct analysis of clinical swabs and investigate its use for COVID-19 screening. The redesigned MasSpec Pen system incorporates a disposable sampling device refined for uniform and efficient analysis of swab tips via liquid extraction directly coupled to an ESI source. Using this system, we analyzed nasopharyngeal swabs from 244 individuals including symptomatic COVID-19 positive, symptomatic negative, and asymptomatic negative individuals, enabling rapid detection of rich lipid profiles. Two statistical classifiers were generated based on the lipid information acquired. Classifier 1 was built to distinguish symptomatic PCR-positive from asymptomatic PCR-negative individuals, yielding cross-validation accuracy of 83.5%, sensitivity of 76.6%, and specificity of 86.6%, and validation set accuracy of 89.6%, sensitivity of 100%, and specificity of 85.3%. Classifier 2 was built to distinguish symptomatic PCR-positive patients from negative individuals including symptomatic PCR-negative patients with moderate to severe symptoms and asymptomatic individuals, yielding a cross-validation accuracy of 78.4%, specificity of 77.21%, and sensitivity of 81.8%. Collectively, this study suggests that the lipid profiles detected directly from nasopharyngeal swabs using MasSpec Pen-ESI MS allow fast (under a minute) screening of COVID-19 disease using minimal operating steps and no specialized reagents, thus representing a promising alternative high-throughput method for screening of COVID-19.
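The accuracy, sensitivity, and specificity figures quoted above follow the standard confusion-matrix definitions. A minimal sketch of those definitions is given here; the function name `screening_metrics` and the counts are hypothetical and are not taken from the study.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all calls that were correct
        "sensitivity": tp / (tp + fn),   # share of true positives that were detected
        "specificity": tn / (tn + fp),   # share of true negatives that were cleared
    }


# Hypothetical counts, chosen only for illustration (not data from the study).
example = screening_metrics(tp=8, fp=2, tn=18, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are reported separately because a screening test can trade one against the other, and accuracy alone hides that trade-off when the classes are imbalanced.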


Author(s):  
Н.Н. Беляев ◽  
О.А. Бебенина ◽  
В.Е. Бородкина

The article presents an approach to developing an algorithm for recognizing text data within digital graphic images in JPEG format. It considers the hypothesis that text content in JPEG images influences the distribution of the discrete cosine transform coefficient values in the frequency domain of those images. Statistical classifier models that solve the problem of recognizing text data in JPEG images based on analysis of their frequency domain have been determined. A recognition algorithm is proposed that implements the following procedures: training of the selected classifiers and recognition of text data, taking into account the statistical characteristics of the distribution of frequency-domain coefficients in JPEG images.
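JPEG stores each 8x8 block of an image as discrete cosine transform coefficients, so statistics of those coefficients can serve as classifier features. The sketch below computes the orthonormal 2-D DCT-II of one block in plain Python and derives one simple statistic from it. The function names and the choice of statistic (mean absolute AC coefficient) are assumptions made for illustration; the abstract does not state which statistics the authors use, and the sketch omits JPEG's level shift and quantization steps.

```python
import math

Block = list[list[float]]


def dct2_8x8(block: Block) -> Block:
    """Orthonormal 2-D DCT-II of an 8x8 block (the transform JPEG applies per block)."""
    def scale(u: int) -> float:
        return math.sqrt(0.5) if u == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            acc = 0.0
            for x in range(8):
                for y in range(8):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / 16)
                            * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * scale(u) * scale(v) * acc
    return out


def mean_abs_ac(coeffs: Block) -> float:
    """Mean absolute value of the 63 AC coefficients (everything except [0][0])."""
    total = sum(abs(c) for row in coeffs for c in row) - abs(coeffs[0][0])
    return total / 63


# A flat block has no detail, so all of its energy sits in the DC coefficient.
flat = [[10.0] * 8 for _ in range(8)]
# A block with a sharp vertical edge, loosely like a stroke of a character.
edge = [[0.0] * 4 + [255.0] * 4 for _ in range(8)]

flat_coeffs = dct2_8x8(flat)
print(round(flat_coeffs[0][0], 6), mean_abs_ac(flat_coeffs) < 1e-9)
print(mean_abs_ac(dct2_8x8(edge)) > mean_abs_ac(flat_coeffs))
```

Blocks containing sharp edges spread energy into the AC coefficients, which is the kind of difference in coefficient distribution that the stated hypothesis relies on.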


2021 ◽  
Vol 12 (1) ◽  
pp. 17-28
Author(s):  
Amrit Pal Singh ◽  
Chetna Gupta ◽  
Rashpal Singh ◽  
Nandini Singh

Evolutionary algorithms are inspired by the biological model of evolution and natural selection and are used to solve computationally hard problems, also known as NP-hard problems. The main motive for using these algorithms is their robust and adaptive nature, which provides good search techniques for complex problems. This paper presents a comparative analysis of families of classification algorithms, rather than a comparison of individual algorithms, using the KEEL tool. This work compares the SSMA-C, DROP3PSO-C, FURIA-C, GFS-MaxLogitBoost-C, and CPSO-C algorithms. Further, these selected evolutionary algorithms are compared against two statistical classifiers using the Wilcoxon signed rank test and the Friedman test on the following datasets (bupa, ecoli, glass, haberman, iris, monks, vehicle, and wine) to calculate the classification efficiencies of these algorithms. Experimental results reveal some differences among these algorithms. The visualization module in the model has been used to give overall results as a summary, while the statistical test using Clas-Wilcoxon-ST compared the algorithms in a pair-wise fashion to conclude the experimental findings.
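The Friedman test mentioned above ranks the algorithms within each dataset and checks whether their rank sums differ more than chance would allow. The following is a minimal pure-Python sketch of the test statistic, without the tie correction and without the p-value lookup. It is not the KEEL implementation, and the accuracy table is hypothetical, not taken from the paper.

```python
def average_ranks(values: list[float]) -> list[float]:
    """Rank values from best (highest, rank 1) to worst; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # 1-based mean of the tied positions
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(scores: list[list[float]]) -> float:
    """Friedman chi-square for scores[dataset][algorithm], without tie correction."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Hypothetical accuracies: 4 datasets (rows) by 3 algorithms (columns).
scores = [
    [0.90, 0.85, 0.80],
    [0.88, 0.86, 0.79],
    [0.92, 0.81, 0.83],
    [0.75, 0.70, 0.65],
]
stat = friedman_statistic(scores)
print(stat)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom; if it is significant, pair-wise tests such as the Wilcoxon signed rank test are then used to locate the differences.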


2020 ◽  
Vol 21 (S21) ◽  
Author(s):  
Laurence de Torrenté ◽  
Samuel Zimmerman ◽  
Masako Suzuki ◽  
Maximilian Christopeit ◽  
John M. Greally ◽  
...  

Background: In genomics, we often assume that continuous data, such as gene expression, follow a specific kind of distribution. However, we rarely stop to question the validity of this assumption, or consider how broadly applicable it may be to all genes that are in the transcriptome. Our study investigated the prevalence of a range of gene expression distributions in three different tumor types from the Cancer Genome Atlas (TCGA). Results: Surprisingly, the expression of less than 50% of all genes was normally distributed, with other distributions including Gamma, Bimodal, Cauchy, and Lognormal also represented. Most of the distribution categories contained genes that were significantly enriched for unique biological processes. Different assumptions based on the shape of the expression profile were used to identify genes that could discriminate between patients with good versus poor survival. The prognostic marker genes that were identified when the shape of the distribution was accounted for reflected functional insights into cancer biology that were not observed when standard assumptions were applied. We showed that when multiple types of distributions were permitted, i.e. the shape of the expression profile was used, the statistical classifiers had greater predictive accuracy for determining the prognosis of a patient than those that assumed only one type of gene expression distribution. Conclusions: Our results highlight the value of studying a gene's distribution shape to model the heterogeneity of transcriptomic data, and the impact of using analyses that permit more than one type of gene expression distribution. These insights would have been overlooked when using standard approaches that assume all genes follow the same type of distribution in a patient cohort.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Doina Bucur

Information flow, opinion, and epidemics spread over structured networks. When using node centrality indicators to predict which nodes will be among the top influencers or superspreaders, no single centrality is a consistently good ranker across networks. We show that statistical classifiers using two or more centralities are instead consistently predictive over many diverse, static real-world topologies. Certain pairs of centralities cooperate particularly well in drawing the statistical boundary between the superspreaders and the rest: a local centrality measuring the size of a node’s neighbourhood gains from the addition of a global centrality such as the eigenvector centrality, closeness, or the core number. Intuitively, this is because a local centrality may rank highly nodes which are located in locally dense, but globally peripheral regions of the network. The additional global centrality indicator guides the prediction towards more central regions. The superspreaders usually jointly maximise the values of both centralities. As a result of the interplay between centrality indicators, training classifiers with seven classical indicators leads to a nearly maximum average precision function (0.995) across the networks in this study.
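Average precision scores how well a ranking places the true positives (here, the superspreaders) near the top. The sketch below uses the common definition, the mean of precision at each rank where a positive appears; the abstract does not spell out the exact variant used, so treat this as an assumption. The labels and scores are a toy example, not data from the study.

```python
def average_precision(labels: list[int], scores: list[float]) -> float:
    """Mean of precision@k over every rank k at which a positive item appears."""
    # Rank items from highest to lowest classifier score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / k  # precision among the top k items
    return total / hits if hits else 0.0


# Toy example: 1 marks a superspreader, and scores are hypothetical classifier outputs.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, scores)
print(round(ap, 4))
```

A value of 1.0 means every positive is ranked above every negative, so the reported 0.995 corresponds to a ranking with almost no negatives placed ahead of superspreaders.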


2020 ◽  
Vol 77 (4) ◽  
pp. 674-683 ◽  
Author(s):  
Szymon Smoliński ◽  
Franziska Maria Schade ◽  
Florian Berg

The assignment of individual fish to their stock of origin is important for reliable stock assessment and fisheries management. Otolith shape is commonly used as the marker of distinct stocks in discrimination studies. Our literature review showed that the application and comparison of alternative statistical classifiers to discriminate fish stocks based on otolith shape are limited. Therefore, we compared the performance of two traditional and four machine learning classifiers based on Fourier analysis of otolith shape using selected stocks of Atlantic cod (Gadus morhua) in the southern Baltic Sea and Atlantic herring (Clupea harengus) in the western Norwegian Sea, Skagerrak, and the southern Baltic Sea. Our results showed that the stocks can be successfully discriminated based on their otolith shapes. We observed significant differences in the accuracy obtained by the tested classifiers. For both species, support vector machines (SVM) resulted in the highest classification accuracy. These findings suggest that modern machine learning algorithms, like SVM, can help to improve the accuracy of fish stock discrimination systems based on otolith shape.


2020 ◽  
Vol 34 (10) ◽  
pp. 2077-2114
Author(s):  
Hexiang Bai ◽  
Feng Cao ◽  
M. Peter Atkinson ◽  
Qian Chen ◽  
Jinfeng Wang ◽  
...  
