statistical classifiers Latest Research Papers

The Use of Drone Photo Material to Classify the Purity of Photovoltaic Panels Based on Statistical Classifiers

Sensors ◽

10.3390/s22020483 ◽

2022 ◽

Vol 22 (2) ◽

pp. 483

Author(s):

Tomasz Czarnecki ◽

Kacper Bloch

Keyword(s):

Renewable Energy Sources ◽

Base Material ◽

Weather Conditions ◽

High Rate ◽

Solar Panels ◽

Soil Dust ◽

Photovoltaic Panels ◽

Statistical Classifiers ◽

Study Results ◽

The Subject

This work analyzes methods for detecting soiling on photovoltaic panels. Environmental and weather conditions affect the efficiency of renewable energy sources: soil, dust, and dirt that accumulate on the surface of solar panels reduce the power they generate. The paper presents several variants of an algorithm that uses various statistical classifiers to classify photovoltaic panels in terms of soiling. The base material was high-resolution photos and videos of solar panels and sets dedicated to solar farms. The classifiers were tested and their effectiveness in detecting soiling was analyzed. Based on the study results, a group of optimal classifiers was defined, and the classifier that gives the best results for the given problem was selected. The results demonstrate experimentally that the proposed solution achieves a high rate of correct detections. The proposed method is cheap, straightforward to implement, and usable in most photovoltaic installations.

Zernike Moment Based Classification of Cosmic Ray Candidate Hits from CMOS Sensors

Sensors ◽

10.3390/s21227718 ◽

2021 ◽

Vol 21 (22) ◽

pp. 7718

Author(s):

Olaf Bar ◽

Łukasz Bibrzycki ◽

Michał Niedźwiecki ◽

Marcin Piekarczyk ◽

Krzysztof Rzecki ◽

...

Keyword(s):

Recognition Accuracy ◽

Cosmic Ray ◽

Cmos Technology ◽

Zernike Moment ◽

Geometrical Properties ◽

Statistical Classifiers ◽

Image Function ◽

Feature Based ◽

Neural Network Classifiers

Reliable tools for artefact rejection and signal classification are a must for cosmic ray detection experiments based on CMOS technology. In this paper, we analyse the fitness of several feature-based statistical classifiers for the classification of particle candidate hits into four categories: spots, tracks, worms and artefacts. We use Zernike moments of the image function as feature carriers and propose a preprocessing and denoising scheme to make the feature extraction more efficient. As opposed to convolutional neural network classifiers, the feature-based classifiers allow for establishing a connection between features and geometrical properties of candidate hits. Apart from basic classifiers, we also consider their ensemble extensions and find these extensions generally better performing than the basic versions, with an average recognition accuracy of 88%.

Sentiment Analysis of Sinhala News Comments

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3445035 ◽

2021 ◽

Vol 20 (4) ◽

pp. 1-23

Author(s):

Surangika Ranathunga ◽

Isuru Udara Liyanage

Keyword(s):

Sentiment Analysis ◽

Word Embedding ◽

Semantic Features ◽

End User ◽

Language Resources ◽

Low Resource ◽

Statistical Classifiers ◽

User Comments ◽

Basic Language ◽

Analysis System

Sinhala is a low-resource language, for which basic language and linguistic tools have not been properly defined. This affects the development of NLP-based end-user applications for Sinhala. Thus, when implementing NLP tools such as sentiment analyzers, we have to rely only on language-independent techniques. This article presents the use of such language-independent techniques in implementing a sentiment analysis system for Sinhala news comments. We demonstrate that for low-resource languages such as Sinhala, the use of recently introduced word embedding models as semantic features can compensate for the lack of well-developed language-specific linguistic or language resources, and text classification with acceptable accuracy is indeed possible using both traditional statistical classifiers and Deep Learning models. The developed classification models, a corpus of 8.9 million tokens extracted from Sinhala news articles and user comments, and Sinhala Word2Vec and fastText word embedding models are now available for public use; 9,048 news comments annotated with POSITIVE/NEGATIVE/NEUTRAL polarities have also been released.

Rapid Screening of COVID-19 Disease Directly from Clinical Nasopharyngeal Swabs using the MasSpec Pen Technology

10.1101/2021.05.14.21257006 ◽

2021 ◽

Author(s):

Kyana Y Garza ◽

Alex Ap Rosini Silva ◽

Jonas R Rosa ◽

Michael F Keating ◽

Sydney C Povilaitis ◽

...

Keyword(s):

High Throughput ◽

Cross Validation ◽

Direct Analysis ◽

Lipid Profiles ◽

Rapid Screening ◽

Sampling Device ◽

Promising Alternative ◽

Gold Standard Method ◽

Statistical Classifiers ◽

Asymptomatic Individuals

The outbreak of COVID-19 has created an unprecedented global crisis. While PCR is the gold standard method for detecting active SARS-CoV-2 infection, alternative high-throughput diagnostic tests are of significant value to meet universal testing demands. Here, we describe a new design of the MasSpec Pen technology integrated with electrospray ionization (ESI) for direct analysis of clinical swabs and investigate its use for COVID-19 screening. The redesigned MasSpec Pen system incorporates a disposable sampling device refined for uniform and efficient analysis of swab tips via liquid extraction directly coupled to an ESI source. Using this system, we analyzed nasopharyngeal swabs from 244 individuals including symptomatic COVID-19 positive, symptomatic negative, and asymptomatic negative individuals, enabling rapid detection of rich lipid profiles. Two statistical classifiers were generated based on the lipid information acquired. Classifier 1 was built to distinguish symptomatic PCR-positive from asymptomatic PCR-negative individuals, yielding cross-validation accuracy of 83.5%, sensitivity of 76.6%, and specificity of 86.6%, and validation set accuracy of 89.6%, sensitivity of 100%, and specificity of 85.3%. Classifier 2 was built to distinguish symptomatic PCR-positive patients from negative individuals including symptomatic PCR-negative patients with moderate to severe symptoms and asymptomatic individuals, yielding a cross-validation accuracy of 78.4%, specificity of 77.21%, and sensitivity of 81.8%. Collectively, this study suggests that the lipid profiles detected directly from nasopharyngeal swabs using MasSpec Pen-ESI MS allow fast (under a minute) screening of COVID-19 disease using minimal operating steps and no specialized reagents, thus representing a promising alternative high-throughput method for screening of COVID-19.
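For readers less familiar with the three figures reported for each classifier, the sketch below shows how accuracy, sensitivity, and specificity follow from confusion-matrix counts. The function name `screening_metrics` and the counts are hypothetical and are not taken from the study; the sketch assumes both the positive and the negative class are non-empty.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total      # share of all samples classified correctly
    sensitivity = tp / (tp + fn)      # share of true positives that were detected
    specificity = tn / (tn + fp)      # share of true negatives that were cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts chosen only for illustration (not the study's data).
acc, sens, spec = screening_metrics(tp=8, fp=2, tn=6, fn=4)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting all three together matters for screening: a classifier can reach high accuracy on an imbalanced cohort while still missing many positive cases, which only the sensitivity reveals.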

DEVELOPEMENT AN ALGORITM FOR RECOGNIZING TEXT DATA IN DIGITAL GRAPHIC IMAGES

СИСТЕМЫ УПРАВЛЕНИЯ И ИНФОРМАЦИОННЫЕ ТЕХНОЛОГИИ ◽

10.36622/vstu.2021.84.2.016 ◽

2021 ◽

pp. 75-78

Author(s):

Н.Н. Беляев ◽

О.А. Бебенина ◽

В.Е. Бородкина

Keyword(s):

Frequency Domain ◽

Recognition Algorithm ◽

Statistical Characteristics ◽

Discrete Cosine Transformation ◽

Text Data ◽

Jpeg Images ◽

Statistical Classifiers ◽

Distribution Of Values ◽

Data Content ◽

Graphic Images

The article presents an approach to developing an algorithm for recognizing text data in JPEG digital graphic images. It considers the hypothesis that text content in a JPEG image influences the distribution of the values of the discrete cosine transform coefficients in the image's frequency domain. Statistical classifier models are identified that solve the problem of recognizing text data in JPEG images by analyzing their frequency domain. A recognition algorithm is proposed that implements two procedures: training the selected classifiers and recognizing text data, both taking into account the statistical characteristics of the distribution of frequency-domain coefficients in JPEG images.
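To make the frequency-domain idea concrete, the following minimal sketch computes the 2-D discrete cosine transform of a single 8x8 block and one simple statistic of its coefficients. It is an illustration only: the helper names `dct_2d` and `ac_energy`, the choice of AC energy as a feature, and the two toy blocks are assumptions, not the authors' method. JPEG's level shift and quantization steps are omitted.

```python
import math

N = 8  # JPEG applies the DCT to 8x8 pixel blocks


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of lists."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def ac_energy(coeffs):
    """Sum of squared AC coefficients, i.e. all terms except the DC term [0][0]."""
    return sum(c * c for row in coeffs for c in row) - coeffs[0][0] ** 2


# Toy blocks (assumed for illustration): a flat background and a sharp edge,
# the latter standing in for the stroke of a character.
flat = [[128] * N for _ in range(N)]
edge = [[0] * 4 + [255] * 4 for _ in range(N)]

print("flat block AC energy:", round(ac_energy(dct_2d(flat)), 3))
print("edge block AC energy:", round(ac_energy(dct_2d(edge)), 3))
```

A flat block concentrates all of its energy in the DC coefficient, while a block with a sharp edge spreads energy into the AC coefficients. Statistics of this kind, collected over many blocks, are one plausible input to a statistical classifier.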

A Comparative Analysis of Evolutionary Algorithms for Data Classification Using KEEL Tool

International Journal of Swarm Intelligence Research ◽

10.4018/ijsir.2021010102 ◽

2021 ◽

Vol 12 (1) ◽

pp. 17-28

Author(s):

Amrit Pal Singh ◽

Chetna Gupta ◽

Rashpal Singh ◽

Nandini Singh

Keyword(s):

Comparative Analysis ◽

Evolutionary Algorithms ◽

Rank Test ◽

Statistical Classifiers ◽

Signed Rank ◽

Main Motive ◽

Signed Rank Test ◽

Hard Problems ◽

Experimental Findings ◽

Adaptive Nature

Evolutionary algorithms are inspired by the biological model of evolution and natural selection and are used to solve computationally hard problems, also known as NP-hard problems. The main motive for using these algorithms is their robust and adaptive nature, which provides effective search techniques for complex problems. This paper presents a comparative analysis of families of classification algorithms, rather than of individual algorithms, using the KEEL tool. This work compares the SSMA-C, DROP3PSO-C, FURIA-C, GFS-MaxLogitBoost-C, and CPSO-C algorithms. Further, these selected evolutionary algorithms are compared against two statistical classifiers using the Wilcoxon signed rank test and the Friedman test on the following datasets (bupa, ecoli, glass, haberman, iris, monks, vehicle, and wine) to calculate the classification efficiencies of these algorithms. Experimental results reveal some differences among these algorithms. The visualization module in the model was used to summarize the overall results, while the statistical test Clas-Wilcoxon-ST compared the algorithms in a pair-wise fashion to conclude the experimental findings.
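As a rough illustration of the pair-wise comparison mentioned above, the sketch below computes the Wilcoxon signed-rank statistics for two classifiers evaluated on the same datasets. It is a textbook formulation, not KEEL's implementation: the function name `wilcoxon_signed_rank` and the accuracy values are hypothetical, and no p-value is computed.

```python
def wilcoxon_signed_rank(a, b):
    """Return (W_plus, W_minus) for paired samples a and b.

    Zero differences are dropped, and tied absolute differences share
    their average rank.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of equal absolute differences starting at position i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical accuracies in percent on five datasets (integers avoid
# floating-point noise when detecting ties).
acc_a = [90, 85, 78, 92, 70]
acc_b = [88, 85, 80, 87, 66]
print(wilcoxon_signed_rank(acc_a, acc_b))
```

The smaller of the two sums is the value usually compared against a critical-value table; a small value suggests that one classifier is consistently better across datasets.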

The shape of gene expression distributions matter: how incorporating distribution shape improves the interpretation of cancer transcriptomic data

BMC Bioinformatics ◽

10.1186/s12859-020-03892-w ◽

2020 ◽

Vol 21 (S21) ◽

Author(s):

Laurence de Torrenté ◽

Samuel Zimmerman ◽

Masako Suzuki ◽

Maximilian Christopeit ◽

John M. Greally ◽

...

Keyword(s):

Gene Expression ◽

Expression Profile ◽

Cancer Biology ◽

Predictive Accuracy ◽

The Cancer Genome Atlas ◽

Marker Genes ◽

Transcriptomic Data ◽

Statistical Classifiers ◽

Distribution Shape ◽

The Impact

Background: In genomics, we often assume that continuous data, such as gene expression, follow a specific kind of distribution. However, we rarely stop to question the validity of this assumption, or consider how broadly applicable it may be to all genes that are in the transcriptome. Our study investigated the prevalence of a range of gene expression distributions in three different tumor types from the Cancer Genome Atlas (TCGA). Results: Surprisingly, the expression of less than 50% of all genes was Normally-distributed, with other distributions including Gamma, Bimodal, Cauchy, and Lognormal also represented. Most of the distribution categories contained genes that were significantly enriched for unique biological processes. Different assumptions based on the shape of the expression profile were used to identify genes that could discriminate between patients with good versus poor survival. The prognostic marker genes that were identified when the shape of the distribution was accounted for reflected functional insights into cancer biology that were not observed when standard assumptions were applied. We showed that when multiple types of distributions were permitted, i.e. the shape of the expression profile was used, the statistical classifiers had greater predictive accuracy for determining the prognosis of a patient versus those that assumed only one type of gene expression distribution. Conclusions: Our results highlight the value of studying a gene’s distribution shape to model heterogeneity of transcriptomic data and the impact on using analyses that permit more than one type of gene expression distribution. These insights would have been overlooked when using standard approaches that assume all genes follow the same type of distribution in a patient cohort.

Top influencers can be identified universally by combining classical centralities

Scientific Reports ◽

10.1038/s41598-020-77536-7 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Doina Bucur

Keyword(s):

Information Flow ◽

Real World ◽

Eigenvector Centrality ◽

Average Precision ◽

Statistical Classifiers ◽

The Core ◽

Node Centrality ◽

Statistical Boundary ◽

Central Regions ◽

Peripheral Regions

Information flow, opinion, and epidemics spread over structured networks. When using node centrality indicators to predict which nodes will be among the top influencers or superspreaders, no single centrality is a consistently good ranker across networks. We show that statistical classifiers using two or more centralities are instead consistently predictive over many diverse, static real-world topologies. Certain pairs of centralities cooperate particularly well in drawing the statistical boundary between the superspreaders and the rest: a local centrality measuring the size of a node’s neighbourhood gains from the addition of a global centrality such as the eigenvector centrality, closeness, or the core number. Intuitively, this is because a local centrality may rank highly nodes which are located in locally dense, but globally peripheral regions of the network. The additional global centrality indicator guides the prediction towards more central regions. The superspreaders usually jointly maximise the values of both centralities. As a result of the interplay between centrality indicators, training classifiers with seven classical indicators leads to a nearly maximum average precision function (0.995) across the networks in this study.
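The intuition about local versus global centralities can be seen on a toy graph. The sketch below computes degree (a local indicator) and closeness (a global one) in plain Python. The graph, the node names, and the helper functions are all hypothetical and serve only as an illustration; no spreading process is simulated and no classifier is trained.

```python
from collections import deque


def build_toy_graph():
    """Three chains meeting at node 'c'; one chain ends in a hub 'h' with leaves."""
    edges = [("c", "a1"), ("a1", "a2"), ("a2", "h"),
             ("c", "b1"), ("b1", "b2"), ("b2", "b3"),
             ("c", "c1"), ("c1", "c2"), ("c2", "c3"),
             ("h", "l1"), ("h", "l2"), ("h", "l3")]
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph):
    """Local indicator: number of neighbours of each node."""
    return {node: len(nbrs) for node, nbrs in graph.items()}


def closeness(graph):
    """Global indicator: (n - 1) / sum of shortest-path distances.

    Assumes the graph is connected.
    """
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[source] = (len(graph) - 1) / sum(dist.values())
    return result


g = build_toy_graph()
deg, clo = degree(g), closeness(g)
print("top by degree:   ", max(deg, key=deg.get))
print("top by closeness:", max(clo, key=clo.get))
# Feature pairs of the kind a two-centrality classifier could take as input.
features = {node: (deg[node], clo[node]) for node in g}
```

In this toy graph the peripheral hub has the largest neighbourhood, while the junction node is closest to everything else, so the two indicators disagree about the top node. Feeding both values to a classifier lets it weigh the two signals instead of trusting either ranking alone.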

Assessing the performance of statistical classifiers to discriminate fish stocks using Fourier analysis of otolith shape

Canadian Journal of Fisheries and Aquatic Sciences ◽

10.1139/cjfas-2019-0251 ◽

2020 ◽

Vol 77 (4) ◽

pp. 674-683 ◽

Cited By ~ 2

Author(s):

Szymon Smoliński ◽

Franziska Maria Schade ◽

Florian Berg

Keyword(s):

Machine Learning ◽

Fourier Analysis ◽

Baltic Sea ◽

Clupea Harengus ◽

Fish Stock ◽

Fish Stocks ◽

Otolith Shape ◽

Southern Baltic Sea ◽

Statistical Classifiers ◽

Southern Baltic

The assignment of individual fish to its stock of origin is important for reliable stock assessment and fisheries management. Otolith shape is commonly used as the marker of distinct stocks in discrimination studies. Our literature review showed that the application and comparison of alternative statistical classifiers to discriminate fish stocks based on otolith shape is limited. Therefore, we compared the performance of two traditional and four machine learning classifiers based on Fourier analysis of otolith shape using selected stocks of Atlantic cod (Gadus morhua) in the southern Baltic Sea and Atlantic herring (Clupea harengus) in the western Norwegian Sea, Skagerrak, and the southern Baltic Sea. Our results showed that the stocks can be successfully discriminated based on their otolith shapes. We observed significant differences in the accuracy obtained by the tested classifiers. For both species, support vector machines (SVM) resulted in the highest classification accuracy. These findings suggest that modern machine learning algorithms, like SVM, can help to improve the accuracy of fish stock discrimination systems based on the otolith shape.

Incorporating spatial association into statistical classifiers: local pattern-based prior tuning

International Journal of Geographical Information Science ◽

10.1080/13658816.2020.1737702 ◽

2020 ◽

Vol 34 (10) ◽

pp. 2077-2114

Author(s):

Hexiang Bai ◽

Feng Cao ◽

M. Peter Atkinson ◽

Qian Chen ◽

Jinfeng Wang ◽

...

Keyword(s):

Spatial Association ◽

Local Pattern ◽

Statistical Classifiers
