misclassification errors
Recently Published Documents


TOTAL DOCUMENTS: 79 (FIVE YEARS 19)

H-INDEX: 14 (FIVE YEARS 1)
2022 ◽  
Vol 14 (2) ◽  
pp. 325
Author(s):  
Daniela Palacios-Lopez ◽  
Thomas Esch ◽  
Kytt MacManus ◽  
Mattia Marconcini ◽  
Alessandro Sorichetta ◽  
...  

Large-scale gridded population datasets available at the global or continental scale have become an important source of information in applications related to sustainable development. In recent years, new population models have incorporated more accurate and spatially detailed proxy layers describing the built-up environment (e.g., built-area and building footprint datasets), enhancing the quality, accuracy and spatial resolution of existing products. However, due to the consistent lack of vertical and functional information on the built-up environment, large-scale gridded population datasets that rely on existing built-up land proxies still report large errors of under- and overestimation, especially in areas with predominantly high-rise buildings or industrial/commercial areas, respectively. This research investigates, for the first time, the potential contributions of the new World Settlement Footprint—3D (WSF3D) dataset in the field of large-scale population modelling. First, we combined a Random Forest classifier with spatial metrics derived from the WSF3D to predict the industrial versus non-industrial use of settlement pixels at the Pan-European scale. We then examined the effects of including volume and settlement use information in dasymetric population modelling frameworks. We found that the proposed classification method can predict industrial and non-industrial areas with an overall accuracy of ~84% and a kappa coefficient of 0.68. Additionally, we found that integrating both volume and settlement use information increased the accuracy of population estimates by 10% to 30% over commonly employed models (e.g., based on a binary settlement mask as input), mainly by eliminating systematic large overestimations in industrial/commercial areas. 
While the proposed method shows strong promise for overcoming some of the main limitations in large-scale population modelling, future research should focus on improving the quality of the WSF3D dataset and the classification method alike, to avoid the false detection of built-up settlements and to reduce misclassification errors of industrial and high-rise buildings.
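
As a rough illustration of how an overall accuracy and a kappa coefficient relate to a confusion matrix, the sketch below computes both for a two-class (industrial versus non-industrial) case. The function name and the counts are hypothetical: the counts were chosen only so that the result lands on the reported ~84% and 0.68, and they are not the paper's data.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts (rows: true class, columns: predicted class).
confusion = [[420, 80],
             [80, 420]]
acc, kappa = accuracy_and_kappa(confusion)
print(f"overall accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, which is why it sits well below the overall accuracy when the classes are balanced.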


PLoS ONE ◽  
2021 ◽  
Vol 16 (10) ◽  
pp. e0258693
Author(s):  
Yuval Bussi ◽  
Ruti Kapon ◽  
Ziv Reich

Information-theoretic approaches are ubiquitous and effective in a wide variety of bioinformatics applications. In comparative genomics, alignment-free methods based on short DNA words, or k-mers, are particularly powerful. We evaluated the utility of varying k-mer lengths for genome comparisons by analyzing their sequence space coverage of 5805 genomes in the KEGG GENOME database. In subsequent analyses on four k-mer lengths spanning the relevant range (11, 21, 31, 41), hierarchical clustering of 1634 genus-level representative genomes using pairwise 21- and 31-mer Jaccard similarities best recapitulated a phylogenetic/taxonomic tree of life with clear boundaries for superkingdom domains and high subtree similarity for named taxa at lower levels (family through phylum). By analyzing ~14.2M prokaryotic genome comparisons by their lowest-common-ancestor taxon levels, we detected many potential misclassification errors in a curated database, further demonstrating the need for wide-scale adoption of quantitative taxonomic classifications based on whole-genome similarity.
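
A minimal sketch of a pairwise k-mer Jaccard similarity, the quantity the clustering above is built on. It assumes exact k-mer sets on a single strand; the function names are illustrative, and production tools typically also handle reverse complements and use sketching to scale to whole genomes, which this example does not attempt.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_similarity(seq_a, seq_b, k):
    """Jaccard similarity of two sequences' k-mer sets: |A & B| / |A | B|."""
    set_a, set_b = kmers(seq_a, k), kmers(seq_b, k)
    union = set_a | set_b
    if not union:
        # Both sequences are shorter than k, so there is nothing to compare.
        return 0.0
    return len(set_a & set_b) / len(union)


# Toy sequences with k = 3; the abstract works with k = 21 and k = 31.
similarity = jaccard_similarity("ACGTACGT", "ACGTTCGT", 3)
print(similarity)
```

Here the two toy sequences share 2 of 7 distinct 3-mers, so the similarity is 2/7.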


Author(s):  
George Petrides ◽  
Wouter Verbeke

Over the years, a plethora of cost-sensitive methods have been proposed for learning on data when different types of misclassification errors incur different costs. Our contribution is a unifying framework that provides a comprehensive and insightful overview of cost-sensitive ensemble methods, pinpointing their differences and similarities via a fine-grained categorization. Our framework contains natural extensions and generalisations of ideas across methods, be it AdaBoost, Bagging or Random Forest, and as a result not only yields all methods known to date but also some not previously considered.
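
To make "different types of misclassification errors incur different costs" concrete, here is a small sketch of the standard minimum-expected-cost decision rule for a binary problem. It is a generic illustration, not the framework proposed in the paper; it assumes correct predictions cost nothing, and the function names and cost values are hypothetical.

```python
def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability threshold that minimises expected cost.

    Predicting positive costs (1 - p) * cost_fp in expectation and predicting
    negative costs p * cost_fn, so positive is cheaper when
    p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict_positive(prob_positive, cost_fp, cost_fn):
    """Return True when predicting positive has the lower expected cost."""
    return prob_positive >= cost_sensitive_threshold(cost_fp, cost_fn)


# With a false negative four times as costly as a false positive,
# the threshold drops from 0.5 to 0.2.
print(cost_sensitive_threshold(cost_fp=1.0, cost_fn=4.0))
print(predict_positive(0.3, cost_fp=1.0, cost_fn=4.0))
print(predict_positive(0.3, cost_fp=1.0, cost_fn=1.0))
```

The same estimated probability of 0.3 leads to opposite decisions under the two cost settings, which is the behaviour cost-sensitive learners are designed to capture.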


Author(s):  
Edward K. Ngailo ◽  
Dietrich Von Rosen ◽  
Martin Singull

We propose asymptotic approximations for the probabilities of misclassification in linear discriminant analysis when the group means follow a growth curve structure. The discriminant function can classify a new observation vector of p repeated measurements into one of several multivariate normal populations with a common covariance matrix. We derive certain relations between the statistics under consideration in order to obtain asymptotic approximations of the misclassification errors for the two-group case. Finally, we perform Monte Carlo simulations to evaluate the reliability of the proposed results.
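
For orientation, the sketch below evaluates the classical baseline that such approximations refine: with two normal populations, a common covariance matrix, equal priors, and all parameters known, the linear discriminant rule misclassifies with probability Φ(−Δ/2), where Δ is the Mahalanobis distance between the group means. This is the textbook known-parameter case, not the paper's growth-curve approximation, and the function names are illustrative.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lda_misclassification_probability(mahalanobis_distance):
    """Error probability Phi(-Delta / 2) for two groups with known parameters,
    a common covariance matrix, and equal prior probabilities."""
    return normal_cdf(-mahalanobis_distance / 2.0)


# The error shrinks as the groups move further apart.
for delta in (0.0, 1.0, 2.0, 4.0):
    print(delta, round(lda_misclassification_probability(delta), 4))
```

When the parameters must be estimated from data, the actual error exceeds this value, which is the gap that asymptotic approximations and Monte Carlo checks address.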


Author(s):  
Dalila Binti Abu Bakar et al.

We investigate whether Malaysian listed companies engaged in financial information fraud during financial distress in the years following the US subprime mortgage crisis. We also investigate the impact of financial information fraud on bankruptcy prediction and misclassification errors. This study uses consumer product companies listed on the main board, and the timeframe is from 2011 to 2015. The Altman Z score indicates that 37 out of 133 Malaysian consumer product companies are financially distressed. Meanwhile, the M score shows that 28 out of 224 observations involve financial information fraud. However, these figures are relatively low because the samples are taken from the main board, where fraud in financial statements might be committed at a lower magnitude in order to avoid sanctions by the Security Exchange Commission. Logistic regression was used to measure predictive accuracy. The overall accuracy percentage improved slightly, by 0.9, after eliminating fraudulent companies. Comparing the confusion matrices before and after the removal of companies with fraudulent financial information, the misclassification errors, especially type one errors, improved. This finding satisfied objective three: one of the reasons for the deterioration in financial distress prediction is the upward bias introduced by financial information fraud. Governments, monitoring bodies, and all those involved in an insolvency process would benefit from this study.
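
The type one and type two errors mentioned above can be read off a confusion matrix as in the sketch below. The convention is an assumption, since the abstract does not define it: here a type one error is a distressed company classified as healthy, and a type two error is a healthy company classified as distressed. The function name and all counts are hypothetical and are not taken from the study.

```python
def distress_error_rates(true_distressed, missed_distressed,
                         false_alarms, true_healthy):
    """Return (type one rate, type two rate, overall error rate).

    Assumed convention: type one = distressed company predicted healthy,
    type two = healthy company predicted distressed.
    """
    distressed_total = true_distressed + missed_distressed
    healthy_total = false_alarms + true_healthy
    type_one = missed_distressed / distressed_total
    type_two = false_alarms / healthy_total
    overall = (missed_distressed + false_alarms) / (distressed_total + healthy_total)
    return type_one, type_two, overall


# Hypothetical counts for illustration only.
type_one, type_two, overall = distress_error_rates(
    true_distressed=30, missed_distressed=10, false_alarms=5, true_healthy=55)
print(f"type one = {type_one:.3f}, type two = {type_two:.3f}, overall = {overall:.3f}")
```

Running the same calculation on the confusion matrices before and after removing flagged companies would show which error type drives a change in overall accuracy.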


2021 ◽  
Vol 24 (4) ◽  
Author(s):  
Simon Gregson ◽  
Louisa Moorhouse ◽  
Tawanda Dadirai ◽  
Haynes Sheppard ◽  
Justin Mayini ◽  
...  

Author(s):  
Edgar Santos‐Fernandez ◽  
Erin E. Peterson ◽  
Julie Vercelloni ◽  
Em Rushworth ◽  
Kerrie Mengersen
