Interface Style and Error Rates - Some Experimental Results

Author(s):  
Peter F. Elzer ◽  
Badi Boussoffara ◽  
Carsten Beuthel

At the IPP a number of new forms of visualizations of process values have been developed. Several of them have been evaluated by user experiments. In the context of a research project (supported by the Volkswagen Foundation in Germany (Ref. Nr.: I/69 886)) a comparative study with respect to the influence of the interface style on the error rate during classification of process states was undertaken. The paper describes the results and discusses them in a taxonomy context.

2010 ◽  
Vol 51 ◽  
Author(s):  
Lijana Stabingienė ◽  
Kęstutis Dučinskas

In spatial classification it is usually assumed that feature observations, given the labels, are independently distributed. We drop this assumption by proposing a stationary Gaussian random field model for the feature observations. The labels are assumed to follow a Discrete Random Field (DRF) model. A formula for the exact error rate based on the Bayes discriminant function (BDF) is derived. In the case of partial parametric uncertainty (mean parameters and variance are unknown), an approximation of the expected error rate associated with the plug-in BDF is also derived. The dependence of the considered error rates on the values of the range and clustering parameters is investigated numerically for training locations that are second-order neighbors of the location of the observation to be classified.


2016 ◽  
Vol 55 (04) ◽  
pp. 373-380 ◽  
Author(s):  
Matthias Ganzinger ◽  
Karsten Senghas ◽  
Stefan Riezler ◽  
Petra Knaup ◽  
Martin Löpprich ◽  
...  

Summary. Objectives: In the Multiple Myeloma clinical registry at Heidelberg University Hospital, most data are extracted from discharge letters. Our aim was to analyze whether the manual documentation process can be made more efficient by using natural language processing (NLP) methods for multiclass classification of free-text diagnostic reports, so that the diagnosis and state of disease of myeloma patients are documented automatically. The first objective was to create a corpus of free-text diagnosis paragraphs of patients with multiple myeloma from German diagnostic reports, with relevant data elements manually annotated by documentation specialists. The second objective was to construct and evaluate a framework using different NLP methods to enable automatic multiclass classification of relevant data elements from free-text diagnostic reports. Methods: The main diagnoses paragraph was extracted from the clinical report of a randomly selected third of the patients in the multiple myeloma research database of Heidelberg University Hospital (737 selected patients in total). An EDC system was set up, and two data entry specialists independently performed a manual documentation of at least nine specific data elements for multiple myeloma characterization. Both data entries were compared and assessed by a third specialist, and an annotated text corpus was created. A framework was constructed, consisting of a self-developed package to split multiple diagnosis sequences into several subsequences, four different preprocessing steps to normalize the input data, and two classifiers: a maximum entropy classifier (MEC) and a support vector machine (SVM). In total, 15 different pipelines were examined and assessed by ten-fold cross-validation, repeated 100 times. As quality indicators, the average error rate and the average F1-score were computed. For significance testing, the approximate randomization test was used. Results: The annotated corpus consists of 737 different diagnoses paragraphs with a total of 865 coded diagnoses. The dataset is publicly available in the supplementary online files for training and testing of further NLP methods. Both classifiers showed low average error rates (MEC: 1.05; SVM: 0.84) and high F1-scores (MEC: 0.89; SVM: 0.92). However, the results varied widely depending on the classified data element. Preprocessing methods increased this effect and had a significant impact on the classification, both positive and negative. The automatic diagnosis splitter increased the average error rate significantly, even if the F1-score decreased only slightly. Conclusions: The low average error rates and high average F1-scores of each pipeline demonstrate the suitability of the investigated NLP methods. However, it was also shown that there is no best practice for automatic classification of data elements from free-text diagnostic reports.
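
The two quality indicators named in this abstract, error rate and F1-score, can be illustrated with a minimal Python sketch. The function name, the labels "A" and "B", and the toy predictions are assumptions made for illustration only; they are not taken from the paper, and the averaging over folds and repetitions is not shown.

```python
def error_rate_and_f1(gold, predicted, positive):
    """Return (error rate, F1-score for the label `positive`)."""
    errors = sum(g != p for g, p in zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in zip(gold, predicted))
    fp = sum(g != positive and p == positive for g, p in zip(gold, predicted))
    fn = sum(g == positive and p != positive for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return errors / len(gold), f1

# Hypothetical gold annotations and classifier output for six paragraphs
gold = ["A", "A", "B", "B", "A", "B"]
predicted = ["A", "B", "B", "B", "A", "A"]
err, f1 = error_rate_and_f1(gold, predicted, positive="A")
```

In a cross-validation setting such as the one described, these two values would be computed per fold and then averaged.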


2008 ◽  
Vol 20 (06) ◽  
pp. 345-352
Author(s):  
Li-Yeh Chuang ◽  
Cheng-San Yang ◽  
Jung-Chike Li ◽  
Cheng-Hong Yang

Microarray data can provide valuable results for a variety of gene expression profile problems and contribute to advances in clinical medicine. The application of microarray data to cancer-type classification has recently gained popularity. Microarray data are characterized by a large number of features (genes), high dimensionality, and multi-class categories. These properties make the training and testing of general classification methods difficult. Reducing the number of genes and achieving lower classification error rates are the main issues to be solved. The classification of microarray data samples can be regarded as a feature selection and classifier design problem. The goal of feature selection is to select those subsets of differentially expressed genes that are potentially relevant for distinguishing the sample classes. Classical genetic algorithms (GAs) may suffer from premature convergence and thus lead to poor experimental results. In this paper, a combat genetic algorithm (CGA) is used to implement the feature selection, and a K-nearest neighbor classifier with leave-one-out cross-validation serves as the classifier in the CGA fitness function for the classification problem. The proposed method was applied to 10 microarray data sets that were obtained from the literature. The experimental results show that the proposed method not only effectively reduced the number of gene expression levels but also achieved lower classification error rates.
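
The fitness evaluation described here, a K-nearest neighbor classifier scored by leave-one-out cross-validation on a candidate gene subset, might look roughly like the following NumPy sketch. It covers only the fitness step and not the combat genetic algorithm itself; the function name, the choice k=1, and the toy data are assumptions for illustration.

```python
import numpy as np

def loocv_knn_error(X, y, mask, k=1):
    """Leave-one-out error rate of k-NN using only the features where mask is True."""
    Xs = X[:, mask]
    n = len(y)
    # Pairwise squared Euclidean distances between all samples
    diff = Xs[:, None, :] - Xs[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # the held-out sample cannot vote for itself
    errors = 0
    for i in range(n):
        neighbours = np.argsort(dist[i], kind="stable")[:k]
        labels, counts = np.unique(y[neighbours], return_counts=True)
        predicted = labels[np.argmax(counts)]  # majority vote among the k neighbours
        errors += int(predicted != y[i])
    return errors / n

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise
X = np.array([[0.0, 5.0], [0.1, -5.0], [0.2, 4.0],
              [1.0, -4.0], [1.1, 5.5], [1.2, -5.5]])
y = np.array([0, 0, 0, 1, 1, 1])

err_informative = loocv_knn_error(X, y, np.array([True, False]))
err_noise = loocv_knn_error(X, y, np.array([False, True]))
```

A feature-selection search would prefer the mask with the lower error, here the one that keeps only the informative feature.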


2021 ◽  
Vol 3 (1) ◽  
pp. 16-22
Author(s):  
Abderrahmane Khechekhouche ◽  
Abderrrahim Allal ◽  
Zied Driss

This work is a comparative study of recently published advanced techniques for diagnosing induction motors. It classifies these diagnostic techniques according to their sensitivity, based on experimental results for short-circuit faults between stator turns. Using the logarithmic FFT spectrum, the best method for detecting faults in their early stages can be identified, so that faults can be predicted and breakdowns that may be dangerous for people or costly for the economy can be anticipated.


2019 ◽  
Vol 28 (4) ◽  
pp. 1411-1431 ◽  
Author(s):  
Lauren Bislick ◽  
William D. Hula

Purpose This retrospective analysis examined group differences in error rate across 4 contextual variables (clusters vs. singletons, syllable position, number of syllables, and articulatory phonetic features) in adults with apraxia of speech (AOS) and adults with aphasia only. Group differences in the distribution of error type across contextual variables were also examined. Method Ten individuals with acquired AOS and aphasia and 11 individuals with aphasia participated in this study. In the context of a 2-group experimental design, the influence of 4 contextual variables on error rate and error type distribution was examined via repetition of 29 multisyllabic words. Error rates were analyzed using Bayesian methods, whereas distribution of error type was examined via descriptive statistics. Results There were 4 findings of robust differences between the 2 groups. These differences were found for syllable position, number of syllables, manner of articulation, and voicing. Group differences were less robust for clusters versus singletons and place of articulation. Results of error type distribution show a high proportion of distortion and substitution errors in speakers with AOS and a high proportion of substitution and omission errors in speakers with aphasia. Conclusion Findings add to the continued effort to improve the understanding and assessment of AOS and aphasia. Several contextual variables more consistently influenced breakdown in participants with AOS compared to participants with aphasia and should be considered during the diagnostic process. Supplemental Material https://doi.org/10.23641/asha.9701690


2020 ◽  
Vol 75 (5) ◽  
pp. 507-511
Author(s):  
Y. El-Ouardi ◽  
A. Dadouch ◽  
A. Aknouch ◽  
M. Mouhib ◽  
A. Maghnouj ◽  
...  

2014 ◽  
Vol 53 (05) ◽  
pp. 343-343

We have to report marginal changes in the empirical type I error rates for the cut-offs 2/3 and 4/7 of Table 4, Table 5 and Table 6 of the paper “Influence of Selection Bias on the Test Decision – A Simulation Study” by M. Tamm, E. Cramer, L. N. Kennes, N. Heussen (Methods Inf Med 2012; 51: 138–143). In a small number of cases the way numeric values are represented in SAS resulted in wrong categorization due to a numeric representation error in differences. We corrected the simulation by using the round function of SAS in the calculation process with the same seeds as before. For Table 4 the value for the cut-off 2/3 changes from 0.180323 to 0.153494. For Table 5 the value for the cut-off 4/7 changes from 0.144729 to 0.139626 and the value for the cut-off 2/3 changes from 0.114885 to 0.101773. For Table 6 the value for the cut-off 4/7 changes from 0.125528 to 0.122144 and the value for the cut-off 2/3 changes from 0.099488 to 0.090828. The sentence on p. 141 “E.g. for block size 4 and q = 2/3 the type I error rate is 18% (Table 4).” has to be replaced by “E.g. for block size 4 and q = 2/3 the type I error rate is 15.3% (Table 4).” All changes are smaller than 0.03. These changes do not affect the interpretation of the results or our recommendations.
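
The kind of problem described in this erratum can be reproduced in any language that uses binary floating-point numbers. The following Python sketch is only an illustration and is not the authors' SAS code; the function names and the choice of 10 decimal places are assumptions.

```python
def exceeds_cutoff_naive(value, cutoff):
    """Compare the raw floating-point numbers directly."""
    return value > cutoff

def exceeds_cutoff_rounded(value, cutoff, decimals=10):
    """Round both sides before comparing, so representation noise is ignored."""
    return round(value, decimals) > round(cutoff, decimals)

# Mathematically 1 - 1/3 equals 2/3, but the computed difference is not
# stored as exactly the same binary number as the cut-off.
value = 1 - 1 / 3
cutoff = 2 / 3

naive = exceeds_cutoff_naive(value, cutoff)      # misclassifies the borderline case
rounded = exceeds_cutoff_rounded(value, cutoff)  # treats the two values as equal
```

With the naive comparison, a value that lies exactly on the cut-off is assigned to the wrong category; rounding before the comparison avoids this.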


2021 ◽  
Vol 503 (2) ◽  
pp. 1828-1846
Author(s):  
Burger Becker ◽  
Mattia Vaccari ◽  
Matthew Prescott ◽  
Trienko Grobler

ABSTRACT The morphological classification of radio sources is important to gain a full understanding of galaxy evolution processes and their relation with local environmental properties. Furthermore, the complex nature of the problem, its appeal for citizen scientists, and the large data rates generated by existing and upcoming radio telescopes combine to make the morphological classification of radio sources an ideal test case for the application of machine learning techniques. One approach that has shown great promise recently is convolutional neural networks (CNNs). Literature, however, lacks two major things when it comes to CNNs and radio galaxy morphological classification. First, a proper analysis of whether overfitting occurs when training CNNs to perform radio galaxy morphological classification using a small curated training set is needed. Secondly, a good comparative study regarding the practical applicability of the CNN architectures in literature is required. Both of these shortcomings are addressed in this paper. Multiple performance metrics are used for the latter comparative study, such as inference time, model complexity, computational complexity, and mean per class accuracy. As part of this study, we also investigate the effect that receptive field, stride length, and coverage have on recognition performance. For the sake of completeness, we also investigate the recognition performance gains that we can obtain by employing classification ensembles. A ranking system based upon recognition and computational performance is proposed. MCRGNet, Radio Galaxy Zoo, and ConvXpress (novel classifier) are the architectures that best balance computational requirements with recognition performance.
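
One of the metrics named in this abstract, mean per-class accuracy, can be illustrated with a short Python sketch. The function name and the toy labels are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict

def mean_per_class_accuracy(gold, predicted):
    """Average the accuracy obtained within each true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1
    per_class = [correct[c] / total[c] for c in total]
    return sum(per_class) / len(per_class)

# Imbalanced toy example: eight samples of class 0, two of class 1
gold = [0] * 8 + [1] * 2
predicted = [0] * 8 + [0, 1]

overall = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
balanced = mean_per_class_accuracy(gold, predicted)
```

On imbalanced data the overall accuracy is dominated by the large class, while the mean per-class accuracy gives each class equal weight.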

