Confusion Matrix Analysis of Syllable-Like Unit Extracted from Hindi Continuous Speech

2017 ◽  
Vol 10 (19) ◽  
pp. 1-6
Author(s):  
Archana Balyan ◽  
Author(s):  
Ioannis Markoulidakis ◽  
George Kopsiaftis ◽  
Ioannis Rallis ◽  
Ioannis Georgoulas ◽  
Anastasios Doulamis ◽  
...  

Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6666
Author(s):  
Kamil Książek ◽  
Michał Romaszewski ◽  
Przemysław Głomb ◽  
Bartosz Grabowski ◽  
Michał Cholewa

In recent years, growing interest in deep learning neural networks has raised the question of how they can be used to process the high-dimensional datasets produced by hyperspectral imaging (HSI) effectively. HSI, traditionally viewed as being within the scope of remote sensing, is used in non-invasive substance classification. One area of potential application is forensic science, where substance classification at the scene is important. An example problem from that area, blood stain classification, is a case study for the evaluation of methods that process hyperspectral data. To investigate deep learning classification performance for this problem, we performed experiments on a dataset that has not previously been tested using this kind of model. The dataset consists of several images with blood and blood-like substances such as ketchup, tomato concentrate, and artificial blood. To test both the classic approach to hyperspectral classification and a more realistic, application-oriented scenario, we prepared two sets of experiments. In the first, Hyperspectral Transductive Classification (HTC), the training and test sets come from the same image. In the second, Hyperspectral Inductive Classification (HIC), the test set is derived from a different image, which is more challenging for classifiers but more useful from the point of view of forensic investigators. We conducted the study using several architectures: 1D, 2D and 3D convolutional neural networks (CNN), a recurrent neural network (RNN) and a multilayer perceptron (MLP). The performance of the models was compared with baseline results of a Support Vector Machine (SVM). We also present a model evaluation method based on t-SNE and confusion matrix analysis that allows us to detect and eliminate some cases of model undertraining.
Our results show that in the transductive case, all models, including the MLP and the SVM, have comparable performance, with no clear advantage for deep learning models. The Overall Accuracy range across all models is 98–100% for the easier image set and 74–94% for the more difficult one. However, in the more challenging inductive case, selected deep learning architectures offer a significant advantage; their best Overall Accuracy is in the range of 57–71%, improving on the baseline set by the non-deep models by up to 9 percentage points. We present a detailed analysis of the results and a discussion, including a summary of conclusions for each tested architecture. An analysis of per-class errors shows that the score for each class is highly model-dependent. Considering this and the fact that the best performing models come from two different architecture families (3D CNN and RNN), our results suggest that tailoring the deep neural network architecture to hyperspectral data is still an open problem.
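The Overall Accuracy and per-class error figures mentioned in this abstract can both be read off a confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the function names, the class labels, and the counts are hypothetical, and the matrix is assumed to have true classes as rows and predicted classes as columns.

```python
def overall_accuracy(cm):
    """Fraction of all samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_errors(cm):
    """For each true class, return (error rate, index of the most frequent wrong label).

    The second element is None when the class has no misclassified samples.
    """
    report = []
    for i, row in enumerate(cm):
        total = sum(row)
        wrong = total - row[i]
        # Off-diagonal entries with at least one sample: (count, predicted class).
        off_diagonal = [(count, j) for j, count in enumerate(row) if j != i and count > 0]
        most_confused = max(off_diagonal)[1] if off_diagonal else None
        report.append((wrong / total if total else 0.0, most_confused))
    return report


# Hypothetical counts, for illustration only (rows: true class, columns: predicted class).
labels = ["blood", "ketchup", "artificial blood"]
cm = [
    [50, 2, 0],
    [3, 40, 7],
    [0, 5, 45],
]

accuracy = overall_accuracy(cm)
errors = per_class_errors(cm)
for name, (rate, confused) in zip(labels, errors):
    target = labels[confused] if confused is not None else "none"
    print(f"{name}: error rate {rate:.3f}, most often confused with {target}")
```

Comparing such per-class reports across models is one way to see whether the score for each class depends on the architecture, as the abstract describes.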


2006 ◽  
Vol 49 (6) ◽  
pp. 798-802
Author(s):  
Shoichiro Fukuda ◽  
Naomi Toida ◽  
Kunihiro Fukushima ◽  
Yuko Kataoka ◽  
Kazunori Nishizaki

Author(s):  
Raymond D. Engstrand ◽  
George Moeller

The Constant-Ratio Rule (CRR), an empirical technique for the analysis of confusion matrices, was developed for use in predicting the intelligibility of speech syllables. This study investigated the validity of the rule when applied to data from experiments on visual form perception. English letters and simple geometric figures were tachistoscopically presented in the center of a viewing field. Response proportions for subsets of this master set of stimuli were predicted by CRR. Results indicated that the rule (1) accurately predicted numeric response proportions for subsets of stimuli when experimental conditions were similar and (2) predicted ordinally accurate data when experimental conditions varied within the limits that might be encountered in “operational situations.” These results, as well as arithmetic factors that can result in errors in prediction, are discussed.
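The abstract does not spell out the rule itself. As it is usually stated, the CRR assumes that the ratio between any two response proportions for a given stimulus stays the same when the stimulus set is reduced to a subset, so a subset matrix is predicted by keeping the relevant rows and columns of the master matrix and renormalizing each row. The sketch below follows that formulation; the function name `crr_predict`, the labels, and the proportions are hypothetical and are not taken from the study.

```python
def crr_predict(master, labels, subset):
    """Predict the confusion matrix for `subset` from the master matrix.

    master[i][j] is the proportion of response j given stimulus i.
    Each retained row is renormalized so that it sums to 1, which keeps the
    ratios between the retained entries unchanged.
    """
    idx = [labels.index(s) for s in subset]
    predicted = []
    for i in idx:
        row = [master[i][j] for j in idx]
        total = sum(row)
        predicted.append([v / total for v in row] if total else [0.0] * len(row))
    return predicted


# Hypothetical master matrix of response proportions (rows sum to 1).
labels = ["A", "B", "C"]
master = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.8],
]

# Predicted proportions when only stimuli A and B are presented.
predicted_ab = crr_predict(master, labels, ["A", "B"])
for name, row in zip(["A", "B"], predicted_ab):
    print(name, [round(v, 3) for v in row])
```

In this example the ratio of "A" to "B" responses for stimulus A is 2:1 in both the master matrix and the predicted subset matrix, which is the property the rule is named for.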


2021 ◽  
Vol 11 (23) ◽  
pp. 11136
Author(s):  
Zenebe Markos Lonseko ◽  
Prince Ebenezer Adjei ◽  
Wenju Du ◽  
Chengsi Luo ◽  
Dingcan Hu ◽  
...  

Gastrointestinal (GI) diseases constitute a leading problem in the human digestive system. Consequently, several studies have explored automatic classification of GI diseases as a means of minimizing the burden on clinicians and improving patient outcomes, for both diagnostic and treatment purposes. The challenge in using deep learning-based (DL) approaches, specifically a convolutional neural network (CNN), is that spatial information is not fully utilized due to the inherent mechanism of CNNs. This paper proposes the application of spatial factors in improving classification performance. Specifically, we propose a deep CNN-based spatial attention mechanism for the classification of GI diseases, implemented with encoder–decoder layers. To overcome the data imbalance problem, we adapt data-augmentation techniques. A total of 12,147 multi-sited, multi-diseased GI images, drawn from publicly available and private sources, were used to validate the proposed approach. Furthermore, a five-fold cross-validation approach was adopted to minimize inconsistencies in intra- and inter-class variability and to ensure that results were robustly assessed. Our results, compared with other state-of-the-art models in terms of mean accuracy (ResNet50 = 90.28, GoogLeNet = 91.38, DenseNets = 91.60, and baseline = 92.84), demonstrated better outcomes (Precision = 92.8, Recall = 92.7, F1-score = 92.8, and Accuracy = 93.19). We also implemented t-distributed stochastic neighbor embedding (t-SNE) and confusion matrix analysis techniques for better visualization and performance validation. Overall, the results showed that the attention mechanism improved the automatic classification of multi-sited GI disease images. We validated clinical tests based on the proposed method by overcoming previous limitations, with the goal of improving automatic classification accuracy in future work.
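Precision, recall, and F1-score of the kind reported above can be derived from a multi-class confusion matrix. The sketch below shows one common way to do this, with macro averaging across classes. Whether the paper used macro, micro, or weighted averaging is not stated in the abstract, so that choice is an assumption here, as are the function names and the example counts.

```python
def precision_recall_f1(cm):
    """Per-class (precision, recall, F1) from a confusion matrix.

    Rows are assumed to be true classes and columns predicted classes.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        results.append((precision, recall, f1))
    return results


def macro_average(results):
    """Unweighted mean of each metric across classes."""
    n = len(results)
    return tuple(sum(r[m] for r in results) / n for m in range(3))


# Hypothetical two-class counts, for illustration only.
cm = [
    [8, 2],
    [1, 9],
]

per_class = precision_recall_f1(cm)
macro_p, macro_r, macro_f1 = macro_average(per_class)
print(f"macro precision={macro_p:.3f} recall={macro_r:.3f} f1={macro_f1:.3f}")
```

Under five-fold cross-validation, metrics like these would typically be computed on each held-out fold and then averaged.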


2021 ◽  
Author(s):  
Umadevi T P ◽  
Murugan A

The handwritten Multilanguage phase is the preprocessing phase that improves the image quality for better identification in the system. The main goals of preprocessing are diodes, noise suppression and line cancellation. After word processing, various attribute extraction techniques are used to process attribute properties for the identification process. Smoothing plays an important role in character recognition. The partitioning process in the word distribution strategy can be divided into global and local texts. The writer does not use this header line to write the text, which creates a problem for skew correction, classification and recognition. The datasets used are HWSC and TST1. The tensor flow method is used to estimate the consistency of the confusion matrix for the enhancement of text recognition. The accuracy of the proposed method is 98%.

