DATA DIMENSIONALITY REDUCTION FOR NEURAL BASED CLASSIFICATION OF OPTICAL SURFACES DEFECTS

2014
pp. 32-42
Author(s):
Matthieu Voiry
Kurosh Madani
Véronique Amarger
Joël Bernier

A major step in the diagnosis of faults on high-quality optical surfaces is the characterization of scratch and dig defects in products. This challenging operation is very important since it is directly linked to the quality of the produced optical component. A classification phase is mandatory to complete the diagnosis of optical devices, since a number of correctable defects are usually present alongside the potential "abiding" (permanent) ones. Unfortunately, the relevant data extracted from the raw image during the defect detection phase are high-dimensional, which can harm the behavior of the artificial neural networks that are well suited to such a challenging classification. Reducing the data to a smaller dimension can alleviate the problems related to high dimensionality. In this paper we compare different dimensionality reduction techniques and evaluate their impact on classification performance.
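As a concrete illustration of the comparison protocol this abstract describes, the following sketch pits a linear reducer (PCA) against a nonlinear one (Isomap) ahead of a small MLP classifier. The scikit-learn digits dataset stands in for the defect features, and the particular reducers chosen are assumptions, since the paper does not name them here.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for high-dimensional defect features extracted from raw images.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compare a linear and a nonlinear reducer feeding the same MLP classifier.
for name, reducer in [("PCA", PCA(n_components=10)),
                      ("Isomap", Isomap(n_components=10))]:
    clf = make_pipeline(StandardScaler(), reducer,
                        MLPClassifier(hidden_layer_sizes=(32,),
                                      max_iter=1000, random_state=0))
    clf.fit(X_train, y_train)
    print(f"{name}: test accuracy = {clf.score(X_test, y_test):.3f}")
```

Swapping in other reduction techniques only requires adding entries to the list, which is what makes this protocol convenient for the kind of comparison the paper reports.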

Author(s):  
Matthieu Voiry
Véronique Amarger
Joel Bernier
Kurosh Madani

A major step in the diagnosis of faults on high-quality optical devices is the detection and characterization of scratch and dig defects in products. These aesthetic flaws, formed during different manufacturing steps, can harm the functional characteristics of optical devices as well as their optical performance by generating undesirable scattered light, which can seriously degrade the expected optical features. Reliable diagnosis of these defects is therefore a crucial task to ensure that products meet their nominal specifications. Moreover, such diagnosis is strongly motivated by the need to correct the manufacturing process in order to guarantee mass-production quality and maintain an acceptable production yield. Unfortunately, detecting and measuring such defects is still a challenging problem under production conditions, and the few available automatic control solutions remain ineffective. That is why, in most cases, the diagnosis is performed by a human expert visually inspecting the whole production. However, this conventional solution suffers from several acute restrictions related to the human operator's intrinsic limitations: reduced sensitivity to very small defects, incomplete detection as attentiveness declines, and the tiredness and weariness caused by the repetitive nature of fault detection and diagnosis tasks. To construct an effective automatic diagnosis system, we propose an approach based on four main operations: defect detection, data extraction, dimensionality reduction, and neural classification. The first operation is based on images obtained by Nomarski microscopy. These images contain several items that have to be detected and then classified in order to discriminate between "false" (correctable) defects and "abiding" (permanent) ones. Indeed, because of the industrial environment, a number of correctable defects (such as dust or cleaning marks) are usually present alongside the potential abiding defects. Extracting relevant features is a key issue for the accuracy of the neural classification system, first because raw data (images) cannot be exploited directly, and second because dealing with high-dimensional data can degrade the learning performance of a neural network. This article presents the automatic diagnosis system, describing the operations of the different phases. An implementation on real industrial optical devices is carried out, and an experiment investigates item classification based on an MLP artificial neural network.
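A minimal skeleton of the four-operation chain (detection, feature extraction, dimensionality reduction, neural classification) might look as follows. The detection and extraction steps are placeholders, since the Nomarski image processing is specific to the authors' setup, and the synthetic descriptors are purely illustrative stand-ins.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

def detect_items(image):
    """Placeholder: segment candidate defect regions in a Nomarski image."""
    raise NotImplementedError

def extract_features(region):
    """Placeholder: compute a high-dimensional descriptor for one region."""
    raise NotImplementedError

# Stand-in descriptors: 500 detected items, 64 features each,
# label 0 = correctable ("false") defect, 1 = abiding (permanent) defect.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))
X[:, 0] *= 3.0                      # high-variance feature, so the toy
y = (X[:, 0] > 0).astype(int)       # reducer keeps the discriminative axis

reducer = PCA(n_components=8).fit(X)                # dimensionality reduction
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
clf.fit(reducer.transform(X), y)                    # neural classification
print("training accuracy:", clf.score(reducer.transform(X), y))
```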


1994
Vol 05 (04)
pp. 313-333
Author(s):  
MARK DOLSON

Multi-Layer Perceptron (MLP) neural networks have been used extensively for classification tasks. Typically, the MLP network is trained explicitly to produce the correct classification as its output. For speech recognition, however, several investigators have recently experimented with an indirect approach: a unique MLP predictive network is trained for each class of data, and classification is accomplished by determining which predictive network serves as the best model for samples of unknown speech. Results from this approach have been mixed. In this report, we compare the direct and indirect approaches to classification from a more fundamental perspective. We show how recent advances in nonlinear dimensionality reduction can be incorporated into the indirect approach, and we show how the two approaches can be integrated in a novel MLP framework. We further show how these new MLP networks can be usefully viewed as generalizations of Learning Vector Quantization (LVQ) and of subspace methods of pattern recognition. Lastly, we show that applying these ideas to the classification of temporal trajectories can substantially improve performance on simple tasks.
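The indirect approach lends itself to a compact sketch: one MLP regressor per class is trained to predict the next sample of a trajectory, and an unknown trajectory is assigned to the class whose predictor models it best (lowest prediction error). The noisy sinusoids below are an assumption standing in for speech trajectories.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def trajectory(freq, n=300):
    return np.sin(freq * np.arange(n)) + 0.02 * rng.normal(size=n)

def make_xy(x):
    # Predict x[t] from (x[t-2], x[t-1]); a sinusoid obeys a linear
    # two-term recurrence, so each class predictor can model it well.
    return np.column_stack([x[:-2], x[1:-1]]), x[2:]

class_freqs = {0: 0.11, 1: 0.23}          # class -> characteristic frequency
predictors = {}
for label, freq in class_freqs.items():
    X, y = make_xy(trajectory(freq))
    predictors[label] = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000,
                                     random_state=0).fit(X, y)

test = trajectory(0.23)                   # unknown trajectory, truly class 1
X, y = make_xy(test)
errors = {c: np.mean((m.predict(X) - y) ** 2) for c, m in predictors.items()}
print("predicted class:", min(errors, key=errors.get))
```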


2021
Vol 12
Author(s):
Jianping Zhao
Na Wang
Haiyun Wang
Chunhou Zheng
Yansen Su

Dimensionality reduction of high-dimensional data is crucial for single-cell RNA sequencing (scRNA-seq) visualization and clustering. One prominent challenge in scRNA-seq studies comes from dropout events, which lead to zero-inflated data. To address this issue, we propose a scRNA-seq data dimensionality reduction algorithm based on a hierarchical autoencoder, termed SCDRHA. The proposed SCDRHA consists of two core modules: the first module is a deep count autoencoder (DCA) that is used to denoise the data, and the second module is a graph autoencoder that projects the data into a low-dimensional space. Experimental results demonstrate that SCDRHA performs better than existing state-of-the-art algorithms at dimensionality reduction and noise reduction on five real scRNA-seq datasets. Moreover, SCDRHA can also dramatically improve the performance of data visualization and cell clustering.
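A greatly simplified sketch of the two-module idea, written with PyTorch as an assumption (the authors' actual DCA and graph-autoencoder implementations are not reproduced here): a denoising autoencoder first reconstructs the zero-inflated matrix, then a second autoencoder embeds the denoised data while a kNN-graph smoothness penalty keeps neighboring cells close in the low-dimensional space.

```python
import torch
from sklearn.neighbors import kneighbors_graph

torch.manual_seed(0)

def autoencoder(d_in, d_hid):
    return torch.nn.Linear(d_in, d_hid), torch.nn.Linear(d_hid, d_in)

# Stand-in scRNA-seq matrix (300 cells x 2000 genes) with simulated dropout.
X = torch.rand(300, 2000)
X = X * (torch.rand_like(X) > 0.7)

# Module 1: denoising autoencoder (corrupt the input, reconstruct X).
enc1, dec1 = autoencoder(2000, 64)
opt = torch.optim.Adam([*enc1.parameters(), *dec1.parameters()], lr=1e-3)
for _ in range(200):
    noisy = X * (torch.rand_like(X) > 0.1)          # extra random corruption
    loss = ((dec1(torch.relu(enc1(noisy))) - X) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
X_denoised = dec1(torch.relu(enc1(X))).detach()

# Module 2: embed the denoised data; a kNN-graph smoothness penalty keeps
# neighboring cells close in the low-dimensional space.
A = torch.tensor(kneighbors_graph(X_denoised.numpy(), 10).toarray(),
                 dtype=torch.float32)
enc2, dec2 = autoencoder(2000, 16)
opt = torch.optim.Adam([*enc2.parameters(), *dec2.parameters()], lr=1e-3)
for _ in range(200):
    Z = torch.relu(enc2(X_denoised))
    recon = ((dec2(Z) - X_denoised) ** 2).mean()
    smooth = (A * torch.cdist(Z, Z) ** 2).sum() / A.sum()
    opt.zero_grad()
    (recon + 0.1 * smooth).backward()
    opt.step()
Z = torch.relu(enc2(X_denoised)).detach()           # final embedding
```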


Author(s):  
Xiaofeng Zhu
Cong Lei
Hao Yu
Yonggang Li
Jiangzhang Gan
...  

In this paper, we propose conducting Robust Graph Dimensionality Reduction (RGDR) by learning a transformation matrix that maps original high-dimensional data into their low-dimensional intrinsic space without the influence of outliers. To do this, we propose to simultaneously 1) adaptively learn three variables, i.e., a reverse graph embedding of the original data, a transformation matrix, and a graph matrix preserving the local similarity of the original data in their low-dimensional intrinsic space; and 2) employ robust estimators to keep outliers from influencing the optimization of these three variables. As a result, the original data are cleaned by two strategies, i.e., a prediction of the original data based on the three resulting variables, and robust estimators, so that the transformation matrix can be learnt from an accurately estimated intrinsic space with the help of the reverse graph embedding and the graph matrix. Moreover, we propose a new optimization algorithm for the resulting objective function and theoretically prove its convergence. Experimental results indicate that our proposed method outperforms all the comparison methods on different classification tasks.
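The robust-estimation ingredient can be illustrated in isolation. The toy sketch below estimates a transformation matrix W by iteratively reweighted least squares with Huber-style weights, so that outlier samples lose influence. Taking the intrinsic coordinates Z from a plain PCA on clean data is a simplifying assumption; the paper learns Z, W, and the graph matrix jointly.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_clean = rng.normal(size=(200, 50))
Z = PCA(n_components=5).fit_transform(X_clean)    # stand-in intrinsic space

X = X_clean.copy()
X[:10] += 12 * rng.normal(size=(10, 50))          # observed data with outliers

w = np.ones(len(X))                               # per-sample robust weights
for _ in range(20):
    # Weighted least squares: W = argmin_W sum_i w_i ||W^T x_i - z_i||^2,
    # solved via the normal equations (X^T diag(w) X) W = X^T diag(w) Z.
    Xw = X * w[:, None]
    W = np.linalg.solve(Xw.T @ X + 1e-3 * np.eye(50), Xw.T @ Z)
    r = np.linalg.norm(X @ W - Z, axis=1)         # per-sample residuals
    c = 1.345 * np.median(r)                      # Huber tuning constant
    w = np.where(r <= c, 1.0, c / r)              # downweight large residuals

print(np.round(w[:10], 2))          # the injected outliers end up downweighted
```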


2011
Vol 58-60
pp. 547-550
Author(s):
Di Wu
Zhao Zheng

In the real world, high-dimensional data are everywhere, but the natural structure behind them can often be characterized by only a few parameters. With the rapid development of computer vision, more and more problems involve data dimensionality reduction, which has driven the rapid development of dimensionality reduction algorithms: linear methods such as LPP [1] and NPE [2], and nonlinear methods such as LLE [3] and its improved version, kernel NPE. One particularly simple but effective assumption in face recognition is that samples from the same class lie on a linear subspace, which is why many nonlinear methods perform well only on artificial data sets. This paper focuses on NPE and the recently proposed SPP [4], and combines the two methods; the experiments show that the new method outperforms some classic unsupervised methods.
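For reference, a minimal numpy sketch of NPE, the linear counterpart of LLE mentioned above: LLE-style reconstruction weights are computed on a kNN graph, and a linear projection preserving them is obtained from a generalized eigenproblem. The paper's combination with SPP [4] would replace the kNN reconstruction step with sparse-representation weights; that part is not reproduced here.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import NearestNeighbors

def npe(X, n_neighbors=8, n_components=2, reg=1e-3):
    n, d = X.shape
    nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    idx = nn.kneighbors(X, return_distance=False)[:, 1:]    # drop self-match
    W = np.zeros((n, n))
    for i in range(n):
        G = X[idx[i]] - X[i]                     # neighbors centered on x_i
        C = G @ G.T + reg * np.eye(n_neighbors)  # regularized local Gram matrix
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, idx[i]] = w / w.sum()               # reconstruction weights sum to 1
    IW = np.eye(n) - W
    # Generalized eigenproblem X^T (I-W)^T (I-W) X a = lam X^T X a;
    # projection directions are the eigenvectors with smallest eigenvalues.
    A = X.T @ IW.T @ IW @ X
    B = X.T @ X + reg * np.eye(d)
    _, vecs = eigh(A, B)
    return vecs[:, :n_components]                # d x k projection matrix

X = np.random.default_rng(0).normal(size=(150, 20))
Z = X @ npe(X)                                   # low-dimensional embedding
```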


2018
Vol 10 (10)
pp. 1564
Author(s):
Patrick Bradley
Sina Keller
Martin Weinmann

In this paper, we investigate the potential of unsupervised feature selection techniques for classification tasks where only sparse training data are available. This is motivated by the fact that unsupervised feature selection techniques combine the advantages of standard dimensionality reduction techniques (which rely only on the given feature vectors and not on the corresponding labels) and supervised feature selection techniques (which retain a subset of the original set of features). Thus, feature selection becomes independent of the given classification task and, consequently, a subset of generally versatile features is retained. We present different techniques relying on the topology of the given sparse training data, where the topology is described with an ultrametricity index. For the latter, we take into account the Murtagh Ultrametricity Index (MUI), which is defined on the basis of triangles within the given data, and the Topological Ultrametricity Index (TUI), which is defined on the basis of a specific graph structure. In a case study addressing the classification of high-dimensional hyperspectral data based on sparse training data, we demonstrate the performance of the proposed unsupervised feature selection techniques in comparison to standard dimensionality reduction and supervised feature selection techniques on four commonly used benchmark datasets. The achieved classification results reveal that unsupervised feature selection techniques yield classification results similar to those of supervised feature selection techniques, while performing feature selection independently of the given classification task and thus delivering generally versatile features.
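The triangle-based index is easy to approximate: sample random triangles and count the fraction that are (nearly) ultrametric, i.e., isosceles with the two largest sides approximately equal. The tolerance and sampling scheme below are assumptions; the exact MUI definition used in the paper may differ in detail.

```python
import numpy as np

def ultrametricity_index(X, n_triangles=20000, tol=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    count = 0
    for _ in range(n_triangles):
        i, j, k = rng.choice(n, size=3, replace=False)
        d = np.sort([np.linalg.norm(X[i] - X[j]),
                     np.linalg.norm(X[j] - X[k]),
                     np.linalg.norm(X[i] - X[k])])
        if d[2] - d[1] <= tol * d[2]:    # two largest sides nearly equal
            count += 1
    return count / n_triangles           # fraction of ultrametric triangles

X = np.random.default_rng(1).normal(size=(200, 50))
print(ultrametricity_index(X))   # ultrametricity tends to grow with dimension
```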


2020
Author(s):
Oxana Ye. Rodionova
Sergey Kucheryavskiy
Alexey L. Pomerantsev

Basic tools for exploration and interpretation of Principal Component Analysis (PCA) results are well-known and thoroughly described in many comprehensive tutorials. However, in the recent decade, several new tools have been developed. Some of them were originally created for solving authentication and classification tasks. In this paper we demonstrate that they can also be useful for exploratory data analysis.

We discuss several important aspects of the PCA exploration of high-dimensional datasets, such as estimation of the proper complexity of a PCA model, dependence on the data structure, presence of outliers, etc. We introduce new tools for the assessment of PCA model complexity, such as plots of the degrees of freedom developed for the orthogonal and score distances, as well as the Extreme and Distance plots, which present a new look at the features of the training and test (new) data. These tools are simple and fast to compute. In some cases, they are more efficient than the conventional PCA tools. A simulated example provides an intuitive illustration of their application. Three real-world examples originating from various fields are employed to demonstrate the capabilities of the new tools and ways they can be used. The first example considers the reproducibility of a handheld spectrometer using a dataset that is presented for the first time. The other two datasets, which describe the authentication of olives in brine and the classification of wines by their geographical origin, are already known and are often used for illustrative purposes.

The paper does not touch upon well-known topics, such as algorithms for the PCA decomposition or the interpretation of scores and loadings. Instead, we pay attention primarily to more advanced topics, such as the exploration of data homogeneity and the understanding and evaluation of an optimal model complexity. The examples are accompanied by links to free software that implements the tools.
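The two distances underlying several of these tools are straightforward to compute. The sketch below derives the score distance (a Mahalanobis-like distance within the model plane) and the orthogonal distance (the residual norm off the plane), using scikit-learn's PCA and random data as stand-ins for the authors' free software and datasets.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 30))              # stand-in training data

pca = PCA(n_components=4).fit(X)
T = pca.transform(X)                        # scores (in-plane coordinates)
residual = X - pca.inverse_transform(T)     # off-model part of each sample

sd = np.sum(T ** 2 / T.var(axis=0, ddof=1), axis=1)   # score distance
od = np.sum(residual ** 2, axis=1)                    # orthogonal distance
# Large SD flags extreme but in-plane samples; large OD flags samples
# the model fails to describe. Plotting od against sd gives a basic
# version of the distance-type diagnostics discussed above.
```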

