dimensionality reduction method
Recently Published Documents

Total documents: 167 (five years: 61)
H-index: 14 (five years: 3)

AI ◽  
2021 ◽  
Vol 3 (1) ◽  
pp. 1-22
Author(s):  
Jean-Sébastien Dessureault ◽  
Daniel Massicotte

This paper examines the critical decision process of reducing the dimensionality of a dataset before applying a clustering algorithm. Choosing between feature extraction and feature selection is always a challenge. Evaluating the importance of features is not straightforward, since the most popular methods for doing so are usually intended for supervised learning. This paper proposes a novel method called “Decision Process for Dimensionality Reduction before Clustering” (DPDRC). It chooses the best dimensionality reduction method (selection or extraction) according to the data scientist’s parameters and the profile of the data, with the aim of applying a clustering process at the end. It uses a Feature Ranking Process Based on Silhouette Decomposition (FRSD) algorithm, a Principal Component Analysis (PCA) algorithm, and a K-means algorithm along with its metric, the Silhouette Index (SI). This paper presents five scenarios based on different parameters. It also discusses the impacts, advantages, and disadvantages of each choice that can be made in this unsupervised learning process.
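The choice between selection and extraction described above can be illustrated with a small NumPy sketch. It is not the DPDRC or FRSD algorithm from the paper: the per-feature ranking below is a naive stand-in (each feature is scored by the Silhouette Index of a K-means clustering on that feature alone), and the function names, the farthest-point initialisation, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def kmeans_labels(x, k, n_iter=20):
    """Plain K-means with a deterministic farthest-point initialisation."""
    centers = [x[0]]
    for _ in range(k - 1):
        dist = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[dist.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels

def silhouette_index(x, labels):
    """Mean silhouette coefficient over all samples (Euclidean distance)."""
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(x))
    for i in range(len(x)):
        own = labels == labels[i]
        if own.sum() < 2 or len(clusters) < 2:
            continue  # singleton clusters score 0 by convention
        a = dist[i, own].sum() / (own.sum() - 1)  # mean intra-cluster distance
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())

def pca_project(x, n_components):
    """Project centered data onto its leading principal components."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def choose_reduction(x, k, m):
    """Compare keeping m ranked features with extracting m principal components."""
    # Stand-in ranking (NOT the paper's FRSD): SI of clustering on each feature alone.
    per_feature = [silhouette_index(x[:, [j]], kmeans_labels(x[:, [j]], k))
                   for j in range(x.shape[1])]
    selected = np.argsort(per_feature)[::-1][:m]
    x_sel, x_ext = x[:, selected], pca_project(x, m)
    si_sel = silhouette_index(x_sel, kmeans_labels(x_sel, k))
    si_ext = silhouette_index(x_ext, kmeans_labels(x_ext, k))
    return ("selection" if si_sel >= si_ext else "extraction"), si_sel, si_ext

def make_demo_data(seed=0):
    """Two well-separated blobs in 2 informative features plus 3 noise features."""
    rng = np.random.default_rng(seed)
    informative = np.vstack([rng.normal(0, 1, (60, 2)), rng.normal(10, 1, (60, 2))])
    return np.hstack([informative, rng.normal(0, 1, (120, 3))])
```

Calling `choose_reduction(make_demo_data(), k=2, m=2)` returns the preferred strategy together with both Silhouette Index values, so the trade-off can be inspected rather than taken on trust.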


Author(s):  
Zhengtao Guo ◽  
Wuli Chu

It is essential for engineering manufacture and robust design to evaluate the impact of manufacturing variability on the aerodynamics of compressor blades efficiently and accurately. In this paper, a novel quadratic curve approximation method based on the scanning points of blade design profiles was introduced and combined with the Karhunen–Loève expansion to obtain a mathematical dimensionality reduction method that models manufacturing variability as a truncated normal process. Subsequently, Sparse Approximation of Moment-based Arbitrary Polynomial Chaos (SAMBA PC) and computational fluid dynamics (CFD) were applied to build a computational framework for stochastic aerodynamic analysis considering manufacturing variability. Finally, the framework was adopted to evaluate the aerodynamic variations of a high subsonic compressor cascade under the design incidence. The results illustrate that the SAMBA PC method is more efficient than traditional methods such as Monte Carlo simulation (MCS) for stochastic aerodynamic analysis. Uncertainty quantification shows that the impact of manufacturing variability on the global aerodynamic performance is primarily reflected in the fluctuation of aerodynamic losses, and that the fluctuation of the total losses is mainly contributed by the fluctuation of the separation loss after the suction peak (a negative pressure spike near the leading edge (LE)) and the boundary-layer loss on the suction surface (SS). Sensitivity analysis reveals the geometric modes that matter most to the aerodynamics, which provides a useful reference for the manufacturing inspection process and helps reduce computational cost in robust design.
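As a rough illustration of how a Karhunen–Loève expansion reduces a random geometric deviation to a few modes, the sketch below discretizes a covariance kernel, keeps the leading eigenmodes, and draws one sample. The squared-exponential kernel, the 99% energy threshold, and the reading of "truncated normal" as coefficients bounded at ±3 standard deviations are all assumptions; the abstract does not specify them, and this is not the authors' blade model.

```python
import numpy as np

def kl_modes(points, corr_length, sigma, energy=0.99):
    """Leading KL eigenpairs of an assumed squared-exponential covariance."""
    diff = points[:, None] - points[None, :]
    cov = sigma**2 * np.exp(-0.5 * (diff / corr_length) ** 2)
    vals, vecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]       # reorder to descending
    vals = np.clip(vals, 0.0, None)              # remove tiny negative round-off
    cumulative = np.cumsum(vals) / vals.sum()
    n_modes = int(np.searchsorted(cumulative, energy) + 1)
    return vals[:n_modes], vecs[:, :n_modes]

def sample_deviation(vals, vecs, rng, bound=3.0):
    """One realisation using normal coefficients truncated to [-bound, bound]."""
    xi = rng.standard_normal(len(vals))
    while np.any(np.abs(xi) > bound):            # simple rejection sampling
        bad = np.abs(xi) > bound
        xi[bad] = rng.standard_normal(bad.sum())
    return vecs @ (np.sqrt(vals) * xi)
```

With, say, 50 points along a profile and a correlation length of 0.2, a handful of modes typically carries the requested share of the variance, and those few coefficients become the random inputs to a surrogate such as a polynomial chaos expansion.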


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Heba M. Ezzat

Purpose: Since the beginning of 2020, economies have faced many changes as a result of the coronavirus disease 2019 (COVID-19) pandemic. This research investigates the effect of COVID-19 on the Egyptian Exchange (EGX). Design/methodology/approach: To explore the impact of COVID-19, three periods were considered: (1) 17 months before the spread of COVID-19 and the start of the lockdown, (2) 17 months after the spread of COVID-19 and during the lockdown and (3) 34 months covering the whole period (before and during COVID-19). Due to the large number of variables that could be considered, a dimensionality reduction method, principal component analysis (PCA), is used. This method helps in determining the individual stocks that contribute most to the main EGX index (EGX 30). PCA also addresses the multicollinearity between the variables under investigation. Additionally, a principal component regression (PCR) model is developed to predict the future behavior of the EGX 30. Findings: The results demonstrate that the first three principal components (PCs) explain 89%, 85%, and 88% of data variability (1) before COVID-19, (2) during COVID-19 and (3) over the whole period, respectively. Furthermore, the food and beverage, basic resources and real estate sectors have not been affected by COVID-19. The resulting PCR model performs very well, as shown by comparing the observed values of EGX 30 with the predicted ones (R-squared estimated as 0.99). Originality/value: To the best of our knowledge, no research has been conducted to investigate the effect of COVID-19 on the EGX using an unsupervised machine learning method.
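The PCA and PCR steps in this abstract follow a standard recipe that can be sketched in NumPy: compute the explained-variance ratio of each principal component, then regress the index on the first few component scores. The synthetic "stocks" and "index" below are made up for illustration and have nothing to do with EGX data; whether the authors standardized their inputs is not stated, so this sketch only centers them.

```python
import numpy as np

def pca_fit(x):
    """Return the column means, principal axes, and explained-variance ratios."""
    mean = x.mean(axis=0)
    _, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt, s**2 / np.sum(s**2)

def pcr_fit(x, y, n_components):
    """Ordinary least squares of y on the first n_components PC scores."""
    mean, vt, ratio = pca_fit(x)
    components = vt[:n_components]
    design = np.column_stack([np.ones(len(x)), (x - mean) @ components.T])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return mean, components, coef, ratio

def pcr_predict(model, x):
    mean, components, coef, _ = model
    return coef[0] + ((x - mean) @ components.T) @ coef[1:]

def r_squared(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def make_market_data(n_days=200, n_stocks=10, seed=0):
    """Hypothetical data: 3 latent factors drive 10 strongly collinear series."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_days, 3))
    loadings = rng.normal(1.0, 0.5, (3, n_stocks))
    stocks = factors @ loadings + 0.1 * rng.standard_normal((n_days, n_stocks))
    return stocks, stocks.mean(axis=1)  # index taken as the plain average
```

Because the component scores are mutually uncorrelated, regressing on them sidesteps the multicollinearity among the original series. Note that `r_squared` here is computed in-sample; a forecasting claim would need a held-out period.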


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Yongbin Liu ◽  
Jingjie Wang ◽  
Wei Bai

Dimensionality reduction of images with high-dimensional nonlinear structure is the key to improving the recognition rate. Although some traditional algorithms have achieved results in dimensionality reduction, each also has its own defects. To achieve good recognition of high-dimensional nonlinear images, an image recognition technology based on the nonlinear dimensionality reduction method is proposed, building on an analysis of traditional dimensionality reduction algorithms and retaining their advantages. As an effective nonlinear feature extraction method, the nonlinear dimensionality reduction method can find the nonlinear structure of datasets and maintain the intrinsic structure of data. Applying it to image recognition means dividing the input image into blocks, treating the blocks as a dataset in high-dimensional space, reducing the dimension of its structure, and obtaining the low-dimensional expression vector of its eigenstructure so that image recognition can be carried out in a lower dimension. Thus, the computational complexity can be reduced, the recognition accuracy can be improved, and further processing such as image recognition and search becomes more convenient. The defects of traditional algorithms are addressed, and commodity price recognition and simulation experiments are carried out, which verify the feasibility of image recognition technology based on the nonlinear dimensionality reduction method in commodity price recognition.


2021 ◽  
Vol 11 (23) ◽  
pp. 11156
Author(s):  
Dieter Bender ◽  
Daniel J. Licht ◽  
C. Nataraj

This paper is concerned with the prediction of the occurrence of periventricular leukomalacia (PVL) in neonates after heart surgery. Our prior work shows that the Support Vector Machine (SVM) classifier can be a powerful tool in predicting clinical outcomes of such complicated and uncommon diseases, even when the number of data samples is low. In the presented work, we first illustrate and discuss the shortcomings of the traditional automatic machine learning (aML) approach. We then describe our methodology for addressing these shortcomings using the designed interactive ML (iML) algorithm. Finally, we conclude with a discussion of the developed method and the results obtained. In sum, by adding a (Genetic Algorithm) optimization step to the SVM learning framework, we were able to (a) reduce the dimensionality of an SVM model from 248 to 53 features, (b) increase generalization, as confirmed by a 100% accuracy assessed on an unseen testing set, and (c) improve the overall SVM model’s performance from 65% to 100% testing accuracy, utilizing the proposed iML method.


Author(s):  
Baiting Zhao ◽  
Xiao Dong ◽  
Yongcun Guo ◽  
Xiaofen Jia ◽  
Yourui Huang

2021 ◽  
Vol 15 ◽  
Author(s):  
Jiasong Wu ◽  
Xiang Qiu ◽  
Jing Zhang ◽  
Fuzhi Wu ◽  
Youyong Kong ◽  
...  

Generative adversarial networks (GANs) and variational autoencoders (VAEs) provide impressive image generation from Gaussian white noise, but both are difficult to train, since they need a generator (or encoder) and a discriminator (or decoder) to be trained simultaneously, which can easily lead to unstable training. To solve or alleviate these synchronous training problems of GANs and VAEs, researchers recently proposed generative scattering networks (GSNs), which use wavelet scattering networks (ScatNets) as the encoder to obtain features (or ScatNet embeddings) and convolutional neural networks (CNNs) as the decoder to generate an image. The advantage of GSNs is that the parameters of ScatNets do not need to be learned, while the disadvantage of GSNs is that their ability to obtain representations of ScatNets is slightly weaker than that of CNNs. In addition, the dimensionality reduction method of principal component analysis (PCA) can easily lead to overfitting in the training of GSNs and, therefore, affect the quality of generated images in the testing process. To further improve the quality of generated images while keeping the advantages of GSNs, this study proposes generative fractional scattering networks (GFRSNs), which use more expressive fractional wavelet scattering networks (FrScatNets) instead of ScatNets as the encoder to obtain features (or FrScatNet embeddings) and use CNNs similar to those of GSNs as the decoder to generate an image. Additionally, this study develops a new dimensionality reduction method named feature-map fusion (FMF) instead of performing PCA to better retain the information of FrScatNets; it also discusses the effect of image fusion on the quality of the generated image. The experimental results obtained on the CIFAR-10 and CelebA datasets show that the proposed GFRSNs can lead to better generated images than the original GSNs on testing datasets. The experimental results of the proposed GFRSNs with deep convolutional GAN (DCGAN), progressive GAN (PGAN), and CycleGAN are also given.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Zixiang Luo ◽  
Chenyu Xu ◽  
Zhen Zhang ◽  
Wenfei Jin

Dimensionality reduction is crucial for the visualization and interpretation of high-dimensional single-cell RNA sequencing (scRNA-seq) data. However, preserving the topological structure among cells in low-dimensional space remains a challenge. Here, we present the single-cell graph autoencoder (scGAE), a dimensionality reduction method that preserves topological structure in scRNA-seq data. scGAE builds a cell graph and uses a multitask-oriented graph autoencoder to preserve topological structure information and feature information in scRNA-seq data simultaneously. We further extended scGAE for scRNA-seq data visualization, clustering, and trajectory inference. Analyses of simulated data showed that scGAE accurately reconstructs developmental trajectory and separates discrete cell clusters under different scenarios, outperforming recently developed deep learning methods. Furthermore, implementation of scGAE on empirical data showed that scGAE provided novel insights into cell developmental lineages and preserved inter-cluster distances.


2021 ◽  
Vol 12 ◽  
Author(s):  
Bianca Bianco ◽  
Flavia Altheman Loureiro ◽  
Camila Martins Trevisan ◽  
Carla Peluso ◽  
Denise Maria Christofolini ◽  
...  

Background: Single nucleotide variants (SNVs) FSHB:c.-211G>T, FSHR:c.919G>A, and FSHR:c.2039G>A were reported to be associated with the variability in FSH and LH levels, and in vitro fertilization (IVF) outcomes. In this study, we aimed to evaluate the effects of FSHB:c.-211G>T, FSHR:c.919G>A, and FSHR:c.2039G>A variants, alone and combined, on the hormonal profile and reproduction outcomes of women with endometriosis. Methods: A cross-sectional study was performed comprising 213 infertile Brazilian women with endometriosis who underwent IVF treatment. Genotyping was performed using TaqMan real-time PCR. Variables were compared according to the genotypes of each variant and genetic models, and the combined effects of the SNVs were evaluated using the multifactorial dimensionality reduction method. Results: FSHB:c.-211G>T affected LH levels in women with overall endometriosis and minimal/mild disease. FSHR:c.919G>A affected FSH levels in women with overall endometriosis and the number of oocytes retrieved in those with moderate/severe endometriosis. Moreover, the FSHR:c.2039G>A affected FSH levels in women with overall endometriosis, LH levels and total amount of rFSH in those with minimal/mild disease, and number of follicles and number of oocytes retrieved in those with moderate/severe endometriosis. No effect on hormone profile or reproductive outcomes was observed when the genotypes were combined. Conclusions: Variants of the FSHB and FSHR genes separately interfered with the hormonal profiles and IVF outcomes of women with endometriosis.


SPIN ◽  
2021 ◽  
pp. 2140002
Author(s):  
Yunkai Wang ◽  
Shengjun Wu

For quantum search via the continuous-time quantum walk, the evolution of the whole system is usually confined to a small subspace. In this paper, we discuss how the symmetries of the graphs are related to the existence of such an invariant subspace, which also suggests a dimensionality reduction method based on group representation theory. We observe that in the one-dimensional subspace spanned by each desired basis state, which assembles the identically evolving original basis states, we always get a trivial representation of the symmetry group. So, we can find the desired basis by exploiting the projection operator of the trivial representation. Besides offering technical guidance for this type of problem, this discussion also suggests that all the symmetries are used up in the invariant subspace and that the asymmetric part of the Hamiltonian is very important for the purpose of quantum search.
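The invariant-subspace idea can be checked numerically on the simplest textbook case, search on the complete graph, which is chosen here for illustration and is not necessarily one of the graphs treated in the paper. The assumed Hamiltonian is H = -γA - |w⟩⟨w| with γ = 1/N. Every permutation of the unmarked vertices is a symmetry, and averaging over them (the projector onto the trivial representation) sends each unmarked basis state to the uniform superposition of unmarked vertices, so the dynamics live in a two-dimensional subspace.

```python
import numpy as np

def search_hamiltonians(n, marked=0):
    """Full N x N search Hamiltonian and its 2 x 2 reduction (complete graph)."""
    gamma = 1.0 / n                               # assumed hopping rate
    adjacency = np.ones((n, n)) - np.eye(n)
    h_full = -gamma * adjacency
    h_full[marked, marked] -= 1.0                 # oracle term -|w><w|
    w = np.zeros(n)
    w[marked] = 1.0
    # Uniform superposition over unmarked vertices: the image of any unmarked
    # basis state under the trivial-representation projector, normalised.
    r = (np.ones(n) - w) / np.sqrt(n - 1)
    basis = np.stack([w, r], axis=1)              # orthonormal columns
    return h_full, basis.T @ h_full @ basis, basis

def evolve(h, psi0, t):
    """Apply exp(-i h t) to psi0 via the eigendecomposition of Hermitian h."""
    vals, vecs = np.linalg.eigh(h)
    return vecs @ (np.exp(-1j * vals * t) * (vecs.conj().T @ psi0))

def success_probabilities(n, t, marked=0):
    """Probability of finding the marked vertex, full space vs reduced space."""
    h_full, h_reduced, basis = search_hamiltonians(n, marked)
    psi0 = np.ones(n) / np.sqrt(n)                # uniform initial state
    p_full = abs(evolve(h_full, psi0, t)[marked]) ** 2
    p_reduced = abs(evolve(h_reduced, basis.T @ psi0, t)[0]) ** 2
    return float(p_full), float(p_reduced)
```

Evaluating `success_probabilities(n, np.pi * np.sqrt(n) / 2)` gives matching values from the N-dimensional and 2-dimensional evolutions, which is the practical payoff of the reduction: the cost of simulating the walk no longer grows with the number of vertices.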

