effective dimensionality
Recently Published Documents


TOTAL DOCUMENTS: 82 (FIVE YEARS: 23)
H-INDEX: 14 (FIVE YEARS: 3)

2021
Author(s): Arpita Joshi, Nurit Haspel, Eduardo Gonzalez

Datasets representing the conformational landscapes of protein structures are high dimensional and hence present computational challenges. Efficient and effective dimensionality reduction of these datasets is therefore paramount to our ability to analyze the conformational landscapes of proteins and extract important information regarding protein folding, conformational changes and binding. Representing the structures with fewer attributes that capture most of the variance in the data makes for quicker and more precise analysis of these structures. In this work we use dimensionality reduction methods both to reduce the number of instances and to reduce the number of features. The reduced dataset is then subjected to topological and quantitative analysis. In this step we perform hierarchical clustering to obtain sets of conformation clusters that may correspond to intermediate structures. The structures represented by these conformations are then analyzed by studying their high-dimensional topological properties to identify truly distinct conformations and holes in the conformational space that may represent high energy barriers. Our results show that the clusters closely follow known experimental results about intermediate structures, as well as binding and folding events.
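As a rough illustration of the kind of pipeline this abstract describes, the sketch below reduces a toy conformation matrix with PCA and then clusters the reduced coordinates hierarchically. The data, the variance threshold, and the cluster count are illustrative placeholders, not the authors' actual methods or parameters.

```python
# Minimal sketch: PCA-style reduction of a toy conformation matrix, followed
# by hierarchical clustering of the reduced coordinates. All sizes and
# thresholds are hypothetical, not taken from the paper.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 10))                  # hidden low-dimensional structure
X = latent @ rng.normal(size=(10, 300)) + 0.1 * rng.normal(size=(500, 300))

# PCA via SVD of the centered data; keep components explaining ~90% of variance
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
var_ratio = (s ** 2) / np.sum(s ** 2)
k = int(np.searchsorted(np.cumsum(var_ratio), 0.90)) + 1
Z_low = Xc @ Vt[:k].T                                # reduced representation

# Hierarchical clustering of the reduced conformations
link = linkage(Z_low, method="ward")
labels = fcluster(link, t=5, criterion="maxclust")   # e.g. 5 candidate clusters
print(k, np.bincount(labels)[1:])
```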


2021
Author(s): Grant E. Haines, Louis Moisan, Alison M. Derry, Andrew P. Hendry

In nature, populations are subjected to a wide variety of environmental conditions that affect fitness and induce adaptive or plastic responses in traits, resulting in phenotypic divergence between populations. The dimensionality of that divergence, however, remains contentious. At one extreme, some contend that populations diverge along a single axis of trait covariance with the greatest availability of heritable variation, even if this does not lead a population directly to its fitness optimum. Those at the other extreme argue that selection can push populations towards their fitness optima along multiple phenotypic axes simultaneously, resulting in divergence in numerous dimensions. Here, we address this debate using populations of threespine stickleback (Gasterosteus aculeatus) in the Cook Inlet region of southern Alaska, drawn from lakes with contrasting ecological conditions. We calculated the effective dimensionality of divergence in several trait suites (defensive, swimming, and trophic) thought to be under correlated selection pressures, as well as across all traits. We also tested for integration among the trait suites and between each trait suite and the environment. We found that populations in the Cook Inlet radiation exhibit phenotypic dimensionality high enough to preclude a single axis of divergence.
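For context, a common way to estimate the effective dimensionality of a trait covariance matrix is the eigenvalue ratio n_eff = (sum of lambda)^2 / sum(lambda^2). The toy sketch below contrasts uncorrelated traits with traits collapsed onto a single shared axis; the authors' estimator and data may differ.

```python
# Toy illustration of effective dimensionality from a trait covariance matrix.
# The simulated trait matrices are hypothetical, not the stickleback data.
import numpy as np

def n_eff(cov: np.ndarray) -> float:
    lam = np.clip(np.linalg.eigvalsh(cov), 0, None)
    return float(lam.sum() ** 2 / np.sum(lam ** 2))

rng = np.random.default_rng(1)
traits_indep = rng.normal(size=(200, 12))                        # 12 uncorrelated traits
shared = rng.normal(size=(200, 1))
traits_line = shared @ np.ones((1, 12)) + 0.1 * rng.normal(size=(200, 12))

print(round(n_eff(np.cov(traits_indep, rowvar=False)), 1))       # near 12: many axes
print(round(n_eff(np.cov(traits_line, rowvar=False)), 1))        # near 1: single axis
```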


Sensors, 2021, Vol. 21 (12), pp. 4207
Author(s): Luca Rosafalco, Andrea Manzoni, Stefano Mariani, Alberto Corigliano

In civil engineering, different machine learning algorithms have been adopted to process the huge amount of data continuously acquired through sensor networks and to solve inverse problems. Challenging issues in structural health monitoring and load identification are currently related to big data, consisting of structural vibration recordings shaped as multivariate time series. Any algorithm should therefore allow an effective dimensionality reduction, retaining the informative content of the data and inferring correlations within and across the time series. Within this framework, we propose a time series AutoEncoder (AE) employing inception modules and residual learning for the encoding and decoding parts, and an extremely reduced latent representation specifically tailored to load identification tasks. We discuss the choice of the dimensionality of this latent representation, considering the sources of variability in the recordings and the inverse-forward nature of the AE. To help set this dimensionality, the false nearest neighbor heuristic is also exploited. The reported numerical results, related to shear buildings excited by dynamic loadings, highlight the signal reconstruction capacity of the proposed AE and its capability to accomplish the load identification task.
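A minimal sketch of a time series autoencoder with a deliberately small latent space is given below. It uses plain 1D convolutions rather than the inception modules and residual learning described in the abstract, and all channel counts, kernel sizes, and the latent dimension are illustrative assumptions.

```python
# Minimal 1D-convolutional autoencoder with a very small latent space, as a
# stand-in for the architecture described above. Sizes are illustrative only.
import torch
import torch.nn as nn

class TinyTimeSeriesAE(nn.Module):
    def __init__(self, n_channels: int, seq_len: int, latent_dim: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * (seq_len // 4), latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * (seq_len // 4)),
            nn.Unflatten(1, (32, seq_len // 4)),
            nn.ConvTranspose1d(32, 16, kernel_size=8, stride=2, padding=3),
            nn.ReLU(),
            nn.ConvTranspose1d(16, n_channels, kernel_size=8, stride=2, padding=3),
        )

    def forward(self, x):
        z = self.encoder(x)                # compressed latent code
        return self.decoder(z), z

x = torch.randn(8, 3, 256)                 # batch of 8 recordings, 3 sensors, 256 steps
model = TinyTimeSeriesAE(n_channels=3, seq_len=256)
x_hat, z = model(x)
print(x_hat.shape, z.shape)                # reconstruction and 4-dimensional latent code
```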


Molecules, 2021, Vol. 26 (7), pp. 2065
Author(s): Aditya Divyakant Shrivastava, Douglas B. Kell

The question of molecular similarity is core to cheminformatics and is usually assessed via a pairwise comparison based on vectors of properties or molecular fingerprints. We recently exploited variational autoencoders to embed 6M molecules in a chemical space, such that their (Euclidean) distance within the latent space so formed could be assessed within the framework of the entire molecular set. However, the standard objective function used did not seek to manipulate the latent space so as to cluster the molecules based on any perceived similarity. Using a set of some 160,000 molecules of biological relevance, we here bring together three modern elements of deep learning to create a novel and disentangled latent space, viz. transformers, contrastive learning, and an embedded autoencoder. The effective dimensionality of the latent space was varied such that clear separation of individual types of molecules could be observed within individual dimensions of the latent space. The capacity of the network was such that many dimensions were not populated at all. As before, we assessed the utility of the representation by comparing clozapine with its near neighbors, and did the same for various antibiotics related to flucloxacillin. Transformers, especially when coupled with contrastive learning as here, effectively provide one-shot learning and lead to a successful and disentangled representation of molecular latent spaces that uses the entire training set in its construction while allowing “similar” molecules to cluster together in an effective and interpretable way.
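The contrastive component of such a pipeline is often an NT-Xent (InfoNCE-style) loss that pulls paired embeddings of the same item together and pushes the rest apart. The sketch below shows a generic formulation of that loss, not the exact objective or architecture used in the paper.

```python
# Generic NT-Xent-style contrastive loss over paired embeddings; a sketch of
# the kind of objective used to cluster "similar" items in a latent space.
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1):
    """z1, z2: (batch, dim) embeddings of two views of each item."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)       # 2N x d, unit norm
    sim = z @ z.T / temperature                              # scaled cosine similarities
    n = z1.shape[0]
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # drop self-pairs
    # the positive for sample i is its counterpart in the other view
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
print(float(nt_xent(z1, z2)))
```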


2021, Vol. 13 (7), pp. 1363
Author(s): Guangyao Shi, Fulin Luo, Yiming Tang, Yuan Li

Graph learning is an effective dimensionality reduction (DR) approach for analyzing the intrinsic properties of high-dimensional data and has been widely used for DR of hyperspectral image (HSI) data, but existing graph-based methods ignore the collaborative relationship between sample pairs. In this paper, a novel supervised spectral DR method called local constrained manifold structure collaborative preserving embedding (LMSCPE) is proposed for HSI classification. First, a novel local constrained collaborative representation (CR) model is designed based on CR theory, which can obtain more effective collaborative coefficients to characterize the relationship between sample pairs. Then, an intraclass collaborative graph and an interclass collaborative graph are constructed to enhance the intraclass compactness and the interclass separability, and a local neighborhood graph is constructed to preserve the local neighborhood structure of the HSI. Finally, an optimal objective function is designed to obtain a discriminant projection matrix, from which the discriminative features of various land cover types can be obtained. LMSCPE can characterize the collaborative relationship between sample pairs and explore the intrinsic geometric structure of HSI. Experiments on three benchmark HSI data sets show that the proposed LMSCPE method is superior to state-of-the-art DR methods for HSI classification.
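The collaborative representation (CR) step underlying such graphs is typically a ridge-regularized reconstruction of each sample from the remaining samples, which has a closed-form solution. The sketch below shows that generic CR step only, with made-up data; the locality constraints and graph construction that LMSCPE adds on top are not reproduced.

```python
# Generic collaborative representation step: solve
# w = argmin ||x - D w||^2 + lam * ||w||^2 in closed form.
# D, x and lam are illustrative placeholders.
import numpy as np

def cr_coefficients(D: np.ndarray, x: np.ndarray, lam: float = 0.01) -> np.ndarray:
    """D: (n_features, n_atoms) dictionary of other samples; x: (n_features,)."""
    A = D.T @ D + lam * np.eye(D.shape[1])
    return np.linalg.solve(A, D.T @ x)       # collaborative coefficients

rng = np.random.default_rng(2)
D = rng.normal(size=(50, 200))               # 200 samples used as dictionary atoms
x = rng.normal(size=50)
w = cr_coefficients(D, x)
print(w.shape, float(np.linalg.norm(x - D @ w)))   # coefficients and residual norm
```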


2020
Author(s): Stefano Recanatesi, Serena Bradde, Vijay Balasubramanian, Nicholas A. Steinmetz, Eric Shea-Brown

A fundamental problem in science is uncovering the effective number of dynamical degrees of freedom in a complex system, a quantity that depends on the spatio-temporal scale at which the system is observed. Here, we propose a scale-dependent generalization of a classic enumeration of latent variables, the Participation Ratio. We show how this measure relates to conventional quantities such as the correlation dimension and Principal Component Analysis, and demonstrate its properties in dynamical systems such as the Lorenz attractor. We apply the method to neural population recordings in multiple brain areas and brain states, and demonstrate fundamental differences in the effective dimensionality of neural activity in behaviorally engaged states versus spontaneous activity. Our method applies broadly to multivariate data across fields of science.
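The classic (global) Participation Ratio that the paper generalizes is PR = (sum of lambda)^2 / sum(lambda^2), computed from the covariance eigenvalues. The sketch below evaluates it on toy data with known latent dimensionality; the scale-dependent version introduced in the paper is not reproduced here.

```python
# Classic Participation Ratio of a covariance spectrum, on toy data.
import numpy as np

def participation_ratio(X: np.ndarray) -> float:
    """X: (n_samples, n_variables) data matrix, e.g. time x neurons."""
    lam = np.clip(np.linalg.eigvalsh(np.cov(X, rowvar=False)), 0, None)
    return float(lam.sum() ** 2 / np.sum(lam ** 2))

rng = np.random.default_rng(3)
low_d = rng.normal(size=(2000, 3)) @ rng.normal(size=(3, 100))    # 3 latent dimensions
print(participation_ratio(low_d))                                 # roughly 3
print(participation_ratio(rng.normal(size=(2000, 100))))          # roughly 100
```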


Author(s): Yu Hu, Haim Sompolinsky

A key question in theoretical neuroscience is the relation between the connectivity structure and the collective dynamics of a network of neurons. Here we study the connectivity-dynamics relation as reflected in the distribution of eigenvalues of the covariance matrix of the dynamic fluctuations of the neuronal activities, which is closely related to the network’s Principal Component Analysis (PCA) and the associated effective dimensionality. We consider the spontaneous fluctuations around a steady state in a randomly connected recurrent network of spiking neurons. We derive an exact analytical expression for the covariance eigenvalue distribution in the large-network limit. We show analytically that the distribution has a finitely supported smooth bulk spectrum and exhibits an approximate power law tail for coupling matrices near the critical edge. Effects of adding connectivity motifs and extensions to excitatory-inhibitory (EI) networks are also discussed. Our results suggest that the covariance spectrum is a robust feature of population dynamics in recurrent neural circuits and provide theoretical predictions for this spectrum in simple connectivity models that can be compared with experimental data.
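As a numerical counterpart to such analytical results, the covariance spectrum of a linear rate network with random coupling can be obtained by solving a Lyapunov equation. The sketch below does this for an assumed network size and coupling strength; it is a linear-rate toy, not the spiking-network calculation of the paper.

```python
# Covariance (PCA) spectrum of a linear rate network dx/dt = (-I + g J) x + noise.
# The stationary covariance solves A C + C A^T = -Q; sizes and g are assumptions.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(4)
N, g = 400, 0.8                                  # network size, coupling strength < 1
J = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))
A = -np.eye(N) + g * J

C = solve_continuous_lyapunov(A, -np.eye(N))     # stationary covariance, Q = I
lam = np.sort(np.linalg.eigvalsh(C))[::-1]       # covariance eigenvalue spectrum
pr = lam.sum() ** 2 / np.sum(lam ** 2)           # associated effective dimensionality
print(round(float(pr), 1), round(float(lam[0] / lam.mean()), 1))
```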


2020
Author(s): F. Robin O’Keefe

The study of modularity in geometric morphometric landmark data has focused attention on an underlying question, that of whole-shape modularity, or the pattern and strength of covariation among all landmarks. Measuring whole-shape modularity allows the dimensionality of the shape to be quantified, but current methods for measuring this dimensionality are limited in application. This paper proposes a metric for measuring the “effective dimensionality”, De, of geometric morphometric landmark data based on the Shannon entropy of the eigenvalue vector of the covariance matrix of GPA-aligned landmark data. A permutation test to establish null rank deficiency is developed to allow standardization when comparing dimensionality metrics between data sets, and a bootstrap test is employed for measures of dispersion. These novel methods are applied to a data set of 14 landmarks taken from 119 dire wolf jaws from Rancho La Brea. Comparison with the current test based on eigenvalue dispersion demonstrates that the new metric is more sensitive in detecting population differences in whole-shape modularity. The effective dimensionality metric is extended, in the dense semilandmark case, to a measure of “latent dimensionality”, Dl. Latent dimensionality should be comparable among landmark spaces, whether or not they are homologous.
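One natural entropy-based definition of such a metric is De = exp(-sum_i p_i * ln(p_i)) with p_i = lambda_i / sum(lambda), the exponential of the Shannon entropy of the normalized covariance eigenvalues. The sketch below computes this on simulated landmark-style data; it omits the GPA alignment, permutation, and bootstrap steps, and the paper's exact normalization may differ.

```python
# Entropy-based effective dimensionality of a landmark-style data matrix.
# The simulated "shapes" array is a placeholder, not the dire wolf data.
import numpy as np

def effective_dimensionality(X: np.ndarray) -> float:
    """X: (n_specimens, n_coordinates) matrix of aligned landmark coordinates."""
    lam = np.clip(np.linalg.eigvalsh(np.cov(X, rowvar=False)), 0, None)
    p = lam / lam.sum()
    p = p[p > 0]                                 # drop null directions
    return float(np.exp(-np.sum(p * np.log(p))))

rng = np.random.default_rng(5)
shapes = rng.normal(size=(119, 28))              # e.g. 119 jaws x 14 landmarks in 2D
print(round(effective_dimensionality(shapes), 1))
```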

