Model-based ordination of pin-point cover data: effect of management on dry heathland

Abstract: Recently, there has been increasing interest in model-based approaches to the statistical modelling of the joint distribution of multi-species abundances. The Dirichlet-multinomial distribution has been proposed as a suitable candidate for the joint species distribution of pin-point plant cover data and is here applied in a model-based ordination framework. Unlike most model-based ordination methods, in our proposed model both fixed and random effects are structured as p-dimensional vectors and added to the latent variables before the inner product with the species-specific coefficients is taken. This changes the interpretation of the parameters, so that the fixed and random effects now measure the relative displacement of the vegetation by the fixed and random factors in the p-dimensional latent variable space. This parameterization allows statistical inference on the effect of fixed and random factors in vector space, and makes it easier for practitioners to perform inferences on species composition in a multivariate setting. The method was applied to plant pin-point cover data from dry heathlands that had received different management treatments (burned, grazed, harvested, unmanaged), and treatment was found to have a significant effect on heathland vegetation both when considering plant functional groups and when the taxonomic resolution was at the species level.
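The change in parameterization described in this abstract can be written out as a rough sketch. All notation here is assumed for illustration and is not taken from the paper: \eta_{ij} is the linear predictor for plot i and species j, z_i the p-dimensional latent variable, \lambda_j the species-specific coefficients, x_i a vector of q fixed-effect covariates, B a p-by-q matrix of fixed effects, and u_{g(i)} a p-dimensional random effect for the group containing plot i.

```latex
% Common form in model-based ordination (assumed here for contrast):
% covariates enter with their own species-specific coefficients \beta_j.
\eta_{ij} = \beta_{0j} + x_i^\top \beta_j + z_i^\top \lambda_j

% Sketch of the form described in the abstract: fixed and random effects
% are p-dimensional vectors added to the latent variable *before* the
% inner product with the species-specific coefficients.
\eta_{ij} = \beta_{0j} + \left( z_i + B x_i + u_{g(i)} \right)^\top \lambda_j
```

Under this reading, B x_i and u_{g(i)} are displacements of a plot's position in the ordination space, which is why their effects can be inspected as vectors in that space.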

Fast model-based ordination with copulas

10.1101/2021.03.28.437086 ◽

2021 ◽

Author(s):

Gordana C. Popovic ◽

Francis K.C. Hui ◽

David I. Warton

Keyword(s):

Latent Variables ◽

Latent Variable ◽

Current Model ◽

Sample Sizes ◽

Major Drawback ◽

Large Sample ◽

Model Based ◽

Ordination Methods ◽

Order Of Magnitude ◽

Taxonomic Groups

Visualising data is a vital part of analysis, allowing researchers to find patterns, and to assess and communicate the results of statistical modeling. In ecology, visualisation is often challenging when there are many variables (often for different species or other taxonomic groups) and they are not normally distributed (often counts or presence-absence data). Ordination is a common and powerful way to overcome this hurdle by reducing data from many response variables to just two or three, which can be easily plotted. Ordination is traditionally done using dissimilarity-based methods, most commonly non-metric multidimensional scaling (nMDS). In the last decade, however, model-based methods for unconstrained ordination have gained popularity. These are primarily based on latent variable models, with latent variables estimating the underlying, unobserved ecological gradients. Despite their benefits, a major drawback of model-based ordination methods is their speed, as they typically take much longer to return a result than dissimilarity-based methods, especially for large sample sizes. We introduce copula ordination, a new, scalable model-based approach to unconstrained ordination. This method has all the desirable properties of model-based ordination methods, with the added advantage that it is computationally far more efficient. In particular, simulations show that copula ordination is an order of magnitude faster than current model-based methods, and can even be faster than nMDS for large sample sizes, while producing ordination plots and trends similar to those of these methods.

Model-based ordination with constrained latent variables

10.1101/2021.10.11.463884 ◽

2021 ◽

Author(s):

Bert van der Veen ◽

Francis K.C. Hui ◽

Knut A. Hovstad ◽

Robert B. O’Hara

Keyword(s):

Species Composition ◽

Latent Variables ◽

Latent Variable ◽

List Type ◽

Ecological Gradients ◽

Model Framework ◽

Variable Model ◽

Response Data ◽

Model Based ◽

Constrained Ordination

Summary: In community ecology, unconstrained ordination can be used to predict, from a multivariate dataset, the latent variables that generated the observed species composition. Latent variables can be understood as ecological gradients, which are represented as a function of measured predictors in constrained ordination, so that ecologists can better relate species composition to the environment while reducing the dimensionality of the predictors and the response data. However, existing constrained ordination methods do not explicitly account for information provided by species responses, so they have the potential to misrepresent community structure if not all predictors are measured. We propose a new method for model-based ordination with constrained latent variables in the Generalized Linear Latent Variable Model framework, which incorporates both measured predictors and residual covariation to optimally represent ecological gradients. Simulations of unconstrained and constrained ordination show that the proposed method outperforms CCA and RDA.

Bilinear regression with random effects and reduced rank restrictions

Japanese Journal of Statistics and Data Science ◽

10.1007/s42081-019-00050-2 ◽

2019 ◽

Vol 3 (1) ◽

pp. 63-72

Author(s):

Tatjana von Rosen ◽

Dietrich von Rosen

Keyword(s):

Random Effects ◽

Latent Variables ◽

Latent Variable ◽

Fixed Effects ◽

Likelihood Function ◽

Complete Solution ◽

Reduced Rank ◽

Bilinear Models ◽

Rank Restrictions ◽

Explicit Estimators

Abstract: Bilinear models with three types of effects are considered: fixed effects, random effects and latent variable effects. In the literature, bilinear models with random effects and bilinear models with latent variables have been discussed but there are no results available when combining random effects and latent variables. It is shown, via appropriate vector space decompositions, how to remove the random effects so that a well-known model comprising only fixed effects and latent variables is obtained. The spaces are chosen so that the likelihood function can be factored in a convenient and interpretable way. To obtain explicit estimators, an important standardization constraint on the random effects is assumed to hold. A theorem is presented where a complete solution to the estimation problem is given.

Bayesian multilevel structural equation modeling: An investigation into robust prior distributions.

10.31219/osf.io/fzyst ◽

2020 ◽

Author(s):

Sara van Erp ◽

William Browne

Keyword(s):

Random Effects ◽

Latent Variables ◽

Latent Variable ◽

Structural Equation ◽

Structural Equation Models ◽

Equation Modeling ◽

Prior Distributions ◽

Inverse Gamma ◽

Variance Parameters ◽

Multilevel Structural Equation

Bayesian estimation of multilevel structural equation models (MLSEMs) offers advantages in terms of sample size requirements and computational feasibility, but does require careful specification of the prior distribution, especially for the random effects variance parameters. The traditional “non-informative” conjugate choice of an inverse-Gamma prior with small hyperparameters has been shown time and again to be problematic. In this paper, we investigate alternative, more robust prior distributions. In contrast to multilevel models without latent variables, MLSEMs have multiple random effects variance parameters, both for the multilevel structure and for the latent variable structure. It is therefore even more important to construct reasonable priors for these parameters. We find that, although the robust priors outperform the traditional inverse-Gamma prior, their hyperparameters do require careful consideration.
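To make the role of the "small hyperparameters" concrete, here is a minimal Python sketch that evaluates the inverse-Gamma(ε, ε) density and compares how much prior weight it places on a small variance (0.01) relative to a variance of 1. The function name, the evaluation points and the three ε values are illustrative assumptions and do not come from the paper; the sketch only shows that the shape of this prior shifts noticeably between values of ε that might all be called "small".

```python
import math


def inverse_gamma_logpdf(x, shape, scale):
    """Log density of an inverse-Gamma(shape, scale) distribution at x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    return (
        shape * math.log(scale)
        - math.lgamma(shape)
        - (shape + 1.0) * math.log(x)
        - scale / x
    )


def density_ratio(eps, small=0.01, reference=1.0):
    """Prior density at `small` divided by prior density at `reference`
    under an inverse-Gamma(eps, eps) prior (illustrative points only)."""
    log_ratio = inverse_gamma_logpdf(small, eps, eps) - inverse_gamma_logpdf(
        reference, eps, eps
    )
    return math.exp(log_ratio)


# Hypothetical choices of the "small" hyperparameter.
for eps in (0.001, 0.01, 0.1):
    print(f"eps={eps}: density(0.01) / density(1.0) = {density_ratio(eps):.4g}")
```

The ratio drops from well above 1 to well below 1 across these three settings, so the prior is far from neutral about small variances.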

Multi-Partitions Subspace Clustering

Mathematics ◽

10.3390/math8040597 ◽

2020 ◽

Vol 8 (4) ◽

pp. 597 ◽

Cited By ~ 1

Author(s):

Vincent Vandewalle

Keyword(s):

Latent Variables ◽

Latent Variable ◽

Subspace Clustering ◽

Real Data ◽

Model Choice ◽

Model Based Clustering ◽

Model Based ◽

Choice Strategy ◽

Factorial Discriminant Analysis ◽

Bic Criterion

In model-based clustering, it is often supposed that only one clustering latent variable explains the heterogeneity of the whole dataset. However, in many cases several latent variables could explain the heterogeneity of the data at hand. Finding such class variables could result in a richer interpretation of the data. In the continuous data setting, a multi-partition model-based clustering is proposed. It assumes the existence of several latent clustering variables, each one explaining the heterogeneity of the data with respect to some clustering subspace. It allows the multi-partitions and the related subspaces to be found simultaneously. Parameters of the model are estimated through an EM algorithm relying on a probabilistic reinterpretation of factorial discriminant analysis. A model choice strategy relying on the BIC criterion is proposed to select the number of subspaces and the number of clusters per subspace. The obtained results are thus several projections of the data, each one conveying its own clustering of the data. The model's behavior is illustrated on simulated and real data.
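The BIC-based model choice mentioned in this abstract can be sketched generically in Python. This is not the paper's procedure: the candidate labels, log-likelihoods and parameter counts below are made-up placeholders, and the sketch uses the convention BIC = -2 log L + k log n, where lower is better (some clustering software uses the opposite sign).

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion, lower-is-better convention."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_by_bic(candidates, n_obs):
    """Return the label of the candidate with the lowest BIC.

    `candidates` maps a label to (maximised log-likelihood, number of
    free parameters), as would be obtained from separate EM fits.
    """
    return min(candidates, key=lambda label: bic(*candidates[label], n_obs))


# Hypothetical fitted models; all values are invented for illustration.
candidates = {
    "1 subspace, 2 clusters": (-1250.0, 9),
    "2 subspaces, 2 clusters each": (-1180.0, 16),
    "2 subspaces, 3 clusters each": (-1172.0, 24),
}
best = select_by_bic(candidates, n_obs=200)
print(best)
```

The third candidate has the highest likelihood but is not selected, because its extra parameters cost more in the penalty term than they gain in fit.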

Probability-Based and Measurement-Related Hypotheses With Full Restriction for Investigations by Means of Confirmatory Factor Analysis

Methodology ◽

10.1027/1614-2241/a000033 ◽

2011 ◽

Vol 7 (4) ◽

pp. 157-164

Author(s):

Karl Schweizer

Keyword(s):

Factor Analysis ◽

Confirmatory Factor Analysis ◽

Cognitive Processing ◽

Latent Variables ◽

Repeated Measures ◽

Latent Variable ◽

Model Fit ◽

Repeated Measures Data ◽

Confirmatory Factor ◽

And Performance

Probability-based and measurement-related hypotheses for confirmatory factor analysis of repeated-measures data are investigated. Such hypotheses comprise precise assumptions concerning the relationships among the true components associated with the levels of the design or the items of the measure. Measurement-related hypotheses concentrate on the assumed processes, as, for example, transformation and memory processes, and represent treatment-dependent differences in processing. In contrast, probability-based hypotheses provide the opportunity to consider probabilities as outcome predictions that summarize the effects of various influences. The prediction of performance guided by inexact cues serves as an example. In the empirical part of this paper probability-based and measurement-related hypotheses are applied to working-memory data. Latent variables according to both hypotheses contribute to a good model fit. The best model fit is achieved for the model including latent variables that represented serial cognitive processing and performance according to inexact cues in combination with a latent variable for subsidiary processes.

Conceptualizing Protective Family Context and Its Effect on Substance Use: Comparisons Across Diverse Ethnic-Racial Youth

10.31234/osf.io/abfs3 ◽

2019 ◽

Author(s):

Kevin Constante ◽

Edward Huntley ◽

Emma Schillinger ◽

Christine Wagner ◽

Daniel Keating

Keyword(s):

Substance Use ◽

Measurement Invariance ◽

Latent Variables ◽

Latent Variable ◽

Path Model ◽

Family Context ◽

Partial Metric ◽

Racial Groups ◽

Protective Methods ◽

Family Variables

Background: Although family behaviors are known to be important for buffering youth against substance use, research in this area often evaluates a particular type of family interaction and how it shapes adolescents’ behaviors, when it is likely that youth experience the co-occurrence of multiple types of family behaviors that may be protective. Methods: The current study (N = 1716, 10th and 12th graders, 55% female) examined associations between protective family context, a latent variable composed of five different measures of family behaviors, and past-12-month substance use: alcohol, cigarettes, marijuana, and e-cigarettes. Results: A multi-group measurement invariance assessment supported protective family context as a coherent latent construct with partial (metric) measurement invariance among Black, Latinx, and White youth. A multi-group path model indicated that protective family context was significantly associated with less substance use for all youth, but with varying magnitudes across ethnic-racial groups. Conclusion: These results emphasize the importance of evaluating psychometric properties of family-relevant latent variables on the basis of group membership in order to draw appropriate inferences on how such family variables relate to substance use among diverse samples.

Inference about the fixed and random effects in a mixed-effects linear model: an approximate Bayesian approach

10.31274/rtd-180813-11736 ◽

1993 ◽

Author(s):

Alan George Zimmermann

Keyword(s):

Linear Model ◽

Random Effects ◽

Bayesian Approach ◽

Mixed Effects ◽

Fixed And Random Effects ◽

Approximate Bayesian

Interpretable Variational Graph Autoencoder with Noninformative Prior

Future Internet ◽

10.3390/fi13020051 ◽

2021 ◽

Vol 13 (2) ◽

pp. 51

Author(s):

Lili Sun ◽

Xueyan Liu ◽

Min Zhao ◽

Bo Yang

Keyword(s):

Latent Variables ◽

Latent Variable ◽

Expert Knowledge ◽

Structural Information ◽

Standard Normal Distribution ◽

Noninformative Prior ◽

Latent Space ◽

Distribution Parameters ◽

Standard Normal ◽

Low Dimensional

The variational graph autoencoder, which can encode structural information and attribute information in a graph into low-dimensional representations, has become a powerful method for studying graph-structured data. However, most existing methods based on the variational (graph) autoencoder assume that the prior of the latent variables is the standard normal distribution, which encourages all nodes to gather around 0. This prevents the latent space from being fully utilized. Choosing a suitable prior without incorporating additional expert knowledge therefore becomes a challenge. Given this, we propose a novel noninformative prior-based interpretable variational graph autoencoder (NPIVGAE). Specifically, we use a noninformative prior as the prior distribution of the latent variables. This prior enables the posterior distribution parameters to be learned almost entirely from the sample data. Furthermore, we regard each dimension of a latent variable as the probability that the node belongs to each block, thereby improving the interpretability of the model. The correlation within and between blocks is described by a block–block correlation matrix. We compare our model with state-of-the-art methods on three real datasets, verifying its effectiveness and superiority.

Assessing the impacts of Airbnb listings on London house prices

Environment and Planning B Urban Analytics and City Science ◽

10.1177/23998083211001836 ◽

2021 ◽

pp. 239980832110018

Author(s):

James Todd ◽

Anwar Musah ◽

James Cheshire

Keyword(s):

Random Effects ◽

Rapid Growth ◽

House Prices ◽

Housing Prices ◽

House Price ◽

Positive Association ◽

Sharing Economy ◽

Local Context ◽

Fixed And Random Effects ◽

The World

Over the course of the last decade, sharing economy platforms have experienced significant growth within cities around the world. Airbnb, which is one of the largest and best-known platforms, provides the focus for this paper and offers a service that allows users to rent properties or spare rooms to guests. Its rapid growth has led to a growing discourse around the consequences of Airbnb rentals within the local context. The research in this paper focuses on determining the impact on local housing prices within the inner London boroughs by constructing a longitudinal panel dataset, on which a fixed and random effects regression was conducted. The results indicate that there is a significant and modest positive association between the frequency of Airbnb listings and the house price per square metre in these boroughs.