The application of a graph-partitioning algorithm to the scheduling of curriculum requirements

Abstract:

The problem of pattern and scale is a central challenge in ecology. The problem of scale is central to community ecology, where functional ecological groups are aggregated and treated as a unit underlying an ecological pattern, such as aggregation of “nitrogen fixing trees” into a total abundance of a trait underlying ecosystem physiology. With the emergence of massive community ecological datasets, from microbiomes to breeding bird surveys, there is a need to objectively identify the scales of organization pertaining to well-defined patterns in community ecological data.

The phylogeny is a scaffold for identifying key phylogenetic scales associated with macroscopic patterns. Phylofactorization was developed to objectively identify phylogenetic scales underlying patterns in relative abundance data. However, many ecological data, such as presence-absences and counts, are not relative abundances, yet it is still desirable and informative to identify phylogenetic scales underlying a pattern of interest. Here, we generalize phylofactorization beyond relative abundances to a graph-partitioning algorithm for any community ecological data.

Generalizing phylofactorization connects many tools from data analysis to phylogenetically-informed analysis of community ecological data. Two-sample tests identify three phylogenetic factors of mammalian body mass which arose during the K-Pg extinction event, consistent with other analyses of mammalian body mass evolution. Projection of data onto coordinates defined by the phylogeny yields a phylogenetic principal components analysis which refines our understanding of the major sources of variation in the human gut microbiome. These same coordinates allow generalized additive modeling of microbes in Central Park soils and confirm that a large clade of Acidobacteria thrives in neutral soils. Generalized linear and additive modeling of exponential family random variables can be performed by phylogenetically-constrained reduced-rank regression or stepwise factor contrasts. We finish with a discussion of how phylofactorization produces an ecological species concept with a phylogenetic constraint. All of these tools can be implemented with a new R package available online.
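
The greedy graph-partitioning idea described in this abstract can be sketched in a few lines of Python. This is a minimal illustration and not the R package's interface: the function names, the toy tree, the trait values, and the use of a between-group sum of squares as a stand-in for a two-sample test statistic are all assumptions made for the example. Each edge of the tree is represented by the set of tips below it; cutting an edge splits one bin of tips into two, and later cuts are restricted to the resulting sub-trees.

```python
from statistics import mean


def between_group_ss(a, b):
    """Between-group sum of squares for two samples (stand-in objective)."""
    n1, n2 = len(a), len(b)
    return n1 * n2 / (n1 + n2) * (mean(a) - mean(b)) ** 2


def phylofactor_traits(clades, trait, n_factors):
    """Greedily cut the edges that best separate trait values.

    clades: dict mapping a (hypothetical) edge name to the frozenset of tips below it
    trait:  dict mapping each tip to a trait value
    Returns the chosen (edge, score) pairs and the final bins of tips.
    """
    bins = [frozenset(trait)]          # start with one bin holding every tip
    factors = []
    for _ in range(n_factors):
        best = None
        for i, current in enumerate(bins):
            for edge, clade in clades.items():
                inside, outside = current & clade, current - clade
                if not inside or not outside:
                    continue           # this edge does not split this bin
                score = between_group_ss(
                    [trait[t] for t in inside],
                    [trait[t] for t in outside],
                )
                if best is None or score > best[0]:
                    best = (score, edge, i, inside, outside)
        if best is None:
            break                      # no remaining edge splits any bin
        score, edge, i, inside, outside = best
        bins[i:i + 1] = [inside, outside]   # replace the bin by its two halves
        factors.append((edge, score))
    return factors, bins


# Toy rooted tree ((a,b),(c,(d,e))) with made-up trait values.
clades = {
    "ab": frozenset("ab"), "cde": frozenset("cde"), "de": frozenset("de"),
    "a": frozenset("a"), "b": frozenset("b"), "c": frozenset("c"),
    "d": frozenset("d"), "e": frozenset("e"),
}
trait = {"a": 10, "b": 11, "c": 1, "d": 5, "e": 5.5}
factors, bins = phylofactor_traits(clades, trait, n_factors=2)
print(factors)
print(bins)
```

On this toy input the first cut separates {a, b} from the rest and the second separates c from {d, e}. A real analysis would replace the stand-in objective with a proper test statistic and handle multiple comparisons across edges.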

Phylofactorization - theory and challenges

10.1101/196378 ◽ 2017 ◽ Cited By ~ 2

Author(s): Alex D. Washburne

Keyword(s): Graph Partitioning ◽ Latent Variable ◽ Hierarchical Regression ◽ Test Statistics ◽ Objective Functions ◽ Biological Communities ◽ Null Distributions ◽ Number Of Factors ◽ Partitioning Algorithm ◽ Log Ratio

Abstract:

Data from biological communities are composed of species connected by the phylogeny. A greedy algorithm, ‘phylofactorization’, was developed to construct an isometric log-ratio transform whose balances correspond to edges along which traits arose, controlling for previously made inferences.

In this paper, the general theory of phylofactorization is presented as a graph-partitioning algorithm. A special case, regression phylofactorization, chooses coordinates based on sequential maximization of objective functions from regression on “contrast” variables such as an isometric log-ratio transform. The connections between regression phylofactorization and other methods are discussed, including matrix factorization, hierarchical regression, factor analysis and latent variable models. Open challenges in the statistical analysis of phylofactorization are presented, including criteria for choosing the number of factors and approximating null distributions of commonly used test statistics and objective functions. As a graph-partitioning algorithm, cross-validation of phylofactorization across datasets requires graph-topological considerations, such as how to deal with novel nodes and edges and whether or not to control for partition order. Overcoming these challenges can accelerate our analysis of phylogenetically-structured data and allow annotations of edges in an online tree of life.
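
As a rough illustration of one step of regression phylofactorization, the sketch below computes an isometric log-ratio balance for each candidate edge and picks the edge whose balance is best explained by a covariate. It assumes the standard balance formula, sqrt(rs/(r+s)) times the log of the ratio of geometric means of the two groups. The function names, the toy abundances, the pH-like covariate, and the choice of explained sum of squares as the objective are hypothetical and are not taken from the paper or its R package. Only the first iteration is shown; later iterations would repeat the search within the sub-trees produced by earlier cuts.

```python
from math import log, sqrt
from statistics import mean


def ilr_balance(sample, group_r, group_s):
    """ILR balance contrasting two groups of tips within one sample."""
    r, s = len(group_r), len(group_s)
    mean_log_r = mean([log(sample[t]) for t in group_r])
    mean_log_s = mean([log(sample[t]) for t in group_s])
    # sqrt(rs/(r+s)) * log(geometric mean of R / geometric mean of S)
    return sqrt(r * s / (r + s)) * (mean_log_r - mean_log_s)


def explained_ss(x, y):
    """Sum of squares of y explained by simple least-squares regression on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy ** 2 / sxx


def best_edge(samples, covariate, clades):
    """Score every edge by regressing its balance on the covariate."""
    tips = frozenset(samples[0])
    scores = {}
    for edge, clade in clades.items():
        rest = tips - clade
        if not clade or not rest:
            continue                   # edge does not split the tips
        balances = [ilr_balance(smp, clade, rest) for smp in samples]
        scores[edge] = explained_ss(covariate, balances)
    return max(scores, key=scores.get), scores


# Toy data: the clade {a, b} doubles in abundance with each unit of the covariate.
# Balances are scale-invariant, so counts need not be normalized first.
samples = [{"a": 2 ** i, "b": 2 ** i, "c": 1, "d": 1} for i in range(4)]
covariate = [4.0, 5.0, 6.0, 7.0]
clades = {
    "a": frozenset("a"), "b": frozenset("b"),
    "c": frozenset("c"), "d": frozenset("d"),
    "ab": frozenset("ab"),
}
best, scores = best_edge(samples, covariate, clades)
print(best, scores)
```

Here the internal edge separating {a, b} from {c, d} scores highest, because its balance changes linearly with the covariate while single-tip balances capture only part of that signal.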