Analyzing matched sets of microbiome data using the LDM and PERMANOVA

2020 ◽  
Author(s):  
Zhengyi Zhu ◽  
Glen A. Satten ◽  
Caroline Mitchell ◽  
Yi-Juan Hu

Abstract Background: Matched-set data arise frequently in microbiome studies. For example, we may collect pre- and post-treatment samples from a set of individuals, or use important confounding variables to match data from case participants to one or more control participants. Thus, there is a need for statistical methods for data comprised of matched sets, to test hypotheses against traits of interest (e.g., clinical outcomes or environmental factors) at the community level and/or the OTU (operational taxonomic unit) level. Optimally, these methods should accommodate complex data such as those with unequal sample sizes across sets, confounders varying within sets, and continuous traits of interest. Methods: PERMANOVA is a commonly used distance-based method for testing hypotheses at the community level. We have also developed the linear decomposition model (LDM) that unifies the community-level and OTU-level tests into one framework. Here we present a strategy that can be used with both PERMANOVA and the LDM for analyzing matched-set data. We propose to include an indicator variable for each set as covariates, so as to constrain comparisons between samples within a set, and also permute traits within each set, which can account for exchangeable sample correlations. The flexible nature of PERMANOVA and the LDM allows discrete or continuous traits or interactions to be tested, within-set confounders to be adjusted, and unbalanced data to be fully exploited. Results: Our simulations indicate that our proposed strategy outperformed alternative strategies in a wide range of scenarios. Using simulation, we also explored optimal designs for matched-set studies. The flexibility of PERMANOVA and the LDM for a variety of matched-set microbiome data is illustrated by the analysis of data from two real studies. Conclusions: Including set indicator variables and permuting within sets when analyzing matched-set data with PERMANOVA or the LDM is a strategy that performs well and is capable of handling the complex data structures that frequently occur in microbiome studies.
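
The strategy described in the Methods (set indicators as covariates, plus permutation of the trait within each set) can be illustrated with a small sketch. This is not the authors' software: it is a simplified, standard-library-only PERMANOVA-style test written for illustration, and every function name in it is hypothetical. It assumes a symmetric distance matrix, a single numeric trait, and plain within-set shuffling of the trait rather than any more refined residual-permutation scheme.

```python
import random
from collections import defaultdict


def gower_center(dist):
    """Gower-centered matrix G = -1/2 * J D^2 J for a symmetric distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def group_by_set(set_ids):
    """List of sample indices for each matched set."""
    members = defaultdict(list)
    for i, s in enumerate(set_ids):
        members[s].append(i)
    return list(members.values())


def pseudo_f(G, trait, sets):
    """Pseudo-F for the trait after adjusting for one indicator per set."""
    n = len(trait)
    # Regressing the trait on the set indicators leaves within-set deviations.
    resid = [0.0] * n
    for idx in sets:
        mean = sum(trait[i] for i in idx) / len(idx)
        for i in idx:
            resid[i] = trait[i] - mean
    ss = sum(r * r for r in resid)
    if ss < 1e-12:  # trait is constant within every set: nothing to test
        return 0.0
    explained = sum(resid[i] * G[i][j] * resid[j]
                    for i in range(n) for j in range(n)) / ss
    total = sum(G[i][i] for i in range(n))
    # Variation absorbed by the set indicators: trace of (set projection) * G.
    between_sets = sum(sum(G[i][j] for i in idx for j in idx) / len(idx)
                       for idx in sets)
    residual = total - between_sets - explained
    df_resid = n - len(sets) - 1
    return explained / (residual / df_resid)


def permute_within_sets(trait, sets, rng):
    """Shuffle trait values only among samples that share a set."""
    permuted = list(trait)
    for idx in sets:
        shuffled = idx[:]
        rng.shuffle(shuffled)
        for target, source in zip(idx, shuffled):
            permuted[target] = trait[source]
    return permuted


def matched_set_permanova(dist, trait, set_ids, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    G = gower_center(dist)
    sets = group_by_set(set_ids)
    observed = pseudo_f(G, trait, sets)
    hits = sum(
        pseudo_f(G, permute_within_sets(trait, sets, rng), sets) >= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)
```

Because the set indicators absorb all between-set variation, only within-set contrasts contribute to the statistic, and shuffling within sets keeps each permuted data set consistent with the matched design. Sets in which the trait does not vary contribute nothing to the numerator.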



Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Zhengyi Zhu ◽  
Glen A. Satten ◽  
Caroline Mitchell ◽  
Yi-Juan Hu

Abstract Background Matched-set data arise frequently in microbiome studies. For example, we may collect pre- and post-treatment samples from a set of individuals, or use important confounding variables to match data from case participants to one or more control participants. Thus, there is a need for statistical methods for data comprised of matched sets, to test hypotheses against traits of interest (e.g., clinical outcomes or environmental factors) at the community level and/or the operational taxonomic unit (OTU) level. Optimally, these methods should accommodate complex data such as those with unequal sample sizes across sets, confounders varying within sets, and continuous traits of interest. Methods PERMANOVA is a commonly used distance-based method for testing hypotheses at the community level. We have also developed the linear decomposition model (LDM) that unifies the community-level and OTU-level tests into one framework. Here we present a new strategy that can be used with both PERMANOVA and the LDM for analyzing matched-set data. We propose to include an indicator variable for each set as covariates, so as to constrain comparisons between samples within a set, and also permute traits within each set, which can account for exchangeable sample correlations. The flexible nature of PERMANOVA and the LDM allows discrete or continuous traits or interactions to be tested, within-set confounders to be adjusted, and unbalanced data to be fully exploited. Results Our simulations indicate that our proposed strategy outperformed alternative strategies, including the commonly used one that utilizes restricted permutation only, in a wide range of scenarios. Using simulation, we also explored optimal designs for matched-set studies. The flexibility of PERMANOVA and the LDM for a variety of matched-set microbiome data is illustrated by the analysis of data from two real studies. 
Conclusions Including set indicator variables and permuting within sets when analyzing matched-set data with PERMANOVA or the LDM is a strategy that performs well and is capable of handling the complex data structures that frequently occur in microbiome studies.


2020 ◽  
Author(s):  
Zhasmina Tacheva ◽  
Anton Ivanov

BACKGROUND Opioid-related deaths constitute a problem of pandemic proportions in the United States, with no clear solution in sight. Although addressing addiction, the heart of this problem, ought to remain a priority for health practitioners, examining the community-level psychological factors with a known impact on health behaviors may provide valuable insights for attenuating this health crisis by curbing risky behaviors before they evolve into addiction. OBJECTIVE The goal of this study is twofold: to demonstrate the relationship between community-level psychological traits and fatal opioid overdose both theoretically and empirically, and to provide a blueprint for using social media data to glean these psychological factors in a real-time, reliable, and scalable manner. METHODS We collected annual panel data from Twitter for 2891 counties in the United States from 2014 to 2016 and used a novel data mining technique to obtain average county-level “Big Five” psychological trait scores. We then performed interval regression, using a control function to alleviate omitted variable bias, to empirically test the relationship between county-level psychological traits and the prevalence of fatal opioid overdoses in each county. RESULTS After controlling for a wide range of community-level biopsychosocial factors related to health outcomes, we found that three of the operationalizations of the five psychological traits examined at the community level in the study were significantly associated with fatal opioid overdoses: extraversion (β=.308, P<.001), neuroticism (β=.248, P<.001), and conscientiousness (β=.229, P<.001). CONCLUSIONS Analyzing the psychological characteristics of a community can be a valuable tool in the local, state, and national fight against the opioid pandemic. Health providers and community health organizations can benefit from this research by evaluating the psychological profile of the communities they serve and assessing the projected risk of fatal opioid overdose based on the relationships our study predicts when making decisions about the allocation of overdose-reversal medication and other vital resources.


2020 ◽  
Vol 3 (1) ◽  
pp. 29-37
Author(s):  
Tryas Wardani Nurwan ◽  
Helmi Hasan

The purpose of the study was to determine the effect of individual characteristics on benefit recipients’ participation in Program Keluarga Harapan (PKH) in Nagari Pematang Panjang, Sijunjung District, West Sumatera. The study used a quantitative method, with a questionnaire and data analysis in SPSS 21. Based on Slovin’s formula, 131 respondents were sampled from the 194 benefit recipients. The dependent variable, participation, was measured by two indicators: participation in the implementation of P2K2 and participation in taking PKH fund benefits. The independent variables, individual characteristics, were measured by level of education (X1), age (X2), and number of family dependents (X3). The results showed that all three individual characteristic variables influence recipients’ participation.
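
The sample size quoted above can be checked with a short calculation. The sketch below applies Slovin's formula, n = N / (1 + N e²), to the 194 recipients. The 5% margin of error is an assumption (the abstract does not state it), chosen because it is a common default; the function name is hypothetical.

```python
import math


def slovin_sample_size(population, margin_of_error):
    """Slovin's formula n = N / (1 + N * e^2), rounded up to a whole respondent."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))


# Assumed margin of error of 0.05; the abstract does not report this value.
print(slovin_sample_size(194, 0.05))  # 194 / 1.485 = 130.6..., rounded up to 131
```

Under that assumed margin of error, the formula gives the 131 respondents reported in the abstract.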


2019 ◽  
Vol 29 (Supplement_4) ◽  
Author(s):  
T Makovski ◽  
G Le Coroller ◽  
P Putrik ◽  
S Stranges ◽  
L Huiart ◽  
...  

Abstract Multimorbidity defined most commonly as co-existence of 2+ diseases is one of the major challenges of an ageing society. It is often accompanied with declining quality of life (QoL). The study aims to 1) assess the relationship between increasing number of diseases and QoL over time, 2) explore the differences between several European countries. Longitudinal data analysis performed on the relevant waves (2004 to 2017) of the Survey of Health, Ageing and Retirement in Europe (SHARE). Data were collected every two years among participants aged 50+. Health conditions were identified through an open-end questionnaire containing 17 prelisted conditions. QoL was evaluated by Control, Autonomy, Self-Realization and Pleasure questionnaire (CASP-12v). Maximum QoL score, describing the best state was 48; minimum, 12 points. Association between increasing number of diseases and QoL is being assessed with multilevel analysis accounting for time and clustering within household and country. Minimum follow-up is 2 time points. Confounding variables include age, sex, socio-economic status, social support and health care parameters. Preliminary findings show that 20 countries and 87,087 individuals participated in at least 2 waves; 80,041 answered CASP at least twice. Number of diseases when first reported was on average 1.65 (IQR=0,2) and increased to 1.88 (IQR=1,3) when last reported. Similarly, between first and last reported point QoL decreased on average by -0.32 (SD: ± 5.9); estimated by non-rescaled CASP scale. Greece showed the strongest decrease of -1.73 (SD: ± 6.36), while QoL increased in some countries, the most in Portugal for 0.76 (SD: ± 5.62). Our preliminary findings suggest high geographic variations in QoL, possibly driven by differential clustering of multimorbidity across Europe, design issues and other factors. 
This may underline the need for country-specific analysis and initiatives to address the growing burden of multimorbidity in our ageing populations. Key messages First longitudinal study to address this research question across a wide range of European countries using SHARE. Study accounts for a large number of confounding factors owing to the abundance of collected information.


mSystems ◽  
2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Zhichao Zhou ◽  
Yang Liu ◽  
Wei Xu ◽  
Jie Pan ◽  
Zhu-Hua Luo ◽  
...  

ABSTRACT Hydrothermal vents release reduced compounds and small organic carbon compounds into the surrounding seawater, providing essential substrates for microbial growth and bioenergy transformations. Despite the wide distribution of the marine benthic group E archaea (referred to as Hydrothermarchaeota) in the hydrothermal environment, little is known about their genomic repertoires and biogeochemical significance. Here, we studied four highly complete (>80%) metagenome-assembled genomes (MAGs) from a black smoker chimney and the surrounding sulfur-rich sediments on the South Atlantic Mid-Ocean Ridge and publicly available data sets (the Integrated Microbial Genomes system of the U.S. Department of Energy-Joint Genome Institute and NCBI SRA data sets). Genomic analysis suggested a wide carbon metabolic diversity of Hydrothermarchaeota members, including the utilization of proteins, lactate, and acetate; the anaerobic degradation of aromatics; the oxidation of C1 compounds (CO, formate, and formaldehyde); the utilization of methyl compounds; CO2 incorporation by the tetrahydromethanopterin-based Wood-Ljungdahl pathway; and participation in the type III ribulose-1,5-bisphosphate carboxylase/oxygenase-based Calvin-Benson-Bassham cycle. These microbes also potentially oxidize sulfur, arsenic, and hydrogen and engage in anaerobic respiration based on sulfate reduction and denitrification. Among the 140 MAGs reconstructed from the black smoker chimney microbial community (including Hydrothermarchaeota MAGs), community-level metabolic predictions suggested a redundancy of carbon utilization and element cycling functions and interactive syntrophic and sequential utilization of substrates. These processes might make various carbon and energy sources widely accessible to the microorganisms. 
Further, the analysis suggested that Hydrothermarchaeota members contained important functional components obtained from the community via lateral gene transfer, becoming a distinctive clade. This might serve as a niche-adaptive strategy for metabolizing heavy metals, C1 compounds, and reduced sulfur compounds. Collectively, the analysis provides comprehensive metabolic insights into the Hydrothermarchaeota. IMPORTANCE This study provides comprehensive metabolic insights into the Hydrothermarchaeota from comparative genomics, evolution, and community-level perspectives. Members of the Hydrothermarchaeota synergistically participate in a wide range of carbon-utilizing and element cycling processes with other microorganisms in the community. We expand the current understanding of community interactions within the hydrothermal sediment and chimney, suggesting that microbial interactions based on sequential substrate metabolism are essential to nutrient and element cycling.


2017 ◽  
Vol 42 (6) ◽  
pp. 563-570 ◽  
Author(s):  
Martin J. MacInnis ◽  
Chris McGlory ◽  
Martin J. Gibala ◽  
Stuart M. Phillips

Direct sampling of human skeletal muscle using the needle biopsy technique can facilitate insight into the biochemical and histological responses resulting from changes in exercise or feeding. However, the muscle biopsy procedure is invasive, and analyses are often expensive, which places pragmatic restraints on sample sizes. The unilateral exercise model can serve to increase statistical power and reduce the time and cost of a study. With this approach, 2 limbs of a participant are randomized to 1 of 2 treatments that can be applied almost concurrently or sequentially depending on the nature of the intervention. Similar to a typical repeated measures design, comparisons are made within participants, which increases statistical power by reducing the amount of between-person variability. A washout period is often unnecessary, reducing the time needed to complete the experiment and the influence of potential confounding variables such as habitual diet, activity, and sleep. Variations of the unilateral exercise model have been employed to investigate the influence of exercise, diet, and the interaction between the 2, on a wide range of variables including mitochondrial content, capillary density, and skeletal muscle hypertrophy. Like any model, unilateral exercise has some limitations: it cannot be used to study variables that potentially transfer across limbs, and it is generally limited to exercises that can be performed in pairs of treatments. Where appropriate, however, the unilateral exercise model can yield robust, well-controlled investigations of skeletal muscle responses to a wide range of interventions and conditions including exercise, dietary manipulation, and disuse or immobilization.
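
The claim that within-participant comparisons increase statistical power can be illustrated with a toy simulation. The sketch below is an assumption-laden illustration, not part of the reviewed work: the effect size, the between-person and within-person standard deviations, and all function names are invented for the example. It compares the standard error of the treatment effect when the two limbs are analyzed as pairs with the standard error obtained if the same measurements were treated as two independent groups.

```python
import random
import statistics


def simulate_unilateral_study(n_participants=10, effect=1.0,
                              between_sd=3.0, within_sd=1.0, seed=0):
    """One control and one treated limb per participant (all values hypothetical)."""
    rng = random.Random(seed)
    control, treated = [], []
    for _ in range(n_participants):
        baseline = rng.gauss(0, between_sd)  # person-specific level shared by both limbs
        control.append(baseline + rng.gauss(0, within_sd))
        treated.append(baseline + effect + rng.gauss(0, within_sd))
    return control, treated


def paired_standard_error(control, treated):
    """SE of the mean within-participant difference (between-person level cancels)."""
    diffs = [t - c for c, t in zip(control, treated)]
    return statistics.stdev(diffs) / len(diffs) ** 0.5


def unpaired_standard_error(control, treated):
    """SE of the difference in means if the limbs were treated as independent groups."""
    n = len(control)
    return ((statistics.variance(control) + statistics.variance(treated)) / n) ** 0.5


control, treated = simulate_unilateral_study(n_participants=200, seed=1)
print(paired_standard_error(control, treated))
print(unpaired_standard_error(control, treated))
```

When between-person variability is large relative to within-person variability, as assumed here, the paired standard error is the smaller of the two, which is the source of the power gain described above.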


2006 ◽  
Vol 14 (02) ◽  
pp. 275-293 ◽  
Author(s):  
CHRISTOPHER S. OEHMEN ◽  
TJERK P. STRAATSMA ◽  
GORDON A. ANDERSON ◽  
GALYA ORR ◽  
BOBBIE-JO M. WEBB-ROBERTSON ◽  
...  

The future of biology will be increasingly driven by the fundamental paradigm shift from hypothesis-driven research to data-driven discovery research employing the growing volume of biological data coupled to experimental testing of new discoveries. But hardware and software limitations in the current workflow infrastructure make it impossible or intractable to use real data from disparate sources for large-scale biological research. We identify key technological developments needed to enable this paradigm shift involving (1) the ability to store and manage extremely large datasets which are dispersed over a wide geographical area, (2) development of novel analysis and visualization tools which are capable of operating on enormous data resources without overwhelming researchers with unusable information, and (3) formalisms for integrating mathematical models of biosystems from the molecular level to the organism population level. This will require the development of algorithms and tools which efficiently utilize high-performance compute power and large storage infrastructures. The end result will be the ability of a researcher to integrate complex data from many different sources with simulations to analyze a given system at a wide range of temporal and spatial scales in a single conceptual model.


2020 ◽  
Author(s):  
Alexandra Seleznyova ◽  
Alexey Yaroslavtcev ◽  
Olga Gavrichkova ◽  
Alexey Ryazanov ◽  
Julia Kovaleva ◽  
...  

Urban trees and soil microbial communities are the key ecosystem components that provide the supporting, provisioning and regulating services that define citizens’ well-being. Understanding the relationships between the physiological state, age and species of trees and microbial functional properties is needed for the management of urban areas and landscape engineering. The research focuses on finding linkages between a wide range of tree properties monitored by smart TreeTalker technology and soil functional microbial indexes in the Moscow megapolis.

The study was carried out on the RUDN University campus (Moscow, Russia), where six tree species were selected (Pinus sylvestris, Populus tremula, Acer platanoides, Tilia cordata, Picea abies, Betula pendula). A TreeTalker device was installed on five preselected trees of each species to monitor sap flux, vertical stability (according to a digital accelerometer), canopy reflectance spectra, and trunk and canopy air temperature and humidity. Monitoring started in May 2019. Composite soil samples (0-10) were taken under each tree at a distance of 0.5 m from its stand by augering in October 2019. In the samples, microbial biomass carbon (MBC, SIR method), basal respiration (BR), community-level physiological profile (CLPP, MicroResp) and the Shannon microbial diversity index (H’) based on CLPP were determined.

Soil MBC content depended significantly on tree species, increasing from A. platanoides to T. cordata (from 538 to 1445 µg C g⁻¹). The microbial diversity index was lowest in soil under A. platanoides (H’=2.1) and highest under B. pendula (H’=2.4). The soil CLPP for A. platanoides was shifted mainly toward microbial response to carboxylic acids, with a low response to amino and phenolic acids compared with other tree species (e.g. B. pendula). Soil qCO₂ (BR/MBC ratio) was positively related to tree age (r=0.8). Response to carboxylic acids (especially oxalic) had the highest correlation with physiological properties of the trees: trunk moisture, photochemical reflectance index and vertical stability (r > -0.5).

Current research was financially supported by Russian Science Foundation [No 19-77-30012].
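
The two derived indexes mentioned in this abstract, the Shannon diversity index H’ computed from a CLPP and the metabolic quotient qCO₂ (BR/MBC), can be sketched as follows. The abstract does not say which logarithm base or how many substrates were used, so the natural logarithm and the example inputs are assumptions, and the function names are hypothetical.

```python
import math


def shannon_index(substrate_responses):
    """Shannon index H' = -sum(p_i * ln p_i) over relative substrate responses."""
    total = sum(substrate_responses)
    proportions = [r / total for r in substrate_responses if r > 0]
    return -sum(p * math.log(p) for p in proportions)


def metabolic_quotient(basal_respiration, microbial_biomass_carbon):
    """qCO2 as the ratio of basal respiration to microbial biomass carbon."""
    return basal_respiration / microbial_biomass_carbon


# Hypothetical respiration responses to five substrates (arbitrary units).
example_profile = [4.0, 2.5, 1.0, 0.5, 2.0]
print(shannon_index(example_profile))
```

For k substrates the index is bounded above by ln k, which is reached when the community responds equally to every substrate; a profile dominated by one substrate group gives a lower value.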

