Compositional analysis of dietary patterns

2018 ◽  
Vol 28 (9) ◽  
pp. 2834-2847 ◽  
Author(s):  
M Solans ◽  
G Coenders ◽  
R Marcos-Gragera ◽  
A Castelló ◽  
E Gràcia-Lavedan ◽  
...  

Instead of looking at individual nutrients or foods, dietary pattern analysis has emerged as a promising approach to examine the relationship between diet and health outcomes. Despite dietary patterns being compositional (i.e. usually a higher intake of some foods implies that less of other foods are being consumed), compositional data analysis has not yet been applied in this setting. We describe three compositional data analysis approaches (compositional principal component analysis, balances and principal balances) that enable the extraction of dietary patterns by using control subjects from the Spanish multicase-control (MCC-Spain) study. In particular, principal balances overcome the limitations of purely data-driven or investigator-driven methods and present dietary patterns as trade-offs between eating more of some foods and less of others.
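
As a rough illustration of the "trade-off" reading of a balance described above, the sketch below computes a single balance between two groups of foods using the standard isometric log-ratio form: the log-ratio of the groups' geometric means, scaled by sqrt(rs/(r+s)). The function name `balance`, the food groups and the intake values are hypothetical and are not taken from the MCC-Spain study.

```python
import numpy as np

def balance(x, num_idx, den_idx):
    """Balance (ilr coordinate) between two groups of parts.

    x        : array of strictly positive parts (last axis = parts)
    num_idx  : indices of the parts in the numerator group
    den_idx  : indices of the parts in the denominator group
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("all parts must be strictly positive")
    r, s = len(num_idx), len(den_idx)
    # Log of the geometric mean of each group.
    log_gm_num = np.log(x[..., num_idx]).mean(axis=-1)
    log_gm_den = np.log(x[..., den_idx]).mean(axis=-1)
    # Normalizing constant that makes the coordinate isometric.
    return np.sqrt(r * s / (r + s)) * (log_gm_num - log_gm_den)

# Hypothetical intakes (g/day): vegetables, fruit, red meat, sweets.
intake = np.array([300.0, 200.0, 100.0, 50.0])

# Positive value: relatively more vegetables and fruit than meat and sweets.
b = balance(intake, num_idx=[0, 1], den_idx=[2, 3])

# Doubling every intake leaves the balance unchanged (only ratios matter).
b_doubled = balance(2.0 * intake, num_idx=[0, 1], den_idx=[2, 3])
```

Because the balance depends only on ratios between parts, it does not change when total intake is rescaled, which is the sense in which it expresses eating more of some foods relative to others.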

2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Junkang Zhao ◽  
Zhiyao Li ◽  
Qian Gao ◽  
Haifeng Zhao ◽  
Shuting Chen ◽  
...  

Background: Dietary pattern analysis is a promising approach to understanding the complex relationship between diet and health. While many statistical methods exist, the literature predominantly focuses on classical methods such as dietary quality scores, principal component analysis, factor analysis, clustering analysis, and reduced rank regression. There are some emerging methods that have rarely or never been reviewed or discussed adequately. Methods: This paper presents a landscape review of the existing statistical methods used to derive dietary patterns, especially the finite mixture model, treelet transform, data mining, least absolute shrinkage and selection operator and compositional data analysis, in terms of their underlying concepts, advantages and disadvantages, and available software and packages for implementation. Results: While all statistical methods for dietary pattern analysis have unique features and serve distinct purposes, emerging methods warrant more attention. However, future research is needed to evaluate these emerging methods’ performance in terms of reproducibility, validity, and ability to predict different outcomes. Conclusion: Selection of the most appropriate method mainly depends on the research questions. As an evolving subject, there is always scope for deriving dietary patterns through new analytic methodologies.


GigaScience ◽  
2019 ◽  
Vol 8 (9) ◽  
Author(s):  
Thomas P Quinn ◽  
Ionas Erb ◽  
Greg Gloor ◽  
Cedric Notredame ◽  
Mark F Richardson ◽  
...  

Background: Next-generation sequencing (NGS) has made it possible to determine the sequence and relative abundance of all nucleotides in a biological or environmental sample. A cornerstone of NGS is the quantification of RNA or DNA presence as counts. However, these counts are not counts per se: their magnitude is determined arbitrarily by the sequencing depth, not by the input material. Consequently, counts must undergo normalization prior to use. Conventional normalization methods require a set of assumptions: they assume that the majority of features are unchanged and that all environments under study have the same carrying capacity for nucleotide synthesis. These assumptions are often untestable and may not hold when heterogeneous samples are compared. Results: Methods developed within the field of compositional data analysis offer a general solution that is assumption-free and valid for all data. Herein, we synthesize the extant literature to provide a concise guide on how to apply compositional data analysis to NGS count data. Conclusions: In highlighting the limitations of total library size, effective library size, and spike-in normalizations, we propose the log-ratio transformation as a general solution to answer the question, “Relative to some important activity of the cell, what is changing?”
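
To make the point about sequencing depth concrete, here is a minimal sketch of one common log-ratio transformation, the centered log-ratio (clr), which expresses each feature relative to the geometric mean of the sample. The function name `clr` and the count values are hypothetical, and the sketch assumes all counts are strictly positive (zeros would need to be replaced beforehand).

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform along the last axis.

    Assumes strictly positive values; zero replacement is not handled here.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("clr requires strictly positive values")
    log_x = np.log(x)
    # Subtracting the mean log is the same as dividing by the geometric mean.
    return log_x - log_x.mean(axis=-1, keepdims=True)

# Hypothetical counts for four features in one sample.
sample = np.array([10.0, 40.0, 50.0, 900.0])

# The same sample sequenced three times deeper: every count is tripled.
deeper = 3.0 * sample

clr_sample = clr(sample)
clr_deeper = clr(deeper)
```

In this sketch `clr_sample` and `clr_deeper` agree up to floating-point error, so the transformed values do not depend on the arbitrary total set by sequencing depth.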


2020 ◽  
Author(s):  
Kamila Fačevicová ◽  
Tomáš Matys Grygar ◽  
Karel Hron ◽  
Jitka Elznicová

Fluvial sediment datasets, like other types of concentration-based data, are relative in nature and therefore need preprocessing or normalization prior to the main statistical analysis. In geochemical practice, several normalization methods are used, such as simple normalization of the target element concentration by the concentration of a reference (conservative, lithogenic) element, double normalization, or conversion of concentrations to local enrichment factors. As an alternative to these methods, an approach based on the principles of compositional data analysis (CoDA) can be considered. Instead of applying standard statistical methods, such as ordinary least squares regression, correlation or principal component analysis (PCA), to the raw or target-element-normalized concentrations, CoDA methods consider the relative structure of the whole dataset. CoDA, together with robust statistical methods that down-weight the influence of outlying observations, has the potential to provide more accurate results. This property is demonstrated and discussed using a dataset from mapping the sediments of the Skalka Reservoir on the Ohře River, Czech Republic, and its tributaries. In particular, the performance of robust versions of regression, correlation and principal component analysis that respect CoDA principles is presented, and the steps leading to them are explained.


Nutrients ◽  
2019 ◽  
Vol 11 (3) ◽  
pp. 582 ◽  
Author(s):  
Irene Rodríguez-Gómez ◽  
Asier Mañas ◽  
José Losa-Reyna ◽  
Leocadio Rodríguez-Mañas ◽  
Sebastien Chastin ◽  
...  

The aim of this study was to determine the relationship between bone mass (BM) and physical activity (PA) and sedentary behavior (SB) according to frailty status and sex using compositional data analysis. We analyzed 871 older people with an adequate nutritional status. Fried criteria were used to classify participants by frailty status. Time spent in SB, light intensity PA (LPA) and moderate-to-vigorous intensity PA (MVPA) was assessed by accelerometry for 7 days. BM was determined by dual-energy X-ray absorptiometry (DXA). The combined effect of PA and SB was significantly associated with BM in robust men and women (p ≤ 0.05). In relation to the other behaviors, SB was negatively associated with BM in robust men, while BM was positively associated with SB and negatively with LPA and MVPA in robust women. Moreover, LPA was also positively associated with arm BM (p ≤ 0.01). Finally, in pre-frail women, BM was positively associated with MVPA. In our sample, decreasing SB could be a good strategy to improve BM in robust men. In contrast, in pre-frail women, MVPA may be an important factor to consider regarding bone health.


2018 ◽  
Author(s):  
Thomas P. Quinn ◽  
Ionas Erb ◽  
Greg Gloor ◽  
Cedric Notredame ◽  
Mark F. Richardson ◽  
...  

Abstract: Next-generation sequencing (NGS) has made it possible to determine the sequence and relative abundance of all nucleotides in a biological or environmental sample. Today, NGS is routinely used to understand many important topics in biology from human disease to microorganism diversity. A cornerstone of NGS is the quantification of RNA or DNA presence as counts. However, these counts are not counts per se: the magnitude of the counts is determined arbitrarily by the sequencing depth, not by the input material. Consequently, counts must undergo normalization prior to use. Conventional normalization methods require a set of assumptions: they assume that the majority of features are unchanged, and that all environments under study have the same carrying capacity for nucleotide synthesis. These assumptions are often untestable and may not hold when comparing heterogeneous samples (e.g., samples collected across distinct cancers or tissues). Instead, methods developed within the field of compositional data analysis offer a general solution that is assumption-free and valid for all data. In this manuscript, we synthesize the extant literature to provide a concise guide on how to apply compositional data analysis to NGS count data. In doing so, we review zero replacement, differential abundance analysis, and within-group and between-group coordination analysis. We then discuss how this pipeline can accommodate complex study design, facilitate the analysis of vertically and horizontally integrated data, including multiomics data, and further extend to single-cell sequencing data. In highlighting the limitations of total library size, effective library size, and spike-in normalizations, we propose the log-ratio transformation as a general solution to answer the question, “Relative to some important activity of the cell, what is changing?” Taken together, this manuscript establishes the first fully comprehensive analysis protocol that is suitable for any and all -omics data.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Liu Zhang ◽  
Hongjuan Li ◽  
Yimin Zhang ◽  
Zhenxing Kong ◽  
Ting Zhang ◽  
...  

The purpose of this study was to investigate the relationship between body composition and bone mineral density (BMD), and the effect of composition substitution, among Chinese children and adolescents without the influence of multicollinearity. A dual-energy X-ray absorptiometry scan was used to determine the amount of truncal fat (TF), nontruncal fat (NTF), fat-free mass (FFM), and BMD. Compositional data analysis and compositional proportional substitution analysis were conducted to determine the effect of each part of the body composition on BMD and its substitution effects. A total of 466 participants (boys: 51.9%) completed this cross-sectional study. For girls, in the overweight group, the relationship between TF and BMD was positive (β = 2.943e−01, p = 0.006), while NTF showed the opposite trend (β = −2.358e−01, p = 0.009). When 4% NTF or FFM was substituted by TF, BMD increased by about 0.1 and 0.05 units (p < 0.05), respectively. For boys, the association between FFM and BMD was statistically positive (β = 4.091e−02, p = 0.0001). There was a positive correlation between TF and BMD (β = 7.963e−02, p = 0.036), but with increasing BMI this correlation shifted in the opposite direction. In conclusion, compared to TF and NTF, FFM had a better protective effect on BMD, especially for boys. The risk of NTF accumulation on BMD was greater than that of TF accumulation. Compared with girls, boys were more sensitive to the amount of TF.


Author(s):  
Dorothea Dumuid ◽  
Željko Pedišić ◽  
Javier Palarea-Albaladejo ◽  
Josep Antoni Martín-Fernández ◽  
Karel Hron ◽  
...  

In recent years, the focus of activity behavior research has shifted away from univariate paradigms (e.g., physical activity, sedentary behavior and sleep) to a 24-h time-use paradigm that integrates all daily activity behaviors. Behaviors are analyzed relative to each other, rather than as individual entities. Compositional data analysis (CoDA) is increasingly used for the analysis of time-use data because it is intended for data that convey relative information. While CoDA has brought new understanding of how time use is associated with health, it has also raised challenges in how this methodology is applied, and how the findings are interpreted. In this paper we provide a brief overview of CoDA for time-use data, summarize current CoDA research in time-use epidemiology and discuss challenges and future directions. We use 24-h time-use diary data from Wave 6 of the Longitudinal Study of Australian Children (birth cohort, n = 3228, aged 10.9 ± 0.3 years) to demonstrate descriptive analyses of time-use compositions and how to explore the relationship between daily time use (sleep, sedentary behavior and physical activity) and a health outcome (in this example, adiposity). We illustrate how to comprehensively interpret the CoDA findings in a meaningful way.
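
A descriptive step of the kind mentioned above can be sketched as follows: the usual CoDA summary of "average" time use is the compositional mean, obtained by taking the geometric mean of each behavior across participants and rescaling (closing) the result to the daily total. The helper names and the minutes below are hypothetical and are not drawn from the Longitudinal Study of Australian Children.

```python
import numpy as np

def closure(x, total=1.0):
    """Rescale parts along the last axis so that they sum to `total`."""
    x = np.asarray(x, dtype=float)
    return total * x / x.sum(axis=-1, keepdims=True)

def compositional_mean(X, total=1.0):
    """Closed geometric mean of the rows of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    if np.any(X <= 0):
        raise ValueError("all parts must be strictly positive")
    # Geometric mean per column, computed on the log scale.
    geo_mean = np.exp(np.log(X).mean(axis=0))
    return closure(geo_mean, total)

# Hypothetical minutes per day: sleep, sedentary behavior, physical activity.
days = np.array([
    [540.0, 600.0, 300.0],
    [600.0, 660.0, 180.0],
    [570.0, 630.0, 240.0],
])

# Compositional mean expressed in minutes of a 24-h day (1440 min).
center = compositional_mean(days, total=1440.0)
```

Unlike arithmetic means computed separately for each behavior, the compositional mean is calculated jointly on the ratios between behaviors and is closed so that its parts sum to the full 24 hours.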

