Phenomics data processing: A plot-level model for repeated measurements to extract the timing of key stages and quantities at defined time points

2021 ◽  
Author(s):  
Lukas Roth ◽  
María Xosé Rodríguez-Álvarez ◽  
Fred van Eeuwijk ◽  
Hans-Peter Piepho ◽  
Andreas Hund

Decision-making in breeding increasingly depends on the ability to capture and predict crop responses to changing environmental factors. Advances in crop modeling as well as high-throughput field phenotyping (HTFP) hold promise to provide such insights. Processing HTFP data is an interdisciplinary task that requires broad knowledge of experimental design, measurement techniques, feature extraction, dynamic trait modeling, and prediction of genotypic values using statistical models. To get an overview of sources of variation in HTFP, we develop a general plot-level model for repeated measurements. Based on this model, we propose a seamless stage-wise process that allows estimated means and variances to be carried forward from stage to stage and approximates the gold standard of a single-stage analysis. The process builds on the extraction of three intermediate trait categories: (1) timing of key stages, (2) quantities at defined time points or periods, and (3) dose-response curves. In a first stage, these intermediate traits are extracted from low-level traits' time series (e.g., canopy height) using P-splines and the quarter of maximum elongation rate method (QMER), as well as final height percentiles. In a second and third stage, extracted traits are further processed using a stage-wise linear mixed model analysis. Using a wheat canopy growth simulation to generate canopy height time series, we demonstrate the suitability of the stage-wise process for traits of the first two above-mentioned categories. Results indicate that, for the first stage, the P-spline/QMER method was more robust than the percentile method. In the subsequent two-stage linear mixed model processing, weighting the second and third stage with error variance estimates from the previous stages improved the root mean squared error. We conclude that processing phenomics data in stages represents a feasible approach if appropriate weighting is used through all stages. P-splines in combination with the QMER method are suitable tools to extract timing of key stages and quantities at defined time points from HTFP data.
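
The idea of carrying error variances from one stage to the next can be illustrated with plain inverse-variance weighting. The sketch below is a deliberate simplification and not the authors' stage-wise mixed model: the function name `weighted_mean` and the example numbers are hypothetical, and a real analysis would fit a weighted linear mixed model instead.

```python
def weighted_mean(estimates, variances):
    """Combine stage-one estimates using inverse-variance weights.

    Returns the weighted mean and the variance of that mean.
    Simplified stand-in for weighting a later stage with error
    variances estimated in an earlier stage.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, estimates)) / total
    return mean, 1.0 / total


# Hypothetical plot-level trait estimates and their error variances.
estimates = [10.0, 12.0, 11.0]
variances = [1.0, 4.0, 1.0]

mean, var = weighted_mean(estimates, variances)
print(f"weighted mean = {mean:.3f}, variance = {var:.3f}")
```

The noisier second estimate (variance 4.0) receives a quarter of the weight of the others, so it pulls the combined value less than it would in an unweighted average.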

Plants ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 362
Author(s):  
Ioannis Spyroglou ◽  
Jan Skalák ◽  
Veronika Balakhonova ◽  
Zuzana Benedikty ◽  
Alexandros G. Rigas ◽  
...  

Plants adapt to continual changes in environmental conditions throughout their life spans. High-throughput phenotyping methods have been developed to noninvasively monitor the physiological responses to abiotic/biotic stresses on a scale spanning a long time, covering most of the vegetative and reproductive stages. However, some of the physiological events comprise almost immediate and very fast responses towards the changing environment which might be overlooked in long-term observations. Additionally, there are certain technical difficulties and restrictions in analyzing phenotyping data, especially when dealing with repeated measurements. In this study, a method for comparing means at different time points using generalized linear mixed models combined with classical time series models is presented. As an example, we use multiple chlorophyll time series measurements from different genotypes. The use of additional time series models as random effects is essential as the residuals of the initial mixed model may contain autocorrelations that bias the result. The nature of mixed models offers a viable solution as these can incorporate time series models for residuals as random effects. The results from analyzing chlorophyll content time series show that the autocorrelation is successfully eliminated from the residuals and incorporated into the final model. This allows the use of statistical inference.
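
Whether residuals still carry autocorrelation can be checked with a simple diagnostic before and after adding a time series component. The following is a minimal sketch of a lag-1 sample autocorrelation, not the authors' procedure; the function name `lag1_autocorrelation` and the example residuals are hypothetical.

```python
def lag1_autocorrelation(residuals):
    """Lag-1 sample autocorrelation of a residual series.

    Values near 0 suggest little serial dependence; values far from 0
    suggest the residuals still carry time structure.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    num = sum(centered[i] * centered[i + 1] for i in range(n - 1))
    return num / denom


# Hypothetical residuals with a steady trend (positively autocorrelated).
trending = [1.0, 2.0, 3.0, 4.0, 5.0]
print(lag1_autocorrelation(trending))
```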


2021 ◽  
Author(s):  
Miriam Sieg ◽  
Lina Katrin Sciesielski ◽  
Karin Kirschner ◽  
Jochen Kruppa

Abstract Background: In longitudinal studies, observations are made over time. Hence, the single observations at each time point are dependent, making them a repeated measurement. In this work, we explore a different, counterintuitive setting: At each developmental time point, a lethal observation is performed on the pregnant or nursing mother. Therefore, the single time points are independent. Furthermore, the observations in the offspring at each time point are correlated with each other because each litter consists of several (genetically linked) littermates. In addition, the observed time series is short from a statistical perspective as animal ethics prevent killing more mother mice than absolutely necessary, and murine development is short anyway. We solve these challenges by using multiple contrast tests and visualizing the change point by the use of confidence intervals. Results: We used linear mixed models to model the variability of the mother. The estimates from the linear mixed model are then used in multiple contrast tests. There are a variety of contrasts and intuitively, we would use the Changepoint method. However, it does not deliver satisfying results. Interestingly, we found two other contrasts, both capable of answering different research questions in change point detection: i) Should a single point with change direction be found, or ii) Should the overall progression be determined? The Sequen contrast answers the first, the McDermott the second. Confidence intervals deliver effect estimates for the strength of the potential change point. Therefore, the scientist can define a biologically relevant limit of change depending on the research question. Conclusion: We present a solution with effect estimates for short independent time series with observations nested at a given time point. Multiple contrast tests produce confidence intervals, which allow the position of change points to be determined or the expression course over time to be visualized. We suggest using McDermott’s method to determine if there is an overall significant change within the time frame, while Sequen is better at determining specific change points. In addition, we offer a short formula for the estimation of the maximal length of the time series.


2014 ◽  
Vol 2014 ◽  
pp. 1-8 ◽  
Author(s):  
Michael Schneider ◽  
Anne Engel ◽  
Peter A. Fasching ◽  
Lothar Häberle ◽  
Elisabeth B. Binder ◽  
...  

Purpose. The aim of this study was to investigate whether single nucleotide polymorphisms (SNPs) in genes of the stress hormone signaling pathway, specifically FKBP5, NR3C1, and CRHR1, are associated with depressive symptoms during and after pregnancy. Methods. The Franconian Maternal Health Evaluation Study (FRAMES) recruited healthy pregnant women prospectively for the assessment of maternal and fetal health including the assessment of depressiveness. The German version of the 10-item Edinburgh Postnatal Depression Scale (EPDS) was completed at three time points in this prospective cohort study. Visit 1 was at study entry in the third trimester of the pregnancy, visit 2 was shortly after birth, and visit 3 was 6–8 months after birth. Germline DNA was collected from 361 pregnant women. Nine SNPs in the above mentioned genes were genotyped. After construction of haplotypes for each gene, a multifactorial linear mixed model was performed to analyse the depression values over time. Results. EPDS values were within expected ranges and comparable to previously published studies. Neither did the depression scores differ for comparisons among haplotypes at fixed time points nor did the change over time differ among haplotypes for the examined genes. No haplotype showed significant associations with depressive symptoms severity during pregnancy or the postpartum period. Conclusion. The analysed candidate haplotypes in FKBP5, NR3C1, and CRHR1 did not show an association with depression scores as assessed by EPDS in this cohort of healthy unselected pregnant women.


2021 ◽  
pp. 0272989X2110038
Author(s):  
Felix Achana ◽  
Daniel Gallacher ◽  
Raymond Oppong ◽  
Sungwook Kim ◽  
Stavros Petrou ◽  
...  

Economic evaluations conducted alongside randomized controlled trials are a popular vehicle for generating high-quality evidence on the incremental cost-effectiveness of competing health care interventions. Typically, in these studies, resource use (and by extension, economic costs) and clinical (or preference-based health) outcomes data are collected prospectively for trial participants to estimate the joint distribution of incremental costs and incremental benefits associated with the intervention. In this article, we extend the generalized linear mixed-model framework to enable simultaneous modeling of multiple outcomes of mixed data types, such as those typically encountered in trial-based economic evaluations, taking into account correlation of outcomes due to repeated measurements on the same individual and other clustering effects. We provide new wrapper functions to estimate the models in Stata and R by maximum and restricted maximum quasi-likelihood and compare the performance of the new routines with alternative implementations across a range of statistical programming packages. Empirical applications using observed and simulated data from clinical trials suggest that the new methods produce broadly similar results as compared with Stata’s merlin and gsem commands and a Bayesian implementation in WinBUGS. We highlight that, although these empirical applications primarily focus on trial-based economic evaluations, the new methods presented can be generalized to other health economic investigations characterized by multivariate hierarchical data structures.


Author(s):  
Yuhua Chen ◽  
Hainan Wu ◽  
Wenguo Yang ◽  
Wei Zhao ◽  
Chunfa Tong

Abstract With the advances in high-throughput sequencing technologies, it is not difficult to extract tens of thousands of single nucleotide polymorphisms (SNPs) across many individuals in a fast and cheap way, making it possible to perform genome-wide association studies (GWAS) of quantitative traits in outbred forest trees. It is very valuable to apply traditional breeding experiments in GWAS for identifying genome variants associated with ecologically and economically important traits in Populus. Here, we report a GWAS of tree height measured at multiple time points from a randomized complete block design (RCBD), which was established with clones from an F1 hybrid population of Populus deltoides and Populus simonii. A total of 22,670 SNPs across 172 clones in the RCBD were obtained with restriction site-associated DNA sequencing (RADseq) technology. The multivariate linear mixed model was applied by incorporating the pedigree relationship matrix of individuals to test the association of each SNP with the tree heights over 8 time points. Consequently, 41 SNPs were identified as significantly associated with tree height under the p-value threshold determined by Bonferroni correction at the significance level of 0.01. These SNPs were distributed on all but 2 chromosomes (Chr02 and Chr18) and each explained between 0.26% and 2.64% of the phenotypic variance, amounting to 63.68% in total. Comparisons with previous mapping studies for poplar height, as well as the candidate genes of the detected SNPs, were also investigated. We therefore demonstrated that applying the multivariate linear mixed model to longitudinal phenotypic data from a traditional breeding experimental design made it possible to identify far more genome-wide variants for tree height in poplar. The significant SNPs identified in this study would enhance understanding of the molecular mechanisms of growth traits and would accelerate marker-assisted breeding programs in Populus.
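
The Bonferroni threshold mentioned in the abstract can be worked out directly. This is a small sketch that assumes one test per SNP (22,670 tests) at the stated significance level of 0.01; the abstract does not spell out the exact test count, so the resulting threshold is illustrative, and `bonferroni_threshold` is a hypothetical helper name.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold under Bonferroni correction."""
    return alpha / n_tests


# Assumption: one association test per SNP, as in a typical GWAS scan.
threshold = bonferroni_threshold(0.01, 22670)
print(f"per-SNP threshold = {threshold:.3e}")
```

Under that assumption a SNP would need a p-value below roughly 4.4e-7 to be declared significant.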


2017 ◽  
Author(s):  
Liat Shenhav ◽  
Ori Furman ◽  
Leah Briscoe ◽  
Michael Thompson ◽  
Itzhak Mizrahi ◽  
...  

Abstract Given the highly dynamic and complex nature of the human gut microbial community, the ability to identify and predict time-dependent compositional patterns of microbes is crucial to our understanding of the structure and function of this ecosystem. One factor that could affect such time-dependent patterns is microbial interactions, wherein community composition at a given time point affects the microbial composition at a later time point. However, the field has not yet settled on the degree of this effect. Specifically, it has been recently suggested that only a minority of the operational taxonomic units (OTUs) depend on the microbial composition in earlier times. To address the issue of identifying and predicting temporal microbial patterns we developed a new model, MTV-LMM (Microbial Temporal Variability Linear Mixed Model), a linear mixed model for the prediction of the microbial community temporal dynamics based on the community composition at previous time stamps. MTV-LMM can identify time-dependent microbes in time series datasets, which can then be used to analyze the trajectory of the microbiome over time. We evaluated the performance of MTV-LMM on three human microbiome time series datasets, and found that MTV-LMM significantly outperforms all existing methods for microbiome time series modeling. Particularly, we demonstrate that the effect of the microbial composition in previous time points on the abundance levels of an OTU at a later time point is underestimated by a factor of at least 10 when applying previous approaches. Using MTV-LMM, we demonstrate that a considerable proportion of the human gut microbiome, both in infants and adults, has a significant time-dependent component that can be predicted based on microbiome composition in earlier time points. This suggests that microbiome composition at a given time point is a major factor in defining future microbiome composition and that this phenomenon is considerably more common than previously reported for the human gut microbiome.


CJEM ◽  
2019 ◽  
Vol 21 (S1) ◽  
pp. S46-S47
Author(s):  
A. Cournoyer ◽  
S. Cossette ◽  
J. Paquet ◽  
R. Daoust ◽  
M. Marquis ◽  
...  

Introduction: Near-infrared spectroscopy (NIRS) can be used to monitor the oxygen saturation of hemoglobin in any given superficial tissue. However, the measurements provided by different oximeters can vary considerably. Little is known about the specific patient characteristics that could affect the inter-device agreement of tissue oximeters. This study aimed to evaluate the association between the quantity of subcutaneous fat (assessed by skinfold thickness) and the inter-device agreement of two tissue oximeters, the INVOS 5100c and the Equanox 7600. Methods: In this prospective cohort study, tissue saturations and skinfold thickness were measured at four different sites on both sides of the body in healthy adult (≥18 years old) volunteers. The association between the quantity of subcutaneous fat (assessed by skinfold thickness) and the inter-device agreement (absolute difference between the oximetry values provided by the two oximeters) was first assessed with a Pearson's correlation and a scatter plot. Subsequently, a linear mixed model was used to evaluate the impact of the subcutaneous fat and other covariables (age, sex) on the inter-device agreement while adjusting for the repeated measurements across different sites for the same volunteers. Results: From January to March 2015, 53 healthy volunteers were included in this study with ages ranging between 20 and 81 years old, on whom a total of 848 measures were taken. Higher skinfold measures were associated with an increase in the difference between measures provided by both oximeters (slope = -0.59, Pearson correlation coefficient = -0.51, p < 0.001). This observed association persisted in a linear mixed model (-0.48 [95% confidence interval {CI} -0.61 to -0.36], p < 0.001). The sex of the volunteers also influenced the inter-oximeter agreement (women: -5.77 [95% CI -8.43 to -3.11], p < 0.001), as well as the forearm sites (left forearm: -7.16 [95% CI -9.85 to -4.47], p < 0.001; right forearm: -7.01 [95% CI -9.61 to -4.40], p < 0.001). Conclusion: The quantity of subcutaneous fat, as well as the sex of the volunteers and the measurement sites, impacted the inter-device agreement of two commonly used oximeters. Given these findings, monitoring using tissue oximetry should be interpreted with great care when there is a significant quantity of subcutaneous fat.


2020 ◽  
pp. 1471082X2093601
Author(s):  
Mirko Signorelli ◽  
Pietro Spitali ◽  
Roula Tsonaka

We present a new modelling approach for longitudinal overdispersed counts that is motivated by the increasing availability of longitudinal RNA-sequencing experiments. The distribution of RNA-seq counts typically exhibits overdispersion, zero-inflation and heavy tails; moreover, in longitudinal designs repeated measurements from the same subject are typically (positively) correlated. We propose a generalized linear mixed model based on the Poisson–Tweedie distribution that can flexibly handle each of the aforementioned features of longitudinal overdispersed counts. We develop a computational approach to accurately evaluate the likelihood of the proposed model and to perform maximum likelihood estimation. Our approach is implemented in the R package ptmixed, which can be freely downloaded from CRAN. We assess the performance of ptmixed on simulated data, and we present an application to a dataset with longitudinal RNA-sequencing measurements from healthy and dystrophic mice. The applicability of the Poisson–Tweedie mixed-effects model is not restricted to longitudinal RNA-seq data, but it extends to any scenario where non-independent measurements of a discrete overdispersed response variable are available.


2016 ◽  
Vol 28 (8) ◽  
pp. 907-919 ◽  
Author(s):  
Esther O. W. Chow ◽  
Kelvin K. W. Yau

Purpose: This article assessed the effectiveness of social networking strategies (networking strategic initiative [NSI]) to overcome stressful life events experienced in normal aging in Hong Kong. Method: A three-wave quasi-experimental panel design with an overall sample consisting of n = 288 Chinese elderly placed into two groups: NSI group: n1 = 175 and comparison group: n2 = 113. Face-to-face structured interviews were conducted for over 30 months. Five outcome measures including subjective well-being, self-esteem, locus of control, sense of belonging, and collective power were investigated, using a generalized linear mixed model for repeated measurements. Results: Findings revealed those who were continuously active throughout the intervention period experienced considerable increases in self-esteem and sense of belonging. Conclusion: No appreciable effects on any of the five outcome measures were found for those who were enrolled in the program and were inactive. The findings provide significant implications for future practice with community-dwelling elderly Chinese populations in Hong Kong and elsewhere.


Author(s):  
E O’ Connor ◽  
F M McGovern ◽  
D T Byrne ◽  
T M Boland ◽  
E Dunne ◽  
...  

Abstract Portable accumulation chambers (PAC) enable gaseous emissions from small ruminants to be measured over a 50 min period; to date, however, the repeatability of consecutive days of measurement in the PAC has not been investigated. The objectives of this study were to: 1) investigate the repeatability of consecutive days of gaseous measurements in the PAC, 2) determine the number of days required to achieve precise gaseous measurements, and 3) develop a prediction equation for gaseous emissions in sheep. A total of 48 ewe lambs (c. 10 to 11 mo of age) were randomly divided into four measurement groups each day, for 17 consecutive days. Gaseous measurements were conducted between 0800 h and 1200 h daily. Animals were removed from perennial ryegrass silage for at least 1 h before measurements in the PAC and animals were assigned randomly to each of the 12 chambers. Methane (CH4; ppm) concentration, oxygen (O2; percentage) and carbon dioxide (CO2; percentage) were measured at 3 time points (0, 25, and 50 min after entry of the first animal into the first chamber). To quantify the effect of animal and day variation on gaseous emissions, between-animal, between-day and error variances were calculated for each gaseous measurement using a linear mixed model. The number of days required to gain a certain precision (defined as the 95% confidence interval (CI) range) for each gaseous measurement was also calculated. For all 3 gases the between-day variance (39% to 40%) accounted for a larger proportion of total variance compared to between-animal variance, while the repeatability of 17 consecutive days of measurement was 0.36, 0.31 and 0.23 for CH4, CO2 and O2, respectively. Correlations between consecutive days of measurement were strong for all 3 gases; the strongest correlation between d 1 and the remaining days for CH4, CO2 and O2 was 0.71 (d 1 and d 6), 0.77 (d 1 and d 2) and 0.83 (d 1 and d 5), respectively. A high level of precision was achieved when gaseous measurements from PAC were taken over 3 consecutive days. The prediction equation over-estimated gaseous production for all 3 gases: the correlations between actual and predicted gaseous output ranged from 0.67 to 0.71, with the r² ranging from 0.45 to 0.71. Results from this study will aid the refinement of the protocol for the measurement of gaseous emissions in sheep using the PAC.
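
The relationship between variance components, repeatability, and the number of measurement days can be sketched as follows. The definitions used here are assumptions, since the abstract does not give its formulas: repeatability is taken as the between-animal share of total variance, and the 95% CI range of an animal's n-day mean is computed from the error variance alone with a normal approximation. The function names and variance components are hypothetical and not taken from the study.

```python
from math import sqrt


def repeatability(var_animal, var_day, var_error):
    """Share of total variance attributable to between-animal differences."""
    return var_animal / (var_animal + var_day + var_error)


def ci_range(var_error, n_days, z=1.96):
    """Width of the 95% CI for an animal's mean over n_days measurements.

    Assumption: only the error variance shrinks with more days.
    """
    return 2.0 * z * sqrt(var_error / n_days)


# Hypothetical variance components, scaled to sum to 1.
var_animal, var_day, var_error = 0.36, 0.40, 0.24

print(f"repeatability = {repeatability(var_animal, var_day, var_error):.2f}")
for n in (1, 3, 17):
    print(f"{n:2d} day(s): CI range = {ci_range(var_error, n):.3f}")
```

Because the CI range scales with one over the square root of the number of days, most of the gain in precision comes from the first few additional days of measurement.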

