Lineage EM Algorithm for Inferring Latent States from Cellular Lineage Trees

2018 ◽  
Author(s):  
So Nakashima ◽  
Yuki Sughiyama ◽  
Tetsuya J. Kobayashi

Phenotypic variability in a population of cells can serve as bet-hedging under an unpredictably changing environment, a typical example of which is bacterial persistence. To understand the strategies that control such phenomena, it is indispensable to identify the phenotype of each cell and how it is inherited. Although recent advances in microfluidic technology provide useful lineage data, these data are insufficient to identify the phenotypes of cells directly. An alternative approach is to infer the phenotype from lineage data by latent-variable estimation. To this end, however, we must resolve a bias problem in inference from lineages, called survivorship bias. In this work, we clarify how survivorship bias distorts statistical estimation. We then propose a latent-variable estimation algorithm for lineage trees that is free of survivorship bias, based on the expectation-maximization (EM) algorithm, which we call the Lineage EM algorithm (LEM). LEM provides a statistical method for identifying cell traits that is applicable to various kinds of lineage data.
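As a rough illustration of why counting cells over a lineage tree can mislead, the sketch below compares the long-run fraction of a fast-dividing type along a single lineage with its fraction across a whole growing tree. This is a toy two-type model and not the LEM algorithm or the authors' model: the switching probability, the offspring numbers, and all function names are assumptions chosen for illustration.

```python
# Toy two-type branching model; all numbers below are illustrative assumptions.
SWITCH = 0.1                      # probability a daughter differs from its mother
T = [[1 - SWITCH, SWITCH],        # T[i][j]: P(daughter type j | mother type i)
     [SWITCH, 1 - SWITCH]]
OFFSPRING = [2.0, 1.0]            # mean daughters per generation: type 0 fast, type 1 slow


def step(weights, growth):
    """Propagate expected type weights one generation and renormalize."""
    new = [sum(weights[i] * growth[i] * T[i][j] for i in range(2)) for j in range(2)]
    total = sum(new)
    return [w / total for w in new]


def long_run_fraction(growth, generations=200):
    """Long-run fraction of each type when type i leaves growth[i] daughters."""
    weights = [0.5, 0.5]
    for _ in range(generations):
        weights = step(weights, growth)
    return weights


single_lineage = long_run_fraction([1.0, 1.0])   # follow one daughter: plain Markov chain
whole_tree = long_run_fraction(OFFSPRING)        # count every cell in the tree

print(f"fast-type fraction along one lineage: {single_lineage[0]:.3f}")
print(f"fast-type fraction across the tree:   {whole_tree[0]:.3f}")
```

Under these assumed numbers the fast type makes up half of the cells along a single lineage but roughly 0.82 of the cells in the tree, so frequencies read naively off the tree overstate it. The abstract's point is that an estimator has to correct for this kind of over-representation.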

2020 ◽  
Vol 36 (9) ◽  
pp. 2829-2838 ◽  
Author(s):  
So Nakashima ◽  
Yuki Sughiyama ◽  
Tetsuya J Kobayashi

Summary: Phenotypic variability in a population of cells can serve as bet-hedging under an unpredictably changing environment, a typical example of which is bacterial persistence. To understand the strategies that control such phenomena, it is indispensable to identify the phenotype of each cell and how it is inherited. Although recent advances in microfluidic technology provide useful lineage data, these data are insufficient to identify the phenotypes of cells directly. An alternative approach is to infer the phenotype from lineage data by latent-variable estimation. To this end, however, we must resolve a bias problem in inference from lineages, called survivorship bias. In this work, we clarify how survivorship bias distorts statistical estimation. We then propose a latent-variable estimation algorithm for lineage trees that is free of survivorship bias, based on the expectation–maximization (EM) algorithm, which we call the lineage EM algorithm (LEM). LEM provides a statistical method for identifying cell traits that is applicable to various kinds of lineage data.

Availability and implementation: An implementation of LEM is available at https://github.com/so-nakashima/Lineage-EM-algorithm.

Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Erika E Kuchen ◽  
Nils Becker ◽  
Nina Claudino ◽  
Thomas Höfer

Mammalian cell proliferation is controlled by mitogens. However, how proliferation is coordinated with cell growth is poorly understood. Here we show that statistical properties of cell lineage trees – the cell-cycle length correlations within and across generations – reveal how cell growth controls proliferation. Analyzing extended lineage trees with latent-variable models, we find that two antagonistic heritable variables account for the observed cycle-length correlations. Using molecular perturbations of mTOR and MYC we identify these variables as cell size and regulatory license to divide, which are coupled through a minimum-size checkpoint. The checkpoint is relevant only for fast cell cycles, explaining why growth control of mammalian cell proliferation has remained elusive. Thus, correlated fluctuations of the cell cycle encode its regulation.


2020 ◽  
Vol 117 (29) ◽  
pp. 17240-17248 ◽  
Author(s):  
Sonali Chaturvedi ◽  
Jonathan Klein ◽  
Noam Vardi ◽  
Cynthia Bolovan-Fritts ◽  
Marie Wolf ◽  
...  

Probabilistic bet hedging, a strategy to maximize fitness in unpredictable environments by matching phenotypic variability to environmental variability, is theorized to account for the evolution of various fate-specification decisions, including viral latency. However, the molecular mechanisms underlying bet hedging remain unclear. Here, we report that large variability in protein abundance within individual herpesvirus virion particles enables probabilistic bet hedging between viral replication and latency. Superresolution imaging of individual virions of the human herpesvirus cytomegalovirus (CMV) showed that virion-to-virion levels of pp71 tegument protein—the major viral transactivator protein—exhibit extreme variability. This super-Poissonian tegument variability promoted alternate replicative strategies: high virion pp71 levels enhance viral replicative fitness but, strikingly, impede silencing, whereas low virion pp71 levels reduce fitness but promote silencing. Overall, the results indicate that stochastic tegument packaging provides a mechanism enabling probabilistic bet hedging between viral replication and latency.


Entropy ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. 1107
Author(s):  
Carlotta Langer ◽  
Nihat Ay

Complexity measures in the context of the Integrated Information Theory of consciousness try to quantify the strength of the causal connections between different neurons. This is done by minimizing the KL-divergence between a full system and one without causal cross-connections. Various measures have been proposed and compared in this setting. We will discuss a class of information geometric measures that aim at assessing the intrinsic causal cross-influences in a system. One promising candidate among these measures, denoted by Φ_CIS, is based on conditional independence statements and satisfies all of the properties that have been postulated as desirable. Unfortunately, it does not have a graphical representation, which makes it less intuitive and difficult to analyze. We propose an alternative approach using a latent variable, which models a common exterior influence. This leads to a measure Φ_CII, Causal Information Integration, that satisfies all of the required conditions. Our measure can be calculated using an iterative information geometric algorithm, the em-algorithm. Therefore we are able to compare its behavior to existing integrated information measures.


2002 ◽  
Vol 27 (3) ◽  
pp. 291-317 ◽  
Author(s):  
Natasha Rossi ◽  
Xiaohui Wang ◽  
James O. Ramsay

The methods of functional data analysis are used to estimate item response functions (IRFs) nonparametrically. The EM algorithm is used to maximize the penalized marginal likelihood of the data. The penalty controls the smoothness of the estimated IRFs, and is chosen so that, as the penalty is increased, the estimates converge to shapes closely represented by the three-parameter logistic family. The one-dimensional latent trait model is recast as a problem of estimating a space curve or manifold, and, expressed in this way, the model no longer involves any latent constructs, and is invariant with respect to choice of latent variable. Some results from differential geometry are used to develop a data-anchored measure of ability and a new technique for assessing item discriminability. Functional data-analytic techniques are used to explore the functional variation in the estimated IRFs. Applications involving simulated and actual data are included.


2019 ◽  
Vol 40 (1) ◽  
pp. 22-30
Author(s):  
Xin Liu ◽  
Hang Zhang ◽  
Pengbo Zhu ◽  
Xianqiang Yang ◽  
Zhiwei Du

Purpose: This paper aims to investigate an identification strategy for the nonlinear state-space model (SSM) in the presence of an unknown output time-delay. The equations to estimate the unknown model parameters and the output time-delay are derived simultaneously in the proposed strategy.

Design/methodology/approach: The unknown integer-valued time-delay is treated as a latent variable that is uniformly distributed over a range known a priori. The unknown time-delay and the model parameters are both estimated using the Expectation-Maximization (EM) algorithm, which performs well on latent-variable problems. Moreover, a particle filter (PF) with an unknown time-delay is introduced to calculate the Q-function of the EM algorithm.

Findings: Although many effective approaches for nonlinear SSM identification have been developed in the literature, most of them do not consider the problem of time-delay. Time-delays commonly exist in industrial scenarios and can cause extra difficulties for industrial process modeling. The problem of unknown output time-delay is considered in this paper, and the validity of the proposed approach is demonstrated through a numerical example and a two-link manipulator system.

Originality/value: This work puts forward a novel approach that uses the EM algorithm to identify the nonlinear SSM in the presence of an unknown output time-delay.
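The idea of treating an integer delay as a uniformly distributed latent variable can be sketched on a much simpler problem than the paper's. The example below assumes a static linear model y[t] = theta * u[t - d] + noise with known noise level. It does not use a nonlinear SSM or a particle filter, and every name and constant in it is hypothetical.

```python
import math
import random

# Synthetic data; the model y[t] = theta * u[t - d] + noise and every constant
# here are assumptions made for illustration only.
rng = random.Random(0)
TRUE_THETA, TRUE_DELAY, MAX_DELAY, NOISE_SD, N = 2.0, 3, 6, 0.1, 200
u = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [0.0] * N
for t in range(MAX_DELAY, N):
    y[t] = TRUE_THETA * u[t - TRUE_DELAY] + rng.gauss(0.0, NOISE_SD)

times = range(MAX_DELAY, N)
delays = range(MAX_DELAY + 1)     # candidate delays 0..MAX_DELAY, uniform prior


def e_step(theta):
    """Posterior over the delay given theta (the uniform prior cancels out)."""
    log_w = [-sum((y[t] - theta * u[t - d]) ** 2 for t in times) / (2 * NOISE_SD ** 2)
             for d in delays]
    top = max(log_w)              # subtract the max for numerical stability
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]


def m_step(post):
    """Posterior-weighted least squares for theta (maximizes the Q-function)."""
    num = sum(p * sum(u[t - d] * y[t] for t in times) for d, p in zip(delays, post))
    den = sum(p * sum(u[t - d] ** 2 for t in times) for d, p in zip(delays, post))
    return num / den


theta = 1.0                       # deliberately wrong starting value
for _ in range(20):
    posterior = e_step(theta)
    theta = m_step(posterior)

delay_hat = max(delays, key=lambda d: posterior[d])
print(delay_hat, round(theta, 2))
```

In this toy setting the Q-function has a closed form, so the M-step is a weighted least-squares update. In the paper's nonlinear setting the abstract states that a particle filter is used to calculate the Q-function instead.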


2019 ◽  
Vol 49 (1) ◽  
pp. 117-146
Author(s):  
Rexford M. Akakpo ◽  
Michelle Xia ◽  
Alan M. Polansky

In insurance underwriting, misrepresentation is a type of insurance fraud in which an applicant purposely makes a false statement on a risk factor in a way that may lower his or her cost of insurance. In the insurance ratemaking context, we propose to use the expectation-maximization (EM) algorithm to perform maximum likelihood estimation of the regression effects and the prevalence of misrepresentation for the misrepresentation model proposed by Xia and Gustafson [(2016) The Canadian Journal of Statistics, 44, 198–218]. To apply the EM algorithm, the unobserved status of misrepresentation is treated as a latent variable in the complete-data likelihood function. We derive the iterative formulas for the EM algorithm and obtain the analytical form of the Fisher information matrix for frequentist inference on the parameters of interest for lognormal losses. We implement the algorithm and demonstrate that valid inference can be obtained on the risk effect despite the unobserved status of misrepresentation. Applying the proposed algorithm, we perform a loss severity analysis with the Medical Expenditure Panel Survey data. The analysis reveals not only the potential impact misrepresentation may have on the risk effect but also statistical evidence on the presence of misrepresentation in the self-reported insurance status.
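A stripped-down version of the latent-status idea can be sketched as a two-component mixture. The example assumes that, among applicants who all report no risk factor, log-losses are normal with one mean for truthful applicants and a higher mean for those who misrepresent, with a shared and known standard deviation. This is not the regression model of Xia and Gustafson, and the data, names, and constants are synthetic assumptions.

```python
import math
import random

# Synthetic log-losses for applicants who all REPORT "no risk factor".
# A hidden fraction actually has the risk factor; all constants are assumptions.
rng = random.Random(1)
TRUE_PREV, MU_HONEST, MU_MISREP, SD, N = 0.3, 0.0, 3.0, 1.0, 2000
log_loss = [rng.gauss(MU_MISREP if rng.random() < TRUE_PREV else MU_HONEST, SD)
            for _ in range(N)]


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def em(data, sd, iterations=200):
    """EM for a two-component normal mixture with a shared, known sd."""
    mean = sum(data) / len(data)
    prev, mu0, mu1 = 0.5, mean - 1.0, mean + 1.0   # crude starting values
    for _ in range(iterations):
        # E-step: posterior probability that each applicant misrepresented.
        resp = []
        for x in data:
            a = prev * normal_pdf(x, mu1, sd)
            b = (1 - prev) * normal_pdf(x, mu0, sd)
            resp.append(a / (a + b))
        # M-step: update the prevalence and the two component means.
        total = sum(resp)
        prev = total / len(data)
        mu1 = sum(r * x for r, x in zip(resp, data)) / total
        mu0 = sum((1 - r) * x for r, x in zip(resp, data)) / (len(data) - total)
    return prev, mu0, mu1


prev_hat, mu0_hat, mu1_hat = em(log_loss, SD)
print(round(prev_hat, 2), round(mu0_hat, 2), round(mu1_hat, 2))
```

The E-step posterior plays the role of the unobserved misrepresentation status in the complete-data likelihood. The estimated prevalence corresponds, in this simplified setting, to the share of misrepresenting applicants among those reporting no risk factor.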

