Estimation of cell lineage trees by maximum-likelihood phylogenetics

2019 ◽  
Author(s):  
Jean Feng ◽  
William S DeWitt ◽  
Aaron McKenna ◽  
Noah Simon ◽  
Amy Willis ◽  
...  

Abstract CRISPR technology has enabled large-scale cell lineage tracing for complex multicellular organisms by mutating synthetic genomic barcodes during organismal development. However, these sophisticated biological tools currently use ad-hoc and outmoded computational methods to reconstruct the cell lineage tree from the mutated barcodes. Because these methods are agnostic to the biological mechanism, they are unable to take full advantage of the data’s structure. We propose a statistical model for the mutation process and develop a procedure to estimate the tree topology, branch lengths, and mutation parameters by iteratively applying penalized maximum likelihood estimation. In contrast to existing techniques, our method estimates time along each branch, rather than number of mutation events, thus providing a detailed account of tissue-type differentiation. Via simulations, we demonstrate that our method is substantially more accurate than existing approaches. Our reconstructed trees also better recapitulate known aspects of zebrafish development and reproduce similar results across fish replicates.
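
The distinction between time along a branch and number of mutation events can be illustrated with a toy calculation. This is a sketch under strong assumptions and not the authors' model: it supposes that each of `n_targets` barcode targets mutates independently and irreversibly at a known constant `rate` (both names are hypothetical), so a target is still unedited after time t with probability exp(-rate * t). Under that assumption the maximum likelihood estimate of elapsed time has a closed form.

```python
import math


def mle_branch_time(n_mutated: int, n_targets: int, rate: float) -> float:
    """Toy MLE of elapsed time from the number of mutated barcode targets.

    Assumption (not from the paper): each target mutates independently with
    P(mutated by time t) = 1 - exp(-rate * t), so the mutated count is
    Binomial(n_targets, 1 - exp(-rate * t)).
    """
    if n_targets <= 0 or not 0 <= n_mutated <= n_targets:
        raise ValueError("need 0 <= n_mutated <= n_targets and n_targets > 0")
    if rate <= 0:
        raise ValueError("rate must be positive")
    frac = n_mutated / n_targets
    if frac == 1.0:
        # Saturated barcode: the likelihood keeps increasing with t.
        return math.inf
    # Invert p = 1 - exp(-rate * t) at the binomial MLE p = frac.
    return -math.log(1.0 - frac) / rate


if __name__ == "__main__":
    for k in (1, 5, 9):
        print(k, "mutated of 10 ->", round(mle_branch_time(k, 10, rate=0.5), 3))
```

In this toy setting the estimated time grows faster than the mutation count as the barcode approaches saturation, which is one reason a count of events can be a poor proxy for elapsed time.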

2014 ◽  
Vol 530-531 ◽  
pp. 768-772
Author(s):  
Guo Ping Tan ◽  
Lin Feng Tan ◽  
Lei Cao ◽  
Mei Yan Ju

To study applications of the partial network coding based real-time multicast protocol (PNCRM) in mobile ad hoc networks, the probability distribution of its delay must be characterized. In this paper, NS2 is used to obtain the delay of data packets through simulations. Because the delay does not strictly follow a normal distribution, maximum likelihood estimation based on the lognormal distribution is used to process the data. MATLAB is used to obtain the empirical distribution of the natural logarithm of the delay, and the delay distribution is then drawn using lognormal maximum likelihood estimation; the distributions obtained by the two methods are basically consistent. The delay of PNCRM therefore follows a lognormal distribution, and the characteristics of the delay probability distribution can be estimated.
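
As a minimal sketch of lognormal maximum likelihood estimation (written in Python rather than the MATLAB used in the paper, and run on synthetic delays rather than NS2 output), the estimates are simply the mean and the population standard deviation of the log-delays. The function names and the synthetic parameters below are illustrative assumptions.

```python
import math
import random


def fit_lognormal_mle(delays):
    """Return (mu, sigma) maximizing the lognormal likelihood of `delays`."""
    if not delays or any(d <= 0 for d in delays):
        raise ValueError("delays must be a non-empty sequence of positive values")
    logs = [math.log(d) for d in delays]
    n = len(logs)
    mu = sum(logs) / n
    # The MLE divides by n (not n - 1).
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma


def lognormal_pdf(x, mu, sigma):
    """Density of the lognormal distribution at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for simulated packet delays (seconds); values are assumed.
    synthetic = [rng.lognormvariate(-3.0, 0.5) for _ in range(5000)]
    mu_hat, sigma_hat = fit_lognormal_mle(synthetic)
    print("mu_hat =", round(mu_hat, 3), "sigma_hat =", round(sigma_hat, 3))
    print("fitted density at 0.05 s:", round(lognormal_pdf(0.05, mu_hat, sigma_hat), 3))
```

Comparing the fitted density with a histogram of the log-delays is the kind of consistency check the abstract describes.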


2012 ◽  
Vol 2 (1) ◽  
pp. 7 ◽  
Author(s):  
Andrzej Kijko

This work focuses on the Bayesian procedure for estimating the regional maximum possible earthquake magnitude <em>m</em><sub>max</sub>. The paper briefly discusses the currently used Bayesian procedure for <em>m</em><sub>max</sub>, as developed by Cornell, and suggests a statistically justifiable alternative approach. The fundamental problem in applying the current Bayesian formalism to <em>m</em><sub>max</sub> estimation is that one of the components of the posterior distribution is the sample likelihood function, for which the range of observations (earthquake magnitudes) depends on the unknown parameter <em>m</em><sub>max</sub>. This dependence violates the property of regularity of the maximum likelihood function. The resulting likelihood function, therefore, reaches its maximum at the maximum observed earthquake magnitude <em>m</em><sup>obs</sup><sub>max</sub> and not at the required maximum <em>possible</em> magnitude <em>m</em><sub>max</sub>. Since the sample likelihood function is a key component of the posterior distribution, the posterior estimate <em>m̂</em><sub>max</sub> is biased. The degree of the bias and its sign depend on the applied Bayesian estimator, the quantity of information provided by the prior distribution, and the sample likelihood function. It has been shown that if the maximum posterior estimate is used, the bias is negative and the resulting underestimation of <em>m</em><sub>max</sub> can be as large as 0.5 units of magnitude. This study explores only the maximum posterior estimate of <em>m</em><sub>max</sub>, which is conceptually close to classic maximum likelihood estimation. However, the conclusions regarding the shortfall of the current Bayesian procedure are applicable to all Bayesian estimators, <em>e.g.</em> the posterior mean and posterior median. A simple, <em>ad hoc</em> solution to this non-regular maximum likelihood problem is also presented.
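
The non-regularity described above can be seen numerically. The sketch below assumes a doubly truncated exponential magnitude distribution with known lower bound `m_min` and known rate `beta`; the abstract does not name a specific distribution, so this model, the parameter values, and the function names are illustrative assumptions. Because the support depends on the upper bound, the likelihood is zero below the largest observation and strictly decreasing above it, so it peaks at the observed maximum.

```python
import math
import random


def log_likelihood(m_max, mags, m_min, beta):
    """Log-likelihood of m_max under a doubly truncated exponential model."""
    if m_max < max(mags):
        # An observation above m_max has zero probability.
        return -math.inf
    norm = 1.0 - math.exp(-beta * (m_max - m_min))
    return sum(math.log(beta) - beta * (m - m_min) - math.log(norm) for m in mags)


def sample_magnitudes(n, m_min, m_max, beta, rng):
    """Draw n magnitudes by inverting the truncated exponential CDF."""
    norm = 1.0 - math.exp(-beta * (m_max - m_min))
    return [m_min - math.log(1.0 - rng.random() * norm) / beta for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(42)
    true_m_max = 8.0  # assumed value for the demonstration
    mags = sample_magnitudes(200, m_min=4.0, m_max=true_m_max, beta=2.0, rng=rng)
    grid = [4.0 + 0.01 * i for i in range(501)]  # candidate m_max from 4.0 to 9.0
    best = max(grid, key=lambda c: log_likelihood(c, mags, 4.0, 2.0))
    print("largest observed magnitude:", round(max(mags), 3))
    print("grid maximizer of the likelihood:", round(best, 2))
    print("true m_max:", true_m_max)
```

The grid maximizer lands on the first candidate at or above the largest observed magnitude, whatever the true upper bound is, which is the source of the negative bias discussed in the abstract.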


2018 ◽  
Author(s):  
Damien G. Hicks ◽  
Terence P. Speed ◽  
Mohammed Yassin ◽  
Sarah M. Russell

Abstract New approaches to lineage tracking allow the study of cell differentiation over many generations of cells during development in multicellular organisms. Understanding the variability observed in these lineage trees requires new statistical methods. Whereas invariant cell lineages, such as that for the nematode Caenorhabditis elegans, can be described using a lineage map, defined as the fixed pattern of phenotypes overlaid onto the binary tree structure, the variability of cell lineages from higher organisms makes it impossible to draw a single lineage map. Here, we introduce lineage variability maps which describe the pattern of second-order variation throughout the lineage tree. These maps can be undirected graphs of the partial correlations between every lineal position or directed graphs showing the dynamics of bifurcated patterns in each subtree. By using the symmetry invariance of a binary tree to develop a generalized spectral analysis for cell lineages, we show how to infer these graphical models for lineages of any depth from sample sizes of only a few pedigrees. When tested on pedigrees from C. elegans expressing a marker for pharyngeal differentiation potential, the maps recover essential features of the known lineage map. When applied to highly variable pedigrees monitoring cell size in T lymphocytes, the maps show how most of the phenotype is set by the founder naive T cell. Lineage variability maps thus elevate the concept of the lineage map to the population level, addressing questions about the potency and dynamics of cell lineages and providing a way to quantify the progressive restriction of cell fate with increasing depth in the tree.

Author summary Multicellular organisms develop from a single fertilized egg by sequential cell divisions. The progeny from these divisions adopt different traits that are transmitted and modified through many generations. By tracking how cell traits change with each successive cell division throughout the family, or lineage, tree, it has been possible to understand where and how these modifications are controlled at the single-cell level, thereby addressing questions about, for example, the developmental origin of tissues, the sources of differentiation in immune cells, or the relationship between primary tumors and metastases. Such lineages often show large variability, with apparently identical founder cells giving rise to different patterns of descendants. Fundamental scientific questions, such as about the range of possible cell types a cell can give rise to, are often about this variability. To characterize this variation, and thus understand the lineage at the population level, we introduce lineage variability maps. Using data from worm and mammalian cell lineages we show how these maps provide quantifiable answers to questions about any developing lineage, such as the potency of founder cells and the progressive restriction of cell fate at each stage in the tree.
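
The idea of a partial correlation between lineal positions can be sketched on a three-cell pedigree. This is not the generalized spectral analysis described above; it is a plain first-order partial correlation on synthetic data, under the assumption that each daughter's phenotype equals the mother's phenotype plus independent noise. All names and values are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1.0 - rxz ** 2) * (1.0 - ryz ** 2))


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic pedigrees: one mother and two daughters per pedigree.
    mother = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    daughter1 = [m + rng.gauss(0.0, 0.5) for m in mother]
    daughter2 = [m + rng.gauss(0.0, 0.5) for m in mother]
    print("sister correlation:", round(pearson(daughter1, daughter2), 3))
    print("sister partial correlation given mother:",
          round(partial_corr(daughter1, daughter2, mother), 3))
```

In this synthetic setting the sisters are strongly correlated, but the partial correlation given the mother is close to zero, meaning the shared phenotype is inherited rather than arising from a direct sister-to-sister dependence. An undirected variability map would draw edges only where such partial correlations are non-negligible.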


2019 ◽  
Author(s):  
Hamim Zafar ◽  
Chieh Lin ◽  
Ziv Bar-Joseph

Abstract Recent studies combine two novel technologies, single-cell RNA-sequencing and CRISPR-Cas9 barcode editing, to elucidate developmental lineages at the whole-organism level. While these studies provided several insights, they face several computational challenges. First, lineages are reconstructed based on noisy and often saturated random mutation data. Additionally, due to the randomness of the mutations, lineages from multiple experiments cannot be combined to reconstruct a consensus lineage tree. To address these issues we developed a novel method, LinTIMaT, which reconstructs cell lineages using a maximum-likelihood framework by integrating mutation and expression data. Our analysis shows that expression data help resolve the ambiguities arising when lineages are inferred based on mutations alone, while also enabling the integration of different individual lineages for the reconstruction of a consensus lineage tree. LinTIMaT lineages have better cell type coherence, improve the functional significance of gene sets and provide new insights on progenitors and differentiation pathways.


2018 ◽  
Author(s):  
Byungjin Hwang ◽  
Wookjae Lee ◽  
Soo-Young Yum ◽  
Yujin Jeon ◽  
Namjin Cho ◽  
...  

Abstract Determining cell lineage and function is critical to understanding human physiology and pathology. Although advances in lineage tracing methods have provided new insight into cell fate, defining cellular diversity at the mammalian level remains a challenge. Here, we developed a genome editing strategy using a cytidine deaminase fused with inactive Cas9 (dCas9) to specifically target endogenous interspersed repeat regions in mammalian cells. The resulting mutation patterns served as a genetic barcode, which was induced by targeted mutagenesis with single-guide RNA (sgRNA), leveraging substitution events, and subsequently read out by a single primer pair. By analyzing interspersed mutation signatures, we show the accurate reconstruction of cell lineage using both bulk cell and single-cell data. We envision that our genetic barcode system will enable fine-resolution mapping of organismal development in healthy and diseased mammalian states.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Hamim Zafar ◽  
Chieh Lin ◽  
Ziv Bar-Joseph

Abstract Recent studies combine two novel technologies, single-cell RNA-sequencing and CRISPR-Cas9 barcode editing, to elucidate developmental lineages at the whole-organism level. While these studies provided several insights, they face several computational challenges. First, lineages are reconstructed based on noisy and often saturated random mutation data. Additionally, due to the randomness of the mutations, lineages from multiple experiments cannot be combined to reconstruct a species-invariant lineage tree. To address these issues we developed a statistical method, LinTIMaT, which reconstructs cell lineages using a maximum-likelihood framework by integrating mutation and expression data. Our analysis shows that expression data help resolve the ambiguities arising when lineages are inferred based on mutations alone, while also enabling the integration of different individual lineages for the reconstruction of an invariant lineage tree. LinTIMaT lineages have better cell type coherence, improve the functional significance of gene sets and provide new insights on progenitors and differentiation pathways.

