Integrating multimodal data sets into a mathematical framework to describe and predict therapeutic resistance in cancer

Author(s):  
Kaitlyn Johnson ◽  
Grant R. Howard ◽  
Daylin Morgan ◽  
Eric A. Brenner ◽  
Andrea L. Gardner ◽  
...  

Summary: A significant challenge in the field of biomedicine is the development of methods to integrate the multitude of dispersed data sets into comprehensive frameworks to be used to generate optimal clinical decisions. Recent technological advances in single cell analysis allow for high-dimensional molecular characterization of cells and populations, but to date, few mathematical models have attempted to integrate measurements from the single cell scale with other data types. Here, we present a framework that actionizes static outputs from a machine learning model and leverages these as measurements of state variables in a dynamic mechanistic model of treatment response. We apply this framework to breast cancer cells to integrate single cell transcriptomic data with longitudinal population-size data. We demonstrate that the explicit inclusion of the transcriptomic information in the parameter estimation is critical for identification of the model parameters and enables accurate prediction of new treatment regimens. Inclusion of the transcriptomic data improves predictive accuracy in new treatment response dynamics, with a concordance correlation coefficient (CCC) of 0.89 compared to a CCC of 0.79 without integration of the single cell RNA sequencing (scRNA-seq) data directly into the model calibration. To the best of our knowledge, this is the first work that explicitly integrates single cell clonally-resolved transcriptome datasets with longitudinal treatment response data into a mechanistic mathematical model of drug resistance dynamics. We anticipate this approach to be a first step that demonstrates the feasibility of incorporating multimodal data sets into identifiable mathematical models to develop optimized treatment regimens from data.
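The summary above scores predictions with the concordance correlation coefficient. As a minimal sketch, assuming the standard definition by Lin with population (1/n) moments, the coefficient can be computed as follows. The function name and the example numbers are illustrative only and are not taken from the authors' code or data.

```python
from statistics import fmean


def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient (CCC).

    CCC = 2*cov(o, p) / (var(o) + var(p) + (mean(o) - mean(p))**2),
    using population (1/n) moments. Assumed definition, for illustration.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(observed)
    mean_o = fmean(observed)
    mean_p = fmean(predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    denom = var_o + var_p + (mean_o - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both sequences are constant and equal")
    return 2 * cov / denom


if __name__ == "__main__":
    # Hypothetical measured vs. predicted population sizes.
    measured = [1.0, 2.0, 3.0]
    shifted = [2.0, 3.0, 4.0]
    print(concordance_correlation(measured, measured))
    print(concordance_correlation(measured, shifted))
```

Unlike the Pearson correlation, this measure penalizes a systematic offset between prediction and measurement, which is why the shifted series scores below 1 even though it is perfectly correlated.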

2019 ◽  
Vol 21 (5) ◽  
pp. 1717-1732 ◽  
Author(s):  
Xu Chi ◽  
Maureen A Sartor ◽  
Sanghoon Lee ◽  
Meenakshi Anurag ◽  
Snehal Patil ◽  
...  

Abstract Identifying new gene functions and pathways underlying diseases and biological processes are major challenges in genomics research. Particularly, most methods for interpreting the pathways characteristic of an experimental gene list defined by genomic data are limited by their dependence on assessing the overlapping genes or their interactome topology, which cannot account for the variety of functional relations. This is particularly problematic for pathway discovery from single-cell genomics with low gene coverage or interpreting complex pathway changes such as during change of cell states. Here, we exploited the comprehensive sets of molecular concepts that combine ontologies, pathways, interactions and domains to help inform the functional relations. We first developed a universal concept signature (uniConSig) analysis for genome-wide quantification of new gene functions underlying biological or pathological processes based on the signature molecular concepts computed from known functional gene lists. We then further developed a novel concept signature enrichment analysis (CSEA) for deep functional assessment of the pathways enriched in an experimental gene list. This method is grounded on the framework of shared concept signatures between gene sets at multiple functional levels, thus overcoming the limitations of the current methods. Through meta-analysis of transcriptomic data sets of cancer cell line models and single hematopoietic stem cells, we demonstrate the broad applications of CSEA on pathway discovery from gene expression and single-cell transcriptomic data sets for genetic perturbations and change of cell states, which complements the current modalities. The R modules for uniConSig analysis and CSEA are available through https://github.com/wangxlab/uniConSig.


2021 ◽  
Author(s):  
Igor Nesteruk

Abstract. Background. To simulate how the number of COVID-19 cases increases versus time, various data sets for the number of new cases and different mathematical models can be used. Since there are some differences in statistical data, the results of simulations can be different. Complex mathematical models contain many unknown parameters, the values of which must be determined using a limited number of observations of the disease over time. Even long-term monitoring of the epidemic may not provide reliable estimates of its parameters due to the constant change of testing conditions, isolation of infected persons and quarantine. Therefore, simpler approaches are necessary. In particular, previous simulations of the COVID-19 epidemic dynamics in Ukraine were based on smoothing of the dependence of the number of cases on time and the generalized SIR (susceptible-infected-removed) model. These approaches made it possible to detect the pandemic waves and to make adequate predictions of their duration and final sizes. In particular, eight waves of the COVID-19 pandemic in Ukraine were investigated. Objective. We compare the simulation results for a new epidemic wave in Ukraine based on national statistics and on data reported by Johns Hopkins University (JHU). Methods. In this study we use the smoothing method for the dependences of the number of cases on time, the generalized SIR model for the dynamics of any epidemic wave, the exact solution of the linear differential equations, and a previously developed statistical approach. Results. The ninth epidemic wave in Ukraine was simulated. The optimal values of the SIR model parameters were calculated and compared for the two data sets. Neither prediction is very optimistic: new cases will not stop appearing until June-July 2021. Conclusions. New waves of the COVID-19 pandemic can be detected, calculated and predicted with the use of rather simple mathematical models.
The results of calculations depend on the data sets for the number of confirmed cases. The expected long duration of the pandemic forces us to be careful and in solidarity. The government and all Ukrainians must strictly adhere to quarantine measures in order to avoid fatal consequences. The presented results could probably be useful for estimating the efficiency of future vaccinations.
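To make the SIR terminology in the abstract above concrete, here is a minimal sketch of the classical SIR compartment model integrated with forward Euler steps. It is not the generalized SIR model or the exact solution used by the author. The function name, the parameter values and the step size are assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the classical SIR equations with forward Euler.

    dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I.
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    n = s0 + i0 + r0                      # total population, kept constant
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        trajectory.append((s, i, r))
    return trajectory


if __name__ == "__main__":
    # Hypothetical rates: infection rate 0.3/day, removal rate 0.1/day.
    for day, (s, i, r) in enumerate(simulate_sir(0.3, 0.1, 990, 10, 0, 100)):
        if day % 20 == 0:
            print(f"day {day:3d}  S={s:7.1f}  I={i:7.1f}  R={r:7.1f}")
```

Fitting such a model to one epidemic wave amounts to choosing the rates and initial conditions so that the simulated curve tracks the reported case counts. Two data sets for the same wave therefore yield two sets of optimal parameters.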


2021 ◽  
Vol 16 ◽  
pp. 63-78
Author(s):  
Karthik Alasakani ◽  
Radhika S.l. Tantravahi ◽  
Praveen Kumar Ptv

In this paper, we worked on methods to reduce the input data set to the mathematical models developed to simulate blood flow through human arteries. In general, any mathematical model designed to mimic a natural process needs specific information on its model parameters. In our models, the inputs to these parameters are from the human arterial system, i.e., the anatomical data on arteries and physiological data on blood. Besides these, there are a few other parameters in the models describing mechanisms, such as the pulsatile nature of the blood flow and the arteries' elastic behavior. These mechanisms, described using mathematical relations, help assign values to the parameters that satisfy mathematical specifications or requirements. However, with this method of assigning values, there is a possibility that some of the data sets constructed simulate the same state of the system (arterial system) even though the values assigned differ significantly from each other in magnitude. Moreover, identifying such data sets is not a straightforward task but requires robust procedures. Thus, in this work, we attempt to shed light on a data size reduction technique to identify all such insignificant values of the model parameters and eliminate them from the input data set. We propose a statistical testing procedure to identify a significant difference in the values of the dependent variables (computed using the mathematical models) with respect to the independent variables (the model parameters). This novel approach could efficiently identify the inputs mimicking similar arterial system states and build a refined input data set.


2021 ◽  
Vol 5 (1) ◽  
pp. 37-46
Author(s):  
Igor Nesteruk ◽  
Noureddine Benlagha

Background. To simulate how the number of COVID-19 cases increases versus time, various data sets and different mathematical models can be used. Since there are some differences in statistical data, the results of simulations can be different. Complex mathematical models contain many unknown parameters, the values of which must be determined using a limited number of observations of the disease over time. Even long-term monitoring of the epidemic may not provide reliable estimates of the model parameters due to the constant change of testing conditions, isolation of infected, quarantine conditions, pathogen mutations, vaccinations, etc. Therefore, simpler approaches are necessary. In particular, previous simulations of the COVID-19 epidemic dynamics in Ukraine were based on smoothing of the dependence of the number of cases on time and the generalized SIR (susceptible–infected–removed) model. These approaches allowed detecting the pandemic waves and calculating adequate predictions of their duration and final sizes. In particular, eight waves of the COVID-19 pandemic in Ukraine were investigated. Objective. We aimed to detect the changes in the pandemic dynamics and present the results of SIR simulations based on Ukrainian national statistics and data reported by Johns Hopkins University (JHU) for Ukraine and Qatar. Methods. In this study we use the smoothing method for the dependences of the number of cases on time, the generalized SIR model for the dynamics of any epidemic wave, the exact solution of the linear differential equations, and statistical approach for the model parameter identification developed before. Results. The optimal values of the SIR model parameters were calculated and some predictions about final sizes and durations of the epidemics are presented. Corresponding SIR curves are shown and compared with the real numbers of cases. Conclusions.
Unfortunately, the forecasts are not very optimistic: in Ukraine, new cases will not stop appearing until June–July 2021; in Qatar, new cases are likely to appear throughout 2021. The expected long duration of the pandemic forces us to be careful and in solidarity. Probably the presented results could be useful in order to estimate the efficiency of vaccinations.


Cells ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 1516
Author(s):  
Daniel Gratz ◽  
Alexander J Winkle ◽  
Seth H Weinberg ◽  
Thomas J Hund

The voltage-gated Na+ channel Nav1.5 is critical for normal cardiac myocyte excitability. Mathematical models have been widely used to study Nav1.5 function and link to a range of cardiac arrhythmias. There is growing appreciation for the importance of incorporating physiological heterogeneity observed even in a healthy population into mathematical models of the cardiac action potential. Here, we apply methods from Bayesian statistics to capture the variability in experimental measurements on human atrial Nav1.5 across experimental protocols and labs. This variability was used to define a physiological distribution for model parameters in a novel model formulation of Nav1.5, which was then incorporated into an existing human atrial action potential model. Model validation was performed by comparing the simulated distribution of action potential upstroke velocity measurements to experimental measurements from several different sources. Going forward, we hope to apply this approach to other major atrial ion channels to create a comprehensive model of the human atrial AP. We anticipate that such a model will be useful for understanding excitability at the population level, including variable drug response and penetrance of variants linked to inherited cardiac arrhythmia syndromes.


2021 ◽  
Vol 12 (2) ◽  
pp. 317-334
Author(s):  
Omar Alaqeeli ◽  
Li Xing ◽  
Xuekui Zhang

The classification tree is a widely used machine learning method. It has multiple implementations as R packages: rpart, ctree, evtree, tree and C5.0. The details of these implementations are not the same, and hence their performances differ from one application to another. We are interested in their performance in the classification of cells using single-cell RNA-sequencing data. In this paper, we conducted a benchmark study using 22 single-cell RNA-sequencing data sets. Using cross-validation, we compare the packages' prediction performances based on their Precision, Recall, F1-score and Area Under the Curve (AUC). We also compared the Complexity and Run-time of these R packages. Our study shows that rpart and evtree have the best Precision; evtree is the best in Recall, F1-score and AUC; C5.0 prefers more complex trees; tree is consistently much faster than the others, although its complexity is often higher.
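The benchmark above ranks the packages by Precision, Recall and F1-score. As a minimal sketch of how these three metrics follow from the counts of true positives, false positives and false negatives, the snippet below computes them for a binary labeling. It is written in Python rather than R, the function name and the zero-division convention are assumptions, and it is not the authors' benchmarking code.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for one positive class.

    A ratio with a zero denominator is reported as 0.0 (assumed convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical cell-type labels: 1 = target cell type, 0 = other.
    truth = [1, 1, 1, 1, 0]
    guess = [1, 0, 0, 0, 0]
    print(binary_metrics(truth, guess))
```

The example shows why the metrics can rank classifiers differently: a cautious classifier that labels few cells as positive can reach high Precision while its Recall, and therefore its F1-score, stays low.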


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii406-iii406
Author(s):  
Andrew Donson ◽  
Kent Riemondy ◽  
Sujatha Venkataraman ◽  
Ahmed Gilani ◽  
Bridget Sanford ◽  
...  

Abstract We explored cellular heterogeneity in medulloblastoma using single-cell RNA sequencing (scRNAseq), immunohistochemistry and deconvolution of bulk transcriptomic data. Over 45,000 cells from 31 patients from all main subgroups of medulloblastoma (2 WNT, 10 SHH, 9 GP3, 11 GP4 and 1 GP3/4) were clustered using Harmony alignment to identify conserved subpopulations. Each subgroup contained subpopulations exhibiting mitotic, undifferentiated and neuronal differentiated transcript profiles, corroborating other recent medulloblastoma scRNAseq studies. The magnitude of our present study builds on the findings of existing studies, providing further characterization of conserved neoplastic subpopulations, including identification of a photoreceptor-differentiated subpopulation that was predominantly, but not exclusively, found in GP3 medulloblastoma. Deconvolution of MAGIC transcriptomic cohort data showed that neoplastic subpopulations are associated with major and minor subgroup subdivisions; for example, photoreceptor subpopulation cells are more abundant in GP3-alpha. In both GP3 and GP4, higher proportions of undifferentiated subpopulations are associated with shorter survival and, conversely, differentiated subpopulations are associated with longer survival. This scRNAseq dataset also afforded unique insights into the immune landscape of medulloblastoma, and revealed an M2-polarized myeloid subpopulation that was restricted to SHH medulloblastoma. Additionally, we performed scRNAseq on 16,000 cells from genetically engineered mouse (GEM) models of GP3 and SHH medulloblastoma. These models showed a level of fidelity with corresponding human subgroup-specific neoplastic and immune subpopulations. Collectively, our findings advance our understanding of the neoplastic and immune landscape of the main medulloblastoma subgroups in both humans and GEM models.


Mathematics ◽  
2021 ◽  
Vol 9 (16) ◽  
pp. 1850
Author(s):  
Rashad A. R. Bantan ◽  
Farrukh Jamal ◽  
Christophe Chesneau ◽  
Mohammed Elgarhy

Unit distributions are commonly used in probability and statistics to describe useful quantities with values between 0 and 1, such as proportions, probabilities, and percentages. Some unit distributions are defined in a natural analytical manner, and the others are derived through the transformation of an existing distribution defined in a greater domain. In this article, we introduce the unit gamma/Gompertz distribution, founded on the inverse-exponential scheme and the gamma/Gompertz distribution. The gamma/Gompertz distribution is known to be a very flexible three-parameter lifetime distribution, and we aim to transpose this flexibility to the unit interval. First, we check this aspect with the analytical behavior of the primary functions. It is shown that the probability density function can be increasing, decreasing, “increasing-decreasing” and “decreasing-increasing”, with pliant asymmetric properties. On the other hand, the hazard rate function has monotonically increasing, decreasing, or constant shapes. We complete the theoretical part with some propositions on stochastic ordering, moments, quantiles, and the reliability coefficient. Practically, to estimate the model parameters from unit data, the maximum likelihood method is used. We present some simulation results to evaluate this method. Two applications using real data sets, one on trade shares and the other on flood levels, demonstrate the importance of the new model when compared to other unit models.
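The construction described above can be illustrated with a short sketch. It rests on two assumptions that the abstract does not spell out: that the inverse-exponential scheme means X = exp(-Y) for a gamma/Gompertz variable Y, and that Y has the commonly cited CDF G(y) = 1 - beta^s / (beta - 1 + exp(b*y))^s for y > 0. Under those assumptions the unit CDF is F(x) = 1 - G(-ln x) = beta^s / (beta - 1 + x^(-b))^s on (0, 1). The function names and parameter values are hypothetical, and the article's own parameterization may differ.

```python
import math
import random


def gamma_gompertz_quantile(u, b, s, beta):
    """Inverse of the assumed CDF G(y) = 1 - beta**s / (beta - 1 + exp(b*y))**s."""
    return math.log(beta * (1.0 - u) ** (-1.0 / s) - beta + 1.0) / b


def unit_cdf(x, b, s, beta):
    """CDF of X = exp(-Y) on (0, 1]: P(X <= x) = 1 - G(-ln x)."""
    return beta ** s / (beta - 1.0 + x ** (-b)) ** s


def sample_unit(n, b, s, beta, seed=0):
    """Draw n values of X = exp(-Y) by inverse-transform sampling of Y."""
    rng = random.Random(seed)
    return [math.exp(-gamma_gompertz_quantile(rng.random(), b, s, beta))
            for _ in range(n)]


if __name__ == "__main__":
    b, s, beta = 1.5, 2.0, 3.0            # hypothetical parameter values
    xs = sample_unit(20000, b, s, beta)
    empirical = sum(x <= 0.5 for x in xs) / len(xs)
    print("empirical P(X <= 0.5):", round(empirical, 3))
    print("analytic  P(X <= 0.5):", round(unit_cdf(0.5, b, s, beta), 3))
```

Because Y is positive, exp(-Y) always falls in the unit interval, which is how the transformation carries the flexibility of the lifetime distribution over to proportions and percentages.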


Author(s):  
Sarah K Cimino ◽  
Kristen K. Ciombor ◽  
A Bapsi Chakravarthy ◽  
Christina E. Bailey ◽  
M Benjamin Hopkins ◽  
...  
