Fusion of data-driven model and mechanistic model for kiwifruit flesh firmness prediction

2022 ◽  
Vol 193 ◽  
pp. 106651
Author(s):  
Xun Xiao ◽  
Mo Li
2021 ◽  
Author(s):  
Carlo Cristiano ◽  
Marco Pirrone

Risk-mitigation strategies are most effective when the major sources of uncertainty are determined through dedicated and in-depth studies. In the context of reservoir characterization and modeling, petrophysical uncertainty plays a significant role in the risk assessment phase, for instance in the computation of volumetrics. The conventional workflow for the propagation of petrophysical uncertainty consists of a physics-based model embedded in a Monte Carlo (MC) framework. In detail, open-hole logs and their inherent uncertainties are used to estimate the important petrophysical properties (e.g., shale volume, porosity, water saturation) with uncertainty through the mechanistic model and MC simulations. Model parameter uncertainties can also be considered. This standard approach can be highly time-consuming when the physics-based model is complex, unknown, or difficult to reproduce (e.g., old/legacy wells), and/or when the number of wells to be processed is very high. In this respect, the aim of this paper is to show how a data-driven methodology can be used to propagate the petrophysical uncertainty in a fast and efficient way, speeding up the complete process while remaining consistent with the main outcomes. In detail, a fit-for-purpose Random Forest (RF) algorithm learns through experience how log measurements are related to the important petrophysical parameters. Then, an MC framework is used to infer the petrophysical uncertainty starting from the uncertainty of the input logs, still with the RF model as a driver. The complete methodology was first validated with ad hoc synthetic case studies and then applied to two real cases, where the petrophysical uncertainty was required for reservoir modeling purposes. The first includes legacy wells intercepting a very complex lithological environment; the second comprises a sandstone reservoir with a very large number of wells. For both scenarios, the standard approach would have taken too long (several months) to complete, leaving no possibility of integrating the results into the reservoir models in time. Hence, for each well the RF regressor was trained and tested on the whole available dataset, yielding a valid data-driven analytics model for formation evaluation. Next, 1000 scenarios of the input logs were generated via MC simulations using multivariate normal distributions, and the RF regressor predicted the associated 1000 petrophysical characterization scenarios. As final outcomes of the workflow, ad hoc statistics (e.g., the P10, P50, and P90 quantiles) were used to wrap up the main findings. The complete data-driven approach took only a few days for both scenarios, with a critical impact on the subsequent reservoir modeling activities. This study opens the possibility of quickly processing a large number of wells and, in particular, of effectively propagating petrophysical uncertainty to legacy well data for which conventional approaches are not a time-efficient option.
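
As a rough illustration of this type of workflow (a sketch, not the authors' code), the snippet below trains a Random Forest on synthetic logs and then propagates 1000 multivariate-normal realizations of the inputs to P10/P50/P90 quantiles; the log names, scales, and the linear porosity relation are invented for demonstration.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for open-hole logs (GR, RHOB, NPHI) and a target porosity.
n = 2000
logs = rng.normal(loc=[60.0, 2.45, 0.20], scale=[15.0, 0.05, 0.04], size=(n, 3))
porosity = 0.45 - 0.12 * logs[:, 1] + 0.5 * logs[:, 2] + rng.normal(0, 0.01, n)

# 1) Train the data-driven surrogate of the petrophysical interpretation.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(logs, porosity)

# 2) Monte Carlo: 1000 realizations of the input logs at one depth, drawn from a
#    multivariate normal built from assumed measurement uncertainties.
mean_logs = np.array([65.0, 2.40, 0.22])
cov_logs = np.diag([3.0 ** 2, 0.02 ** 2, 0.015 ** 2])   # assumed uncorrelated here
realizations = rng.multivariate_normal(mean_logs, cov_logs, size=1000)

# 3) Propagate through the RF and summarize with P10/P50/P90 quantiles.
phi = rf.predict(realizations)
p10, p50, p90 = np.percentile(phi, [10, 50, 90])
print(f"porosity P10={p10:.3f}  P50={p50:.3f}  P90={p90:.3f}")

In practice, the covariance of the input logs would come from the tool uncertainty specifications and the regressor would be trained per well on the deterministic interpretation, as described above.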


2019 ◽  
Author(s):  
Colin G. Cess ◽  
Stacey D. Finley

ABSTRACT Due to the variability of protein expression, cells of the same population can exhibit different responses to stimuli. It is important to understand this heterogeneity at the individual level, as population averages mask these underlying differences. Using computational modeling, we can interrogate a system much more precisely than with experiments alone, in order to learn how the expression of each protein affects a biological system. Here, we examine a mechanistic model of CAR T cell signaling, which connects receptor-antigen binding to MAPK activation, to determine intracellular modulations that can increase cellular response. CAR T cell cancer therapy involves removing a patient’s T cells, modifying them to express engineered receptors that can bind to tumor-associated antigens to promote cell killing, and then injecting the cells back into the patient. This population of cells, like all cell populations, would have heterogeneous protein expression, which could affect the efficacy of treatment. Thus, it is important to examine the effects of cell-to-cell heterogeneity. We first generated a dataset of simulated cell responses via Monte Carlo simulations of the mechanistic model, in which the initial protein concentrations were randomly sampled. We analyzed the dataset using partial least-squares modeling to determine the relationships between protein expression and ERK phosphorylation, the output of the mechanistic model. Using this data-driven analysis, we found that only the expression of proteins relating directly to the receptor and the MAPK cascade, the beginning and end of the network, respectively, is relevant to the cells’ response. We also found, surprisingly, that increasing the amount of receptor present can actually inhibit the cell’s ability to respond, because it increases the strength of negative feedback from phosphatases. Overall, we have combined data-driven and mechanistic modeling to generate detailed insight into CAR T cell signaling.
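
A minimal sketch of the data-driven step is shown below; it replaces the mechanistic CAR T signaling model with a toy nonlinear function (the protein names, sampling ranges, and response formula are hypothetical) and uses partial least-squares regression to relate sampled initial protein levels to the output.

import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(1)
proteins = ["CAR", "LCK", "SHP1", "RAF", "MEK", "ERK"]   # hypothetical species list

# Monte Carlo sampling of initial concentrations (log-uniform, arbitrary ranges).
n_cells = 5000
X = 10 ** rng.uniform(-1, 1, size=(n_cells, len(proteins)))

# Toy stand-in for the mechanistic model output (phosphorylated ERK): receptor
# abundance drives activation while a phosphatase term gives negative feedback.
ppERK = (X[:, 0] * X[:, 5]) / (1.0 + X[:, 2] * X[:, 0]) + rng.normal(0, 0.05, n_cells)

# Fit a PLS model; the first-component weights give a crude ranking of how each
# protein's expression relates to the response.
pls = PLSRegression(n_components=2).fit(np.log10(X), ppERK)
for name, w in zip(proteins, pls.x_weights_[:, 0]):
    print(f"{name:5s} weight = {w:+.2f}")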


2021 ◽  
Vol 73 (07) ◽  
pp. 43-43
Author(s):  
Mark Burgoyne

In reviewing the long list of papers this year, it has become apparent to me that the hot topic in reservoir simulation these days is the application of data analytics or machine learning to numerical simulation and with it quite often the promise of data-driven work flows—code for needing to think about the physics less. Data-driven work flows have their place, especially when we have a lot of data and the system is very complex. I’m thinking shales especially, but seeing it being applied to more conventional reservoirs gave me a moment of pause. I can’t help but think that, in terms of the hype cycle as related to the application of machine learning to numerical simulation, we may be approaching the peak of inflated expectations. I say “approaching,” because many companies appear to be dipping their toes in the water, perhaps because they think they should, but few are truly committing to it. Many vocal champions of the approach exist, but most decision-makers just don’t understand it yet. If we cannot explain how something works simply, then thoughtful leaders will tend not to trust it. Whether it be numerical-simulation findings or self-organizing neural networks, the need will always exist for a deep understanding and clarity of explanation of both the discipline and method used. To decision-makers, it will be an attractive concept, but they will generally ask to validate against more traditional methods. I look forward to a future when we are through the trough of disillusionment and start climbing the slope of enlightenment to a new level of productivity. I suspect, though, that it will take at least another 5 years, as our current crop of knowledgeable evangelists become decision-makers themselves and can put in place work flows and teams to leverage the approach appropriately for their problems, intelligently leveraging their years of hard-won experience. I will lay a wager with you, though, that when that time comes, those new ways of working more efficiently will rely just as much if not more upon a deep understanding of reservoir engineering as our current methods. I hope you enjoy these papers, which include examples of both the new approach as well as tried-and-true approaches.

Recommended additional reading at OnePetro: www.onepetro.org.
SPE 202436 - Fast Modeling of Gas Reservoirs Using Proper Orthogonal Decomposition/Radial Basis Function (POD/RBF) Nonintrusive Reduced-Order Modeling by Jemimah-Sandra Samuel, Imperial College London, et al.
IPTC 21417 - A New Methodology for Calculating Wellbore Shut-In Pressure in Numerical Reservoir Simulations by Babatope Kayode, Saudi Aramco, et al.
SPE 201658 - Mechanistic Model Validation of Decline Curve Analysis for Unconventional Reservoirs by Mikhail Gorditsa, Texas A&M University, et al.


RSC Advances ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 817-829
Author(s):  
Ge He ◽  
Tao Luo ◽  
Yagu Dang ◽  
Li Zhou ◽  
Yiyang Dai ◽  
...  

A process model comprising a mechanistic model based on emulsion polymerization kinetics and a data-driven model derived from genetic programming is developed to correlate the feed compositions and process conditions with NBR Mooney viscosity.
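
One way such a fusion can be wired is sketched below (a schematic, not the authors' method): a data-driven correction, here a plain polynomial fit standing in for the genetic-programming model, learns the residual that a simple mechanistic viscosity estimate cannot explain; all equations and numbers are invented.

import numpy as np

def mechanistic_viscosity(conversion, temperature_k):
    """Placeholder for the kinetics-based Mooney viscosity estimate (arbitrary form)."""
    return 40.0 + 30.0 * conversion - 0.1 * (temperature_k - 283.0)

# Synthetic plant data: process conditions and "measured" Mooney viscosity.
rng = np.random.default_rng(2)
conv = rng.uniform(0.6, 0.9, 50)
temp = rng.uniform(278.0, 293.0, 50)
measured = mechanistic_viscosity(conv, temp) + 5.0 * np.sin(10.0 * conv) + rng.normal(0, 0.5, 50)

# The data-driven branch learns the residual the mechanistic branch cannot explain.
residual = measured - mechanistic_viscosity(conv, temp)
coeffs = np.polyfit(conv, residual, deg=3)

def hybrid_viscosity(conversion, temperature_k):
    return mechanistic_viscosity(conversion, temperature_k) + np.polyval(coeffs, conversion)

print(hybrid_viscosity(0.75, 285.0))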


2021 ◽  
Author(s):  
Murat Ozbayoglu ◽  
Evren Ozbayoglu ◽  
Baris Guney Ozdilli ◽  
Oney Erge

Abstract Drilling practice has been evolving in parallel with developments in the oil and gas industry. Current supply and demand for oil and gas dictate the search for hydrocarbons either in much deeper and harder-to-reach fields or in unconventional fields, both requiring extended-reach wells, long horizontal sections, and complex 3D trajectories. Cuttings transport is one of the most challenging problems while drilling such wells, especially at mid-range inclinations. For many years, numerous studies have addressed the modeling of cuttings transport and the estimation of cuttings concentration and pressure losses inside the wellbore, considering the various drilling variables that influence the process. However, such attempts, either mechanistic or empirical, have many limitations due to the simplifications and assumptions made during the development stage. Fluid thixotropy, temperature variations in the wellbore, uncertainty in pipe eccentricity, chaotic motion of cuttings due to pipe rotation, imperfections in the wellbore walls, variations in the size and shape of the cuttings, the presence of tool joints on the drillstring, etc., make modeling the problem extremely difficult. Due to the complexity of the process, the estimations are usually not very accurate or reliable. In this study, data-driven models are used instead of mechanistic or empirical methods to estimate cuttings concentration and frictional pressure losses in a well during drilling operations. The selected models include Artificial Neural Networks, Random Forest, and AdaBoost. The models are trained on experimental cuttings transport data collected over the last 40 years at The University of Tulsa – Drilling Research Projects, which cover a wide range of wellbore and pipe sizes, inclinations, ROPs, pipe rotation speeds, flow rates, and fluid and cuttings properties. The models are evaluated using root mean square error, R-squared values, and p-values. As inputs to the data-driven models, the independent drilling variables are used directly. As a second approach, dimensionless groups are developed from these independent drilling variables and used as the model inputs. Moreover, the performance of the data-driven models is compared with that of a conventional mechanistic model. It is observed that in many cases the data-driven models perform significantly better than the mechanistic model, which provides a very promising direction for real-time drilling optimization and automation. It is also concluded that using the independent drilling variables directly as model inputs provides more accurate results than using the dimensionless groups as the model inputs.
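
The comparison of raw versus dimensionless inputs can be sketched as below, using synthetic data in place of the Tulsa experiments; the variable ranges, the response formula, and the dimensionless grouping are assumptions for illustration only.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(3)
n = 1500

# Synthetic stand-ins for the independent drilling variables.
flow_rate = rng.uniform(200, 800, n)          # gpm
rpm = rng.uniform(0, 150, n)                  # pipe rotation speed
rop = rng.uniform(20, 120, n)                 # ft/hr
inclination = rng.uniform(0, 90, n)           # degrees
mud_density = rng.uniform(8.5, 12.0, n)       # ppg

# Made-up response, just to have something to fit against.
cuttings_conc = (0.05 * rop / flow_rate * 1000
                 + 0.2 * np.sin(np.radians(inclination))
                 - 0.01 * rpm / 150
                 + rng.normal(0, 0.2, n))

X_raw = np.column_stack([flow_rate, rpm, rop, inclination, mud_density])
# One example of a dimensionless grouping (illustrative, not from the paper).
X_dimless = np.column_stack([rop / flow_rate, rpm / 150.0, np.sin(np.radians(inclination))])

for name, X in [("raw variables", X_raw), ("dimensionless groups", X_dimless)]:
    Xtr, Xte, ytr, yte = train_test_split(X, cuttings_conc, random_state=0)
    model = RandomForestRegressor(n_estimators=300, random_state=0).fit(Xtr, ytr)
    pred = model.predict(Xte)
    rmse = mean_squared_error(yte, pred) ** 0.5
    print(f"{name:22s}  R2 = {r2_score(yte, pred):.3f}  RMSE = {rmse:.3f}")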


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Shasha Han ◽  
Jun Cai ◽  
Juan Yang ◽  
Juanjuan Zhang ◽  
Qianhui Wu ◽  
...  

Abstract Dynamically adapting the allocation of COVID-19 vaccines to the evolving epidemiological situation could be key to reducing the COVID-19 burden. Here we developed a data-driven mechanistic model of SARS-CoV-2 transmission to explore optimal vaccine prioritization strategies in China. We found that a time-varying vaccination program (i.e., allocating vaccines to different target groups as the epidemic evolves) can be highly beneficial, as it is capable of simultaneously achieving different objectives (e.g., minimizing the number of deaths and of infections). Our findings suggest that boosting the vaccination capacity to 2.5 million first doses per day (a 0.17% rollout speed) or higher could greatly reduce the COVID-19 burden, should a new wave start to unfold in China with a reproduction number ≤1.5. The highest-priority categories are consistent under a broad range of assumptions. Finally, a high vaccination capacity in the early phase of the vaccination campaign is key to achieving the large gains of strategic prioritization.
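
For intuition only, the toy model below is far simpler than the paper's: a two-group SIR with a fixed daily vaccination capacity of 0.17% of the population, comparing cumulative deaths when doses go first to an older, high-fatality group versus a younger, high-contact group; all parameters are invented and vaccination is assumed fully protective.

import numpy as np

def simulate(priority, days=300, capacity=0.0017):
    # Groups: 0 = younger, 1 = older (fractions of the total population).
    N = np.array([0.75, 0.25])
    S, I, R = N.copy(), np.array([1e-5, 1e-5]), np.zeros(2)
    beta, gamma = 0.21, 1.0 / 7.0            # R0 ~ beta/gamma ~ 1.5
    ifr = np.array([0.0005, 0.02])            # assumed infection-fatality ratios
    deaths = 0.0
    for _ in range(days):
        doses = capacity                       # daily doses as a population fraction
        for g in priority:                     # allocate to the priority group first
            give = min(doses, S[g])
            S[g] -= give
            doses -= give
        force = beta * I.sum() / N.sum()       # homogeneous mixing for simplicity
        new_inf = force * S
        recov = gamma * I
        S -= new_inf
        I += new_inf - recov
        R += recov
        deaths += (new_inf * ifr).sum()
    return deaths

for order, label in [((1, 0), "older first"), ((0, 1), "younger first")]:
    print(f"{label:13s}: cumulative deaths (fraction of population) = {simulate(order):.5f}")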


2021 ◽  
Author(s):  
Hongjie Yu ◽  
Shasha Han ◽  
Jun Cai ◽  
Juan Yang ◽  
Juanjuan Zhang ◽  
...  

Abstract Strategic prioritization of COVID-19 vaccines is urgently needed, especially in light of the limited supply that is expected to last for most, if not all, of 2021. Dynamically adapting the allocation strategy to the evolving epidemiological situation could thus be critical during this initial phase of vaccine rollout. We developed a data-driven mechanistic model of SARS-CoV-2 transmission to explore optimal vaccine prioritization strategies in China that aim at reducing the COVID-19 burden measured through different metrics. We found that reactively adapting the vaccination program to the epidemiological situation (i.e., allocating vaccines to a target group before reaching full coverage of other groups with higher initial priority) can be highly beneficial, as such strategies are capable of simultaneously achieving different objectives (e.g., minimizing the number of deaths and of infections). The highest-priority categories are broadly consistent under different hypotheses about vaccine efficacy, differential vaccine efficacy in preventing infection vs. disease, vaccine hesitancy, and SARS-CoV-2 transmissibility. Our findings also suggest that boosting the daily capacity to 2.5 million courses (a 0.17% rollout speed) or higher could greatly reduce the COVID-19 burden should a new wave start to unfold in China with a reproduction number of 1.5 or lower. Finally, we estimate that a high vaccine supply in the early phase of the vaccination campaign is key to achieving the large gains of strategic prioritization.


2013 ◽  
Vol 67 (10) ◽  
pp. 2314-2320 ◽  
Author(s):  
Jun-Fei Qiao ◽  
Ying-Chun Bo ◽  
Wei Chai ◽  
Hong-Gui Han

In order to optimize the operating points of the dissolved oxygen concentration and the nitrate level in a wastewater treatment plant (WWTP) benchmark, a data-driven adaptive optimal controller (DDAOC) based on adaptive dynamic programming is proposed. This DDAOC consists of an evaluation module and an optimization module. Given a set of operating points, the evaluation module first estimates the future energy consumption and effluent quality under this policy, and the optimization module then adjusts the operating points according to that evaluation. The optimal operating points are found gradually as this process is repeated. During the optimization, only the input–output data measured from the plant are needed; a mechanistic model is unnecessary. The DDAOC is tested and evaluated on BSM1 (Benchmark Simulation Model No. 1), and its performance is compared to that of a proportional-integral-derivative (PID) controller with fixed operating points over the full range of operating conditions. The results show that the DDAOC can reduce energy consumption significantly.
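
The evaluate-then-adjust loop can be sketched as follows, with a toy cost function standing in for the BSM1 plant measurements; the real DDAOC uses adaptive dynamic programming on measured input-output data, which is not reproduced here, and the coefficients below are arbitrary.

import numpy as np

def plant_cost(do_setpoint, no3_setpoint):
    """Stand-in for measured energy consumption plus an effluent-quality penalty."""
    energy = 3.0 * do_setpoint ** 2 + 0.8 * no3_setpoint
    effluent_penalty = 5.0 / (0.2 + do_setpoint) + 2.0 * (no3_setpoint - 1.0) ** 2
    return energy + effluent_penalty

# Start from some operating points and adjust them iteratively using only
# evaluated costs (no mechanistic model of the plant).
points = np.array([2.0, 1.5])                 # [dissolved O2, nitrate] setpoints
step = 0.1
for _ in range(200):
    best, best_cost = points.copy(), plant_cost(*points)
    for delta in [(step, 0), (-step, 0), (0, step), (0, -step)]:
        candidate = np.clip(points + delta, 0.1, 5.0)
        c = plant_cost(*candidate)            # "evaluation module"
        if c < best_cost:
            best, best_cost = candidate, c    # "optimization module" accepts it
    points = best
print(f"near-optimal setpoints: DO = {points[0]:.2f} mg/L, NO3 = {points[1]:.2f} mg/L")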


2021 ◽  
Vol 99 (2) ◽  
Author(s):  
Luis O Tedeschi ◽  
Paul L Greenwood ◽  
Ilan Halachmi

Abstract Remote monitoring, modern data collection through sensors, rapid data transfer, and vast data storage through the Internet of Things (IoT) have advanced precision livestock farming (PLF) in the last 20 yr. PLF is relevant to many fields of livestock production, including aerial- and satellite-based measurement of pasture forage quantity and quality; body weight and composition and physiological assessments; on-animal devices to monitor location, activity, and behaviors in grazing and foraging environments; early detection of lameness and other diseases; milk yield and composition; reproductive measurements and calving diseases; and feed intake and greenhouse gas emissions, to name just a few. There are many possibilities to improve animal production through PLF, but the combination of PLF and computer modeling is necessary to facilitate on-farm applicability. Concept- or knowledge-driven (mechanistic) models are established on scientific knowledge, and they are based on the conceptualization of hypotheses about variable interrelationships. Artificial intelligence (AI), on the other hand, is a data-driven approach that can manipulate and represent the big data accumulated by sensors and the IoT. Still, it cannot explicitly explain the underlying assumptions of the intrinsic relationships in the data because it lacks the wisdom that confers understanding and principles. AI lacks wisdom because everything revolves around numbers: the associations among the numbers are obtained through the “automatized” learning of mathematical correlations and covariances, not through “human causation” and abstract conceptualization of physiological or production principles. AI starts with comparative analogies to establish concepts and provides memory for future comparisons; the learning process then evolves toward wisdom through the systematic use of reasoning. AI is a relatively novel concept in many fields of science. It may well be “the missing link” to expedite the transition from the traditional output-maximizing mentality to a more mindful purpose of optimizing production efficiency while alleviating resource allocation for production. The integration of concept- and data-driven modeling through parallel hybridization of mechanistic and AI models will yield a hybrid intelligent mechanistic model that, along with data collection through PLF, is paramount to transcend the current status of livestock production in achieving sustainability.
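
As a loose sketch of the parallel-hybridization idea (not a method from the paper), a mechanistic prediction and a data-driven prediction can be produced independently and then blended; the growth equation, sensor features, and blending weight below are all illustrative.

import numpy as np
from sklearn.linear_model import Ridge

def mechanistic_gain(feed_intake_kg, body_weight_kg):
    """Toy energy-balance estimate of daily gain (kg/day); coefficients are invented."""
    return 0.15 * feed_intake_kg - 0.002 * body_weight_kg ** 0.75

# Sensor/IoT data for a herd: intake, weight, activity, plus observed daily gain.
rng = np.random.default_rng(4)
intake = rng.uniform(5, 12, 300)
weight = rng.uniform(250, 550, 300)
activity = rng.uniform(0.5, 1.5, 300)
observed_gain = mechanistic_gain(intake, weight) - 0.1 * (activity - 1.0) + rng.normal(0, 0.05, 300)

# Data-driven branch trained on the sensor features.
features = np.column_stack([intake, weight, activity])
ml = Ridge(alpha=1.0).fit(features, observed_gain)

# Parallel hybrid: weighted blend of the two branches (weight chosen by hand here).
w = 0.5
hybrid_gain = w * mechanistic_gain(intake, weight) + (1 - w) * ml.predict(features)
print("hybrid daily gain (first 3 animals):", np.round(hybrid_gain[:3], 2))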

