Integration of Machine Learning and Computational Fluid Dynamics to Develop Turbulence Models for Improved Turbine Wake Mixing Prediction

Author(s):  
Harshal D. Akolekar ◽  
Yaomin Zhao ◽  
Richard D. Sandberg ◽  
Roberto Pacciani

Abstract This paper presents the development of accurate turbulence closures for wake mixing prediction by integrating a machine-learning approach with Reynolds Averaged Navier-Stokes (RANS)-based computational fluid dynamics (CFD). The data-driven modeling framework is based on the gene expression programming (GEP) approach previously shown to generate non-linear RANS models with good accuracy. To further improve the performance and robustness of the data-driven closures, here we exploit the fact that GEP produces tangible models to integrate RANS into the closure development process. Specifically, rather than using as the cost function a comparison of the GEP-based closure terms with a frozen high-fidelity dataset, each GEP model is instead automatically implemented in a RANS solver and the subsequent calculation results are compared with reference data. By first using a canonical turbine wake with inlet conditions prescribed based on high-fidelity data, we demonstrate that the CFD-driven machine-learning approach produces non-linear turbulence closures that are physically correct, i.e., they predict the right downstream wake development and maintain an accurate peak wake loss throughout the domain. We then extend our analysis to full turbine-blade cases and show that the model development is sensitive to the training region due to the presence of deterministic unsteadiness in the near-wake region. Models developed including this region have artificially large diffusion coefficients to over-compensate for the vortex shedding that steady RANS cannot capture. In contrast, excluding the near-wake region in the model development produces the correct physical model behavior, but predictive accuracy in the near-wake remains unsatisfactory. We show that this can be remedied by using the physically consistent models in unsteady RANS, implying that the non-linear closure producing the best predictive accuracy depends on whether it will be deployed in RANS or unsteady RANS calculations. Overall, the models developed with the CFD-assisted machine learning approach were found to be robust and capture the correct physical behavior across different operating conditions.
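
The idea of scoring each candidate closure by the result of a solver run, rather than by comparing closure terms with a frozen dataset, can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in: `toy_wake_solver` is a one-line algebraic decay law, not a RANS solver, and each candidate closure is reduced to a single diffusivity value instead of a GEP expression. Only the structure of the loop is the point: every candidate is run through the solver, and the solver output is what gets compared with the reference data.

```python
import math
from typing import List, Sequence, Tuple


def toy_wake_solver(diffusivity: float, stations: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a RANS run (NOT a real solver).

    Returns a peak wake deficit at each downstream station; a larger
    diffusivity makes the toy wake decay faster.
    """
    return [1.0 / math.sqrt(1.0 + diffusivity * x) for x in stations]


def cfd_driven_cost(diffusivity: float,
                    stations: Sequence[float],
                    reference: Sequence[float]) -> float:
    """Mean squared error between the solver output and the reference data."""
    predicted = toy_wake_solver(diffusivity, stations)
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)


def select_best(candidates: Sequence[float],
                stations: Sequence[float],
                reference: Sequence[float]) -> Tuple[float, float]:
    """Run every candidate through the solver and keep the lowest-cost one."""
    scored = [(cfd_driven_cost(c, stations, reference), c) for c in candidates]
    cost, best = min(scored)
    return best, cost


# Assumed setup: the "reference data" are generated with diffusivity 2.0.
stations = [0.5, 1.0, 2.0, 4.0]
reference = toy_wake_solver(2.0, stations)
best, cost = select_best([0.5, 2.0, 8.0], stations, reference)
print(best, cost)
```

In this toy setting the candidate with an artificially large diffusivity (8.0) is penalized because its downstream wake decays too quickly, which loosely mirrors the abstract's emphasis on judging closures by the wake development they produce.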

2021 ◽  
pp. 1-12
Author(s):  
Harshal D Akolekar ◽  
Yaomin Zhao ◽  
Richard Sandberg ◽  
Roberto Pacciani

Abstract This paper presents the development of accurate turbulence closures for low-pressure turbine (LPT) wake mixing prediction by integrating a machine-learning approach based on gene expression programming (GEP) with Reynolds Averaged Navier-Stokes (RANS) based computational fluid dynamics (CFD). In order to further improve the performance and robustness of GEP-based data-driven closures, the fitness of models is evaluated by running RANS calculations in an integrated way, instead of by evaluating an algebraic function. Using a canonical turbine wake with inlet conditions prescribed based on high-fidelity data of the T106A cascade, we demonstrate that the ‘CFD-driven’ machine-learning approach produces physically correct non-linear turbulence closures, i.e., closures that predict the right downstream wake development and maintain an accurate peak wake loss throughout the domain. We then extend our analysis to full turbine blade cases and show that the model development is sensitive to the training region due to the presence of deterministic unsteadiness in the near-wake. Models developed including the near-wake have artificially large diffusion coefficients to over-compensate for the vortex shedding that steady RANS cannot capture. In contrast, excluding the near-wake in the model development produces the correct physical model behavior, but predictive accuracy in the near-wake remains unsatisfactory. This can be remedied by using the physically consistent models in unsteady RANS. Overall, the ‘CFD-driven’ models were found to be robust and capture the correct physical wake mixing behavior across different LPT operating conditions and airfoils such as T106C and PakB.


Author(s):  
Arvind Pandey ◽  
Shipra Shukla ◽  
Krishna Kumar Mohbey

Background: Large financial companies are perpetually creating and updating customer scoring techniques. From a risk management view, the predictive accuracy of the estimated probability of default is of greater importance than the traditional binary classification result, i.e., non-credible versus credible customers. Customers' default payments in Taiwan are explored as the case study. Objective: The aim is to compare the predictive accuracy of the probability of default across various statistical and machine learning techniques. Method: In this paper, nine predictive models are compared, of which the results of six models are taken into consideration. A comparative analysis of deep learning-based H2O, XGBoost, logistic regression, gradient boosting, naïve Bayes, the logit model, and probit regression is performed. Software tools such as R and SAS (university edition) are employed for machine learning and statistical model evaluation. Results: Through the experimental study, we demonstrate that XGBoost performs better than the other AI and ML algorithms. Conclusion: A machine learning approach such as XGBoost can be used effectively for credit scoring, among other data mining and statistical approaches.


10.2196/23948 ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. e23948
Author(s):  
Yuanfang Chen ◽  
Liu Ouyang ◽  
Forrest S Bao ◽  
Qian Li ◽  
Lei Han ◽  
...  

Background Effectively and efficiently diagnosing patients who have COVID-19 with the accurate clinical type of the disease is essential to achieve optimal outcomes for the patients as well as to reduce the risk of overloading the health care system. Currently, severe and nonsevere COVID-19 types are differentiated by only a few features, which do not comprehensively characterize the complicated pathological, physiological, and immunological responses to SARS-CoV-2 infection in the different disease types. In addition, these type-defining features may not be readily testable at the time of diagnosis. Objective In this study, we aimed to use a machine learning approach to understand COVID-19 more comprehensively, accurately differentiate severe and nonsevere COVID-19 clinical types based on multiple medical features, and provide reliable predictions of the clinical type of the disease. Methods For this study, we recruited 214 confirmed patients with nonsevere COVID-19 and 148 patients with severe COVID-19. The clinical characteristics (26 features) and laboratory test results (26 features) upon admission were acquired as two input modalities. Exploratory analyses demonstrated that these features differed substantially between the two clinical types. Machine learning random forest models based on all the features in each modality as well as on the top 5 features in each modality combined were developed and validated to differentiate COVID-19 clinical types. Results Using clinical and laboratory results independently as input, the random forest models achieved >90% and >95% predictive accuracy, respectively. The importance scores of the input features were further evaluated, and the top 5 features from each modality were identified (age, hypertension, cardiovascular disease, gender, and diabetes for the clinical features modality, and dimerized plasmin fragment D, high sensitivity troponin I, absolute neutrophil count, interleukin 6, and lactate dehydrogenase for the laboratory testing modality, in descending order). Using these top 10 multimodal features as the only input instead of all 52 features combined, the random forest model was able to achieve 97% predictive accuracy. Conclusions Our findings shed light on how the human body reacts to SARS-CoV-2 infection as a unit and provide insights on effectively evaluating the disease severity of patients with COVID-19 based on more common medical features when gold standard features are not available. We suggest that clinical information can be used as an initial screening tool for self-evaluation and triage, while laboratory test results should be applied when accuracy is the priority.


2020 ◽  
Author(s):  
Lucas M. Thimoteo ◽  
Marley M. Vellasco ◽  
Jorge M. do Amaral ◽  
Karla Figueiredo ◽  
Cátia Lie Yokoyama ◽  
...  

This work proposes an interpretable machine learning approach to diagnose suspected COVID-19 cases based on clinical variables. The proposed models achieve an F2 measure above 0.80 and an accuracy above 0.85. Interpreting the feature importance of the linear model brought insights about the most relevant features. Shapley Additive Explanations were used for the non-linear models. They were able to show the difference between positive and negative patients as well as offer a sense of the models' global interpretability.
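
For readers unfamiliar with the F2 measure reported above, the following minimal Python sketch shows the standard F-beta definition, in which beta = 2 weights recall more heavily than precision. The function name `f_beta` and the confusion-matrix counts are illustrative assumptions, not values from the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from confusion-matrix counts.

    tp, fp, fn are true positives, false positives and false negatives.
    beta > 1 favors recall; beta = 1 gives the usual F1 score.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative counts only: precision = 0.6, recall = 0.9.
f2 = f_beta(tp=9, fp=6, fn=1, beta=2.0)
f1 = f_beta(tp=9, fp=6, fn=1, beta=1.0)
print(f2, f1)
```

Because recall exceeds precision in this example, the F2 value is higher than the F1 value, which is the behavior usually wanted in screening, where missed positive cases are costlier than false alarms.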


Molecules ◽  
2020 ◽  
Vol 25 (20) ◽  
pp. 4696
Author(s):  
Ștefan-Mihai Petrea ◽  
Mioara Costache ◽  
Dragoș Cristea ◽  
Ștefan-Adrian Strungaru ◽  
Ira-Adeline Simionov ◽  
...  

Metals are considered to be one of the most hazardous substances due to their potential for accumulation, magnification, persistence, and wide distribution in water, sediments, and aquatic organisms. Demersal fish species, such as turbot (Psetta maxima maeotica), are accepted by the scientific community as suitable bioindicators of heavy metal pollution in the aquatic environment. The present study uses a machine learning approach, based on multiple linear and non-linear models, to estimate the concentrations of heavy metals in both turbot muscle and liver tissues. For multiple linear regression (MLR) models, the stepwise method was used, while non-linear models were developed by applying the random forest (RF) algorithm. The models were based on data taken from the scientific literature for 11 heavy metals (As, Ca, Cd, Cu, Fe, K, Mg, Mn, Na, Ni, Zn) in both muscle and liver tissues of turbot specimens. Significant MLR models were recorded for Ca, Fe, Mg, and Na in muscle tissue and K, Cu, Zn, and Na in turbot liver tissue. Non-linear tree-based RF prediction models (over 70% prediction accuracy) were identified for As, Cd, Cu, K, Mg, and Zn in muscle tissue and As, Ca, Cd, Mg, and Fe in turbot liver tissue. Both the MLR and the non-linear tree-based RF prediction models were found to be suitable for predicting heavy metal concentrations in both turbot muscle and liver tissues. The models can be used to improve the knowledge and economic efficiency of linked studies on heavy metals, food safety, and environmental pollution.


2021 ◽  
Author(s):  
Camilo Fernando Rodríguez-Genó ◽  
Léster Alfonso

Abstract. A parameterization for the collision-coalescence process is presented, based on the methodology of basis functions. The whole drop spectrum is described as a linear combination of two lognormal distribution functions, in which all distribution parameters are formulated by means of six distribution moments included in a system of equations, thus eliminating the need to fix any parameters. This basis functions parameterization avoids the classification of drops in artificial categories such as cloud water (cloud droplets) or rain water (raindrops). The total moment tendencies are calculated using a machine learning approach, in which one deep neural network was trained for each of the total moment orders involved. The neural networks were trained using randomly generated data following a uniform distribution, over a wide range of parameters employed by the parameterization. An analysis of the predicted total moment errors was performed, aimed at establishing the accuracy of the parameterization at reproducing the integrated distribution moments representative of physical variables. The applied machine learning approach shows a good level of accuracy when compared to the output of an explicit collision-coalescence model.
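
The link between a two-lognormal spectrum and its moments can be made concrete with a short Python sketch. It relies on the standard closed-form result that the k-th moment of a single lognormal mode with number concentration N and log-space parameters mu and sigma is N·exp(k·mu + k²·sigma²/2); the total moment of the combined spectrum is the sum over both modes. The parameter values and the choice of moment orders 0 to 5 are assumptions made for illustration, since the abstract does not state which six orders are used.

```python
import math
from typing import List, Sequence, Tuple

# One lognormal mode: (number concentration N, mu, sigma), with mu and sigma
# defined for the logarithm of the drop size.
Mode = Tuple[float, float, float]


def lognormal_moment(k: int, n: float, mu: float, sigma: float) -> float:
    """Closed-form k-th moment of one lognormal mode."""
    return n * math.exp(k * mu + 0.5 * (k * sigma) ** 2)


def total_moment(k: int, modes: Sequence[Mode]) -> float:
    """k-th moment of a spectrum built as a sum of lognormal modes."""
    return sum(lognormal_moment(k, n, mu, sigma) for n, mu, sigma in modes)


# Illustrative, made-up parameters for the two modes.
modes: List[Mode] = [(100.0, 0.0, 0.2), (50.0, math.log(2.0), 0.3)]

# Six total moments (orders 0..5 are an assumption, see the note above).
moments = [total_moment(k, modes) for k in range(6)]
print(moments)
```

Two modes with three parameters each give six unknowns, which is consistent with the abstract's statement that six moments determine all distribution parameters without fixing any of them in advance.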

