Integration of Machine Learning and Computational Fluid Dynamics to Develop Turbulence Models for Improved Turbine Wake Mixing Prediction

Abstract This paper presents the development of accurate turbulence closures for wake mixing prediction by integrating a machine-learning approach with Reynolds Averaged Navier-Stokes (RANS)-based computational fluid dynamics (CFD). The data-driven modeling framework is based on the gene expression programming (GEP) approach previously shown to generate non-linear RANS models with good accuracy. To further improve the performance and robustness of the data-driven closures, here we exploit the fact that GEP produces tangible models to integrate RANS into the closure development process. Specifically, rather than using as the cost function a comparison of the GEP-based closure terms with a frozen high-fidelity dataset, each GEP model is instead automatically implemented into a RANS solver and the subsequent calculation results are compared with reference data. By first using a canonical turbine wake with inlet conditions prescribed based on high-fidelity data, we demonstrate that the CFD-driven machine-learning approach produces non-linear turbulence closures that are physically correct, i.e., closures that predict the right downstream wake development and maintain an accurate peak wake loss throughout the domain. We then extend our analysis to full turbine-blade cases and show that the model development is sensitive to the training region due to the presence of deterministic unsteadiness in the near-wake region. Models developed including this region have artificially large diffusion coefficients to over-compensate for the vortex shedding that steady RANS cannot capture. In contrast, excluding the near-wake region in the model development produces the correct physical model behavior, but predictive accuracy in the near-wake remains unsatisfactory. We show that this can be remedied by using the physically consistent models in unsteady RANS, implying that the non-linear closure producing the best predictive accuracy depends on whether it will be deployed in RANS or unsteady RANS calculations. Overall, the models developed with the CFD-assisted machine learning approach were found to be robust and capture the correct physical behavior across different operating conditions.
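
The 'CFD-driven' cost function described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `run_toy_solver` is a hypothetical 1D diffusion march standing in for a RANS run, a single scalar `nu_t` stands in for a candidate GEP closure, and the reference profile is generated by the same toy solver rather than taken from high-fidelity data. The point is only the shape of the loop: each candidate is run through the solver and the solver output, not the closure term itself, is compared with the reference.

```python
import math


def run_toy_solver(nu_t, n_steps=200, n_points=41, dx=0.05, dt=0.001):
    """Hypothetical stand-in for a RANS run (NOT a RANS solver).

    Marches a Gaussian wake deficit forward with explicit 1D diffusion;
    nu_t plays the role of the candidate closure.
    """
    centre = n_points // 2
    u = [math.exp(-(((i - centre) * dx) / 0.2) ** 2) for i in range(n_points)]
    r = nu_t * dt / dx ** 2  # must stay below 0.5 for this explicit scheme
    for _ in range(n_steps):
        # Interior points diffuse; the two end points are held fixed.
        u = [u[0]] + [
            u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
            for i in range(1, n_points - 1)
        ] + [u[-1]]
    return u


def cfd_driven_fitness(nu_t, reference):
    """Mean squared error between the solver output and the reference wake."""
    prediction = run_toy_solver(nu_t)
    return sum((p - q) ** 2 for p, q in zip(prediction, reference)) / len(reference)


# Assumed reference: produced by the toy solver itself with nu_t = 0.5.
reference = run_toy_solver(0.5)

# Assumed candidate "closures"; a GEP run would evolve expressions instead.
candidates = [0.1, 0.3, 0.5, 0.8]
scores = {c: cfd_driven_fitness(c, reference) for c in candidates}
best = min(scores, key=scores.get)
print(best)  # 0.5, the value used to build the reference
```

In a real setting the solver call dominates the cost of each fitness evaluation, which is the trade-off this kind of approach accepts in exchange for models that are assessed on the quantity of interest.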


INTEGRATION OF MACHINE LEARNING AND COMPUTATIONAL FLUID DYNAMICS TO DEVELOP TURBULENCE MODELS FOR IMPROVED LOW-PRESSURE TURBINE WAKE MIXING PREDICTION

Journal of Turbomachinery ◽

10.1115/1.4051417 ◽

2021 ◽

pp. 1-12

Author(s):

Harshal D Akolekar ◽

Yaomin Zhao ◽

Richard Sandberg ◽

Roberto Pacciani

Keyword(s):

Machine Learning ◽

Fluid Dynamics ◽

Computational Fluid Dynamics ◽

Model Development ◽

Learning Approach ◽

Near Wake ◽

Low Pressure Turbine ◽

Machine Learning Approach ◽

Turbulence Closures ◽

Turbine Wake

Abstract This paper presents the development of accurate turbulence closures for low-pressure turbine (LPT) wake mixing prediction by integrating a machine-learning approach based on gene expression programming (GEP) with Reynolds Averaged Navier-Stokes (RANS)-based computational fluid dynamics (CFD). To further improve the performance and robustness of GEP-based data-driven closures, the fitness of each model is evaluated by running RANS calculations in an integrated way, instead of evaluating an algebraic function. Using a canonical turbine wake with inlet conditions prescribed based on high-fidelity data of the T106A cascade, we demonstrate that the ‘CFD-driven’ machine-learning approach produces physically correct non-linear turbulence closures, i.e., closures that predict the right downstream wake development and maintain an accurate peak wake loss throughout the domain. We then extend our analysis to full turbine blade cases and show that the model development is sensitive to the training region due to the presence of deterministic unsteadiness in the near-wake. Models developed including the near-wake have artificially large diffusion coefficients to over-compensate for the vortex shedding that steady RANS cannot capture. In contrast, excluding the near-wake in the model development produces the correct physical model behavior, but predictive accuracy in the near-wake remains unsatisfactory. This can be remedied by using the physically consistent models in unsteady RANS. Overall, the ‘CFD-driven’ models were found to be robust and capture the correct physical wake mixing behavior across different LPT operating conditions and airfoils such as T106C and PakB.


Application of data driven machine learning approach for modelling of non-linear filtration through granular porous media

International Journal of Heat and Mass Transfer ◽

10.1016/j.ijheatmasstransfer.2021.121650 ◽

2021 ◽

Vol 179 ◽

pp. 121650

Author(s):

Ashes Banerjee ◽

Srinivas Pasupuleti ◽

Koushik Mondal ◽

M. Mousavi Nezhad

Keyword(s):

Machine Learning ◽

Porous Media ◽

Data Driven ◽

Learning Approach ◽

Machine Learning Approach ◽

Non Linear ◽

Granular Porous Media ◽

Linear Filtration


A supervised machine learning approach for taxonomic relation recognition through non-linear enumerative structures

Proceedings of the 30th Annual ACM Symposium on Applied Computing - SAC '15 ◽

10.1145/2695664.2695988 ◽

2015 ◽

Author(s):

Jean-Philippe Fauconnier ◽

Mouna Kamel ◽

Bernard Rothenburger

Keyword(s):

Machine Learning ◽

Supervised Machine Learning ◽

Learning Approach ◽

Taxonomic Relation ◽

Machine Learning Approach ◽

Non Linear


Comparative analysis of a deep learning approach with various classification techniques for credit score computation

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200721004720 ◽

2020 ◽

Vol 13 ◽

Author(s):

Arvind Pandey ◽

Shipra Shukla ◽

Krishna Kumar Mohbey

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Comparative Analysis ◽

Predictive Accuracy ◽

Credit Scoring ◽

Gradient Boosting ◽

Learning Approach ◽

Probit Regression ◽

Credit Score ◽

Machine Learning Approach

Background: Large financial companies are perpetually creating and updating customer scoring techniques. From a risk management view, the predictive accuracy of the estimated probability of default is of more vital importance than the traditional binary classification result, i.e., non-credible versus credible customers. Customers' default payments in Taiwan are explored as the case study. Objective: The aim is to compare the predictive accuracy of the probability of default across various statistical and machine learning techniques. Method: In this paper, nine predictive models are compared, of which the results of six models are taken into consideration. A comparative analysis of deep learning-based H2O, XGBoost, logistic regression, gradient boosting, naïve Bayes, logit model, and probit regression is performed. Software tools such as R and SAS (University Edition) are employed for machine learning and statistical model evaluation. Results: Through the experimental study, we demonstrate that XGBoost performs better than the other AI and ML algorithms. Conclusion: A machine learning approach such as XGBoost can be used effectively for credit scoring, among other data mining and statistical approaches.


Pipe thinning model development for direct current potential drop data with machine learning approach

Nuclear Engineering and Technology ◽

10.1016/j.net.2019.10.004 ◽

2020 ◽

Vol 52 (4) ◽

pp. 784-790 ◽

Cited By ~ 1

Author(s):

Kyungha Ryu ◽

Taehyun Lee ◽

Dong-cheon Baek ◽

Jong-won Park

Keyword(s):

Machine Learning ◽

Direct Current ◽

Model Development ◽

Potential Drop ◽

Learning Approach ◽

Machine Learning Approach ◽

Direct Current Potential Drop ◽

Current Potential


A Multimodality Machine Learning Approach to Differentiate Severe and Nonsevere COVID-19: Model Development and Validation

Journal of Medical Internet Research ◽

10.2196/23948 ◽

2021 ◽

Vol 23 (4) ◽

pp. e23948

Author(s):

Yuanfang Chen ◽

Liu Ouyang ◽

Forrest S Bao ◽

Qian Li ◽

Lei Han ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Laboratory Test ◽

Predictive Accuracy ◽

Learning Approach ◽

Test Results ◽

Laboratory Test Results ◽

Machine Learning Approach ◽

Forest Models ◽

Random Forest Models

Background Effectively and efficiently diagnosing patients who have COVID-19 with the accurate clinical type of the disease is essential to achieve optimal outcomes for the patients as well as to reduce the risk of overloading the health care system. Currently, severe and nonsevere COVID-19 types are differentiated by only a few features, which do not comprehensively characterize the complicated pathological, physiological, and immunological responses to SARS-CoV-2 infection in the different disease types. In addition, these type-defining features may not be readily testable at the time of diagnosis. Objective In this study, we aimed to use a machine learning approach to understand COVID-19 more comprehensively, accurately differentiate severe and nonsevere COVID-19 clinical types based on multiple medical features, and provide reliable predictions of the clinical type of the disease. Methods For this study, we recruited 214 confirmed patients with nonsevere COVID-19 and 148 patients with severe COVID-19. The clinical characteristics (26 features) and laboratory test results (26 features) upon admission were acquired as two input modalities. Exploratory analyses demonstrated that these features differed substantially between two clinical types. Machine learning random forest models based on all the features in each modality as well as on the top 5 features in each modality combined were developed and validated to differentiate COVID-19 clinical types. Results Using clinical and laboratory results independently as input, the random forest models achieved >90% and >95% predictive accuracy, respectively. 
The importance scores of the input features were further evaluated, and the top 5 features from each modality were identified (age, hypertension, cardiovascular disease, gender, and diabetes for the clinical features modality, and dimerized plasmin fragment D, high sensitivity troponin I, absolute neutrophil count, interleukin 6, and lactate dehydrogenase for the laboratory testing modality, in descending order). Using these top 10 multimodal features as the only input instead of all 52 features combined, the random forest model was able to achieve 97% predictive accuracy. Conclusions Our findings shed light on how the human body reacts to SARS-CoV-2 infection as a unit and provide insights on effectively evaluating the disease severity of patients with COVID-19 based on more common medical features when gold standard features are not available. We suggest that clinical information can be used as an initial screening tool for self-evaluation and triage, while laboratory test results should be applied when accuracy is the priority.


Interpretable Machine Learning for COVID-19 Diagnosis Through Clinical Variables

10.48011/asba.v2i1.1590 ◽

2020 ◽

Author(s):

Lucas M. Thimoteo ◽

Marley M. Vellasco ◽

Jorge M. do Amaral ◽

Karla Figueiredo ◽

Cátia Lie Yokoyama ◽

...

Keyword(s):

Machine Learning ◽

Linear Model ◽

Linear Models ◽

Learning Approach ◽

Clinical Variables ◽

Interpretable Machine Learning ◽

Machine Learning Approach ◽

Feature Importance ◽

Non Linear ◽

The Difference

This work proposes an interpretable machine learning approach to diagnose suspected COVID-19 cases based on clinical variables. Results obtained for the proposed models have an F-2 measure above 0.80 and accuracy above 0.85. Interpretation of the linear model feature importance brought insights about the most relevant features. Shapley Additive Explanations were used in the non-linear models. They were able to show the difference between positive and negative patients as well as offer a sense of the models' global interpretability.


A Machine Learning Approach in Analyzing Bioaccumulation of Heavy Metals in Turbot Tissues

Molecules ◽

10.3390/molecules25204696 ◽

2020 ◽

Vol 25 (20) ◽

pp. 4696

Author(s):

Ștefan-Mihai Petrea ◽

Mioara Costache ◽

Dragoș Cristea ◽

Ștefan-Adrian Strungaru ◽

Ira-Adeline Simionov ◽

...

Keyword(s):

Machine Learning ◽

Heavy Metals ◽

Heavy Metal ◽

Liver Tissue ◽

Linear Models ◽

Prediction Models ◽

Learning Approach ◽

Machine Learning Approach ◽

Non Linear ◽

Liver Tissues

Metals are considered to be one of the most hazardous substances due to their potential for accumulation, magnification, persistence, and wide distribution in water, sediments, and aquatic organisms. Demersal fish species, such as turbot (Psetta maxima maeotica), are accepted by the scientific community as suitable bioindicators of heavy metal pollution in the aquatic environment. The present study uses a machine learning approach, based on multiple linear and non-linear models, to estimate the concentrations of heavy metals in both turbot muscle and liver tissues. For multiple linear regression (MLR) models, the stepwise method was used, while non-linear models were developed by applying the random forest (RF) algorithm. The models were based on data drawn from the scientific literature for 11 heavy metals (As, Ca, Cd, Cu, Fe, K, Mg, Mn, Na, Ni, Zn) in both muscle and liver tissues of turbot specimens. Significant MLR models were recorded for Ca, Fe, Mg, and Na in muscle tissue and K, Cu, Zn, and Na in turbot liver tissue. Non-linear tree-based RF prediction models (over 70% prediction accuracy) were identified for As, Cd, Cu, K, Mg, and Zn in muscle tissue and As, Ca, Cd, Mg, and Fe in turbot liver tissue. Both the MLR and the non-linear tree-based RF prediction models were found to be suitable for predicting the heavy metal concentration in both turbot muscle and liver tissues. The models can be used to improve the knowledge base and economic efficiency of linked heavy metal food safety and environmental pollution studies.


Parameterization of the collision-coalescence process using series of basis functions: COLNETv1.0.0 model development using a machine learning approach

10.5194/gmd-2021-125 ◽

2021 ◽

Author(s):

Camilo Fernando Rodríguez-Genó ◽

Léster Alfonso

Keyword(s):

Machine Learning ◽

Model Development ◽

Distribution Functions ◽

Basis Functions ◽

Learning Approach ◽

Coalescence Process ◽

Total Moment ◽

Wide Range ◽

Machine Learning Approach ◽

Distribution Moments

Abstract. A parameterization for the collision-coalescence process is presented, based on the methodology of basis functions. The whole drop spectrum is depicted as a linear combination of two lognormal distribution functions, in which all distribution parameters are formulated by means of six distribution moments included in a system of equations, thus eliminating the need to fix any parameters. This basis functions parameterization avoids the classification of drops in artificial categories such as cloud water (cloud droplets) or rain water (raindrops). The total moment tendencies are calculated using a machine learning approach, in which one deep neural network was trained for each of the total moment orders involved. The neural networks were trained using randomly generated data following a uniform distribution, over a wide range of parameters employed by the parameterization. An analysis of the predicted total moment errors was performed, aimed at establishing the accuracy of the parameterization at reproducing the integrated distribution moments representative of physical variables. The applied machine learning approach shows a good accuracy level when compared to the output of an explicit collision-coalescence model.
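
As a worked illustration of why six moments suffice, the two-lognormal representation can be written out explicitly. The notation below is an assumption introduced here, not taken from the abstract: N_i, \mu_i and \sigma_i denote the number concentration, log-mean and log-standard deviation of each mode, and D is the drop diameter. The closed-form moment expression is the standard result for a lognormal distribution; the abstract does not state which six moment orders are used, so they are left generic.

```latex
% Drop spectrum as a sum of two lognormal modes (six free parameters)
f(D) = \sum_{i=1}^{2} \frac{N_i}{\sqrt{2\pi}\,\sigma_i D}
       \exp\!\left[-\frac{(\ln D - \mu_i)^2}{2\sigma_i^2}\right]

% k-th total moment of the spectrum
M_k = \int_0^{\infty} D^k f(D)\,\mathrm{d}D
    = \sum_{i=1}^{2} N_i \exp\!\left(k\mu_i + \tfrac{1}{2}k^2\sigma_i^2\right)
```

Evaluating M_k at six distinct orders gives six equations in the six unknowns (N_1, \mu_1, \sigma_1, N_2, \mu_2, \sigma_2), which is consistent with the abstract's statement that no parameter needs to be fixed in advance.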
