The site‐specific selection of the infiltration model based on the global dataset and random forest algorithm

2021 ◽  
Author(s):  
Seongyun Kim ◽  
Gülay Karahan ◽  
Manan Sharma ◽  
Yakov Pachepsky
2020 ◽  
Vol 221 (Supplement_2) ◽  
pp. S263-S271 ◽  
Author(s):  
Peng Lan ◽  
Qiucheng Shi ◽  
Ping Zhang ◽  
Yan Chen ◽  
Rushuang Yan ◽  
...  

Abstract

Background. Hypervirulent Klebsiella pneumoniae (hvKP) infections can have high morbidity and mortality rates owing to their invasiveness and virulence. However, there are no effective tools or biomarkers to discriminate between hvKP and nonhypervirulent K. pneumoniae (nhvKP) strains. We aimed to use a random forest algorithm to predict hvKP based on core-genome data.

Methods. In total, 272 K. pneumoniae strains were collected from 20 tertiary hospitals in China and divided into hvKP and nhvKP groups according to clinical criteria. Clinical data comparisons, whole-genome sequencing, virulence profile analysis, and core genome multilocus sequence typing (cgMLST) were performed. We then established a random forest predictive model based on the cgMLST scheme to prospectively identify hvKP. The random forest is an ensemble learning method that generates multiple decision trees during training; each decision tree outputs its own prediction for a given input. The predictive ability of the model was assessed by the area under the receiver operating characteristic curve.

Results. Patients in the hvKP group were younger than those in the nhvKP group (median age, 58.0 and 68.0 years, respectively; P < .001). More patients in the hvKP group had underlying diabetes mellitus (43.1% vs 20.1%; P < .001). Clinically, carbapenem-resistant K. pneumoniae was less common in the hvKP group (4.1% vs 63.8%; P < .001), whereas the K1/K2 serotype, sequence type (ST) 23, and positive string tests were significantly more frequent in the hvKP group. A cgMLST-based minimal spanning tree revealed that hvKP strains were scattered sporadically within nhvKP clusters. ST23 showed greater genome diversification than did ST11, according to cgMLST-based allelic differences. Primary virulence factors (rmpA, iucA, positive string test result, and the presence of virulence plasmid pLVPK) were poor predictors of the hypervirulence phenotype. The random forest model based on the core genome allelic profile showed excellent predictive power in both the training and validation sets (area under the receiver operating characteristic curve, 0.987 and 0.999, respectively).

Conclusions. A random forest predictive model based on the core genome allelic profiles of K. pneumoniae accurately identified hypervirulent isolates.
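The ensemble idea described in the abstract above (many decision trees, each giving its own prediction, evaluated by area under the ROC curve) can be illustrated with a small sketch. This is not the authors' model: the data are synthetic binary features rather than cgMLST allelic profiles, and every name and parameter here (`make_data`, `build_tree`, `fit_forest`, the tree depth, the number of trees) is an assumption chosen for illustration.

```python
import random

def make_data(n, rng):
    """Synthetic binary features (hypothetical; NOT data from the study)."""
    rows, labels = [], []
    for _ in range(n):
        y = int(rng.random() < 0.5)
        # Features 0-2 agree with the label 85% of the time; features 3-9 are noise.
        x = [y if rng.random() < 0.85 else 1 - y for _ in range(3)]
        x += [rng.randint(0, 1) for _ in range(7)]
        rows.append(x)
        labels.append(y)
    return rows, labels

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def build_tree(rows, labels, rng, depth=0, max_depth=3, n_try=3):
    """Grow one tree; each split considers a random subset of n_try features."""
    if depth == max_depth or len(set(labels)) <= 1:
        return sum(labels) / len(labels)  # leaf: fraction of positives
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        left = [i for i, r in enumerate(rows) if r[f] == 0]
        right = [i for i, r in enumerate(rows) if r[f] == 1]
        if not left or not right:
            continue
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right])) / len(rows)
        if best is None or score < best[0]:
            best = (score, f, left, right)
    if best is None:
        return sum(labels) / len(labels)
    _, f, left, right = best
    grow = lambda idx: build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                  rng, depth + 1, max_depth, n_try)
    return (f, grow(left), grow(right))

def tree_predict(node, x):
    """Walk from the root to a leaf and return the leaf's positive fraction."""
    while isinstance(node, tuple):
        node = node[2] if x[node[0]] else node[1]
    return node

def fit_forest(rows, labels, n_trees, rng):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def forest_score(forest, x):
    """Average the per-tree predictions into one ensemble score."""
    return sum(tree_predict(t, x) for t in forest) / len(forest)

def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)
train_x, train_y = make_data(300, rng)
test_x, test_y = make_data(200, rng)
forest = fit_forest(train_x, train_y, 50, rng)
test_auc = auc([forest_score(forest, x) for x in test_x], test_y)
print(f"held-out AUC: {test_auc:.3f}")
```

Evaluating on a held-out set, as in the sketch, mirrors the abstract's separate training and validation AUC values; the numbers this toy example prints say nothing about the study's results.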


2014 ◽  
Vol 20 (5) ◽  
Author(s):  
P. Dohnalek ◽  
M. Dvorsky ◽  
P. Gajdos ◽  
L. Michalek ◽  
R. Sebesta ◽  
...  

Machines ◽  
2019 ◽  
Vol 7 (4) ◽  
pp. 69 ◽  
Author(s):  
Gino Iannace ◽  
Giuseppe Ciaburro ◽  
Amelia Trematerra

Wind energy is one of the most widely used renewable energy sources in the world and has grown rapidly in recent years. However, wind towers generate noise that is perceived as an annoyance by the population living near wind farms. It is therefore important to develop new tools that can help wind farm builders and administrations. In this study, measurements of the noise emitted by a wind farm and the data recorded by the supervisory control and data acquisition (SCADA) system were used to construct a prediction model. First, the acoustic measurements and control system data were analyzed to characterize the phenomenon. An appropriate number of observations were then extracted, and these data were pre-processed. Subsequently, two models for predicting sound pressure levels at the receiver were built: one based on multiple linear regression and one based on the Random Forest algorithm. The wind speeds measured near the wind turbines and the active power of the turbines were selected as predictors; both were measured by the SCADA system of the wind turbines. The model based on the Random Forest algorithm showed a high Pearson correlation coefficient (0.981), indicating strong agreement between predicted and measured values. This model can be extremely useful, both for the receiver and for the wind farm manager. From the results of the model it will be possible to establish the wind speed values at which the noise produced by the wind turbines becomes dominant. Furthermore, the predictive model can give an overview of the noise reaching the receiver from the system in different operating conditions. Finally, the prediction model does not require the shutdown of the plant, a very expensive procedure due to the consequent loss of production.
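As a rough illustration of the evaluation described above, the sketch below fits the multiple linear regression baseline on two predictors (wind speed and active power) and scores it with the Pearson correlation coefficient between predicted and measured levels. It is only a sketch under stated assumptions: the data are synthetic, the coefficients used to generate them are invented, and the function names (`fit_linear`, `predict_linear`, `pearson`) are hypothetical. The Random Forest model from the study is not reproduced here.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_linear(X, y):
    """Ordinary least squares via the normal equations, with an intercept."""
    Z = [[1.0] + list(row) for row in X]
    k = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(A, b)

def predict_linear(coef, row):
    """Intercept plus the weighted sum of the predictors."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

# Synthetic observations (assumed values, NOT measurements from the study):
# wind speed in m/s, active power in kW, sound pressure level in dB.
rng = random.Random(1)
X = [[rng.uniform(3, 15), rng.uniform(0, 2000)] for _ in range(200)]
y = [35 + 1.2 * w + 0.004 * p + rng.gauss(0, 0.5) for w, p in X]

coef = fit_linear(X[:150], y[:150])                    # fit on the first 150 rows
predicted = [predict_linear(coef, row) for row in X[150:]]
r = pearson(predicted, y[150:])                        # score on the held-out 50
print(f"coefficients: {coef}")
print(f"Pearson r on held-out data: {r:.3f}")
```

The same `pearson` call could score any other model's predictions against the measured levels, which is how the two models in the abstract can be compared on a common footing.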

