Antibacterial Activity Prediction Model of Traditional Chinese Medicine Based on Combined Data-Driven Approach and Machine Learning Algorithm: Constructed and Validated

2021 ◽  
Vol 12 ◽  
Author(s):  
Jin-Tong Li ◽  
Ya-Wen Wei ◽  
Meng-Yu Wang ◽  
Chun-Xiao Yan ◽  
Xia Ren ◽  
...  

Traditional Chinese medicines (TCMs), a unique natural medicine resource, have long been used in China to prevent and treat bacterial diseases. To provide a prediction model for screening antibacterial TCMs in support of the design and discovery of novel antibacterial agents, the literature on antibacterial TCMs in the China National Knowledge Infrastructure (CNKI) and Web of Science databases was retrieved, and the data were extracted and standardized. A total of 28,786 pieces of data from 904 antibacterial TCMs were collected, with plant medicines contributing the most data. Association rule mining showed a high correlation between antibacterial activity and cold nature, bitter and sour tastes, and hemostatic and fire-purging efficacies. Moreover, TCMs with antibacterial activity showed a specific aggregation in the phylogenetic tree; 92% of them came from Tracheophyta, of which 74% were mainly concentrated in rosids, asterids, Liliopsida, and Ranunculales. The prediction models of anti-Escherichia coli and anti-Staphylococcus aureus activity, with AUC values (the area under the ROC curve) of 77.5% and 80.0%, respectively, were constructed by the Neural Networks (NN) algorithm after Bagged Classification and Regression Tree (Bagged CART) and Linear Discriminant Analysis (LDA) selection. The in vitro experimental results showed that the prediction accuracy of these two models was 75% and 60%, respectively. Four TCMs (Cirsii Japonici Herba Carbonisata, Changii Radix, Swertiae Herba, Callicarpae Formosanae Folium) were reported for the first time to show antibacterial activity against E. coli and/or S. aureus. The results implied that the prediction model of the antibacterial activity of TCMs based on properties and families has a certain predictive ability, which is significant for the screening of antibacterial TCMs and can be used to discover novel antibacterial agents.
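The AUC figures quoted in the abstract above can be read as the probability that a randomly chosen active sample is scored higher than a randomly chosen inactive one. The following is a minimal sketch of that pairwise definition, not the study's own code; the function name `roc_auc` and the toy labels and scores are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: iterable of 0/1 ground-truth classes (1 = active).
    scores: iterable of predicted scores, higher = more likely active.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 pairs are ranked correctly
```

The quadratic pair loop is kept for readability; a rank-based computation would be preferable for large data sets.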

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Serena Cabaro ◽  
Vittoria D’Esposito ◽  
Tiziana Di Matola ◽  
Silvia Sale ◽  
Michele Cennamo ◽  
...  

In Europe, multiple waves of infections with SARS-CoV-2 (COVID-19) have been observed. Here, we have investigated whether common patterns of cytokines could be detected in individuals with mild and severe forms of COVID-19 in two pandemic waves, and whether a machine learning approach could be useful to identify the best predictors. An increasing trend of multiple cytokines was observed in patients with mild or severe/critical symptoms of COVID-19, compared with healthy volunteers. Linear Discriminant Analysis (LDA) clearly recognized the three groups based on cytokine patterns. Classification and Regression Tree (CART) further indicated that IL-6 discriminated controls and COVID-19 patients, whilst IL-8 defined disease severity. During the second wave of the pandemic, a less intense cytokine storm was observed than during the first. IL-6 was the most robust predictor of infection and discriminated moderate COVID-19 patients from healthy controls, regardless of epidemic peak curve. Thus, serum cytokine patterns provide biomarkers useful for COVID-19 diagnosis and prognosis. Further definition of individual cytokines may make it possible to envision novel therapeutic options and pave the way for innovative diagnostic tools.
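The CART result above (one cytokine separating controls from patients) comes down to finding a threshold on a single variable that makes the two sides as pure as possible. The sketch below shows one such split chosen by Gini impurity, which is the usual CART criterion for classification; the marker values and group labels are hypothetical and are not the study's measurements.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best rule `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate two equal values
        threshold = (lo + hi) / 2.0
        left = [y for v, y in pairs if v <= threshold]
        right = [y for v, y in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical marker levels in arbitrary units, for illustration only.
marker = [2.0, 3.5, 4.0, 12.0, 15.5, 20.0]
group = ["control", "control", "control", "patient", "patient", "patient"]
threshold, impurity = best_split(marker, group)
```

A full CART tree repeats this search recursively on each side of the split and over every candidate variable.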


Energies ◽  
2019 ◽  
Vol 12 (24) ◽  
pp. 4786
Author(s):  
Yanpeng Hao ◽  
Zhaohong Yao ◽  
Junke Wang ◽  
Hao Li ◽  
Ruihai Li ◽  
...  

Icing forecasting for transmission lines is of great significance for anti-icing strategies in power grids, but existing prediction models have disadvantages such as limited applicability, weak generalization, and a lack of global prediction ability. To overcome these shortcomings, this paper proposes a new concept of a segmental icing prediction model for transmission lines, in which the classification of icing processes plays a crucial role. To obtain the classification, a hierarchical K-means clustering method is used and 11 characteristic parameters are proposed. Based on this method, 97 icing processes derived from the Icing Monitoring System of China Southern Power Grid are clustered into six categories according to their curve shape, and abstracted icing evolution curves are drawn from the cluster centroids. The results show that the processes of ice events are probably different and that an icing process can be considered a combination of several segments and nodes, which reinforces the proposed concept of the segmental icing prediction model. Because they are based on monitoring data and clustering, the obtained types of icing evolution are more comprehensive and specific; the work lays the foundation for model construction and contributes to other fields.
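The clustering step described above can be illustrated with plain K-means (Lloyd's algorithm). This is a simplified stand-in, not the paper's hierarchical variant, whose details and 11 characteristic parameters are not given here; the two-dimensional points and the evenly spaced initialisation are assumptions chosen to keep the sketch deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain K-means on tuples of floats; returns (centroids, labels)."""
    # Deterministic initialisation: pick k points spaced evenly through the list.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c])) for p in points]
    return centroids, labels


# Hypothetical feature vectors (two made-up shape parameters per icing process).
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
```

Each centroid plays the role of the "abstracted" curve for its category: it is the average of the feature vectors assigned to that cluster.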


2020 ◽  
Vol 13 (12) ◽  
pp. 431
Author(s):  
Beatriz Suay-Garcia ◽  
Antonio Falcó ◽  
J. Ignacio Bueso-Bordils ◽  
Gerardo M. Anton-Fos ◽  
M. Teresa Pérez-Gracia ◽  
...  

Drug repurposing is an increasingly popular tool in the search for new treatment options against bacteria. In this paper, a tree-based classification method using Linear Discriminant Analysis (LDA) and discrete indexes was used to create a QSAR (Quantitative Structure-Activity Relationship) model to predict antibacterial activity against Escherichia coli. The model consists of a hierarchical decision tree in which a discrete index is used to divide compounds into groups according to their values for that index in order to construct probability spaces. The second step consists of calculating a discriminant function, which determines the prediction of the model. The model was used to screen the DrugBank database, identifying 134 drugs as possible antibacterial candidates. Out of these 134 drugs, 8 were antibacterial drugs, 67 were drugs approved for different pathologies and 55 were drugs in experimental stages. This methodology has proven to be a viable alternative to the traditional methods used to obtain prediction models based on LDA, and its application provides interesting new drug candidates to be studied as repurposed antibacterial treatments. Furthermore, the topological indexes Nclass and Numhba have proven able to group active compounds effectively, which suggests a close relationship between them and the antibacterial activity of compounds against E. coli.


2020 ◽  
Vol 74 ◽  
pp. 05024
Author(s):  
Lucia Svabova ◽  
Lucia Michalkova

Prediction models that reveal the threat of financial difficulties in companies are created by applying various multivariate statistical methods. In general, prediction models serve to classify a company as prosperous or non-prosperous, or to quantify the probability of financial difficulties in the company. In many countries around the world, real financial data about companies are used to develop these prediction models. In Slovakia, standard data from the financial statements and annual reports of Slovak companies are used to create company failure models. Since the data files in this case are generally large, the data must be pre-processed with selected methods before the prediction model is constructed. A database of companies needs to be prepared for the subsequent application of statistical methods, and it is also highly appropriate to focus on detecting potential extreme and remote observations. Therefore, the article focuses on quantifying the impact of the detected data structure, for example the occurrence of extreme and remote observations in the data set, on the overall classification ability of the resulting models.


2021 ◽  
Vol 1 (4) ◽  
pp. 268-280
Author(s):  
Bamanga Mahmud Ahmad ◽  
Ahmadu Asabe Sandra ◽  
Musa Yusuf Malgwi ◽  
Dahiru I. Sajoh

Machine learning techniques are commonly used in clinical decision support systems to identify and predict different diseases. Heart disease is the leading cause of death for both men and women around the world, which makes it one of the most critical concerns in the medical domain, and several researchers have developed intelligent medical devices to support these systems and to enhance the ability to diagnose and predict heart diseases. However, few studies look at the capabilities of ensemble methods in developing a heart disease detection and prediction model. In this study, the researchers assessed how to use an ensemble model, which offers more stable performance than a single base learning algorithm and leads to better results than other heart disease prediction models. Patient heart disease data records were extracted from the University of California, Irvine (UCI) Machine Learning Repository. To achieve the aim of this study, the researchers developed a meta-algorithm. According to the results of the experiments, the ensemble model is a superior solution in terms of predictive accuracy and the reliability of diagnostic output. An ensemble heart disease prediction model is also presented in this work as a valuable, cost-effective, and timely predictive option with a user-friendly graphical user interface that is scalable and expandable. Based on the findings, the researchers suggest that Bagging is the best ensemble classifier to adopt as the extended algorithm, given its high prediction probability score in the implementation of heart disease prediction.
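Bagging, the ensemble method recommended above, trains each base learner on a bootstrap sample (drawn with replacement) and combines the learners by majority vote. The sketch below illustrates the idea with one-feature decision stumps as base learners; the stump, the toy data, and all function names are assumptions for illustration, since the abstract does not state which base learners or features the study used.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-feature threshold classifier and return it as a predict function."""
    majority = Counter(ys).most_common(1)[0][0]
    best = None  # (errors, threshold, left_label, right_label)
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # the sample holds a single distinct x value
        return lambda x: majority
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label


def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n indices with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Aggregate the base learners by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical one-feature data: label 1 marks the "disease" class.
xs = [1, 2, 3, 4, 5, 11, 12, 13, 14, 15]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
models = fit_bagging(xs, ys)
```

Because each stump sees a slightly different resample, individual thresholds vary, and the vote averages out that variance; this is the "more stable performance" the abstract attributes to ensembles.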


2020 ◽  
pp. 002029402098140
Author(s):  
Jiale Ding ◽  
Guochu Chen ◽  
Yongmin Huang ◽  
Zhiquan Zhu ◽  
Kuo Yuan ◽  
...  

In this paper, a short-term wind speed prediction model, called CEEMDAN-SE-Improved PIO-GRNN, is proposed to improve the accuracy of short-term wind speed forecasts. The model is built on a General Regression Neural Network (GRNN) combined with three algorithms: Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN), Sample Entropy (SE), and Pigeon Inspired Optimization (PIO). First, the original wind speed sequence is decomposed by CEEMDAN into several subsequences of different complexity. Then, the complexity of each subsequence is judged by SE, and similar subsequences are combined into a new sequence to reduce the scale of calculation. Afterwards, the GRNN model optimized by improved PIO is used to predict each new sequence. Finally, the predicted results are superposed to give the final predicted value. The wind speed data of a wind field in north China over 30 days are predicted with four models: GRNN, CEEMDAN-GRNN, Improved PIO-GRNN, and the proposed CEEMDAN-SE-Improved PIO-GRNN. Comparing the prediction curves of the different models with the curve of the actual wind speed shows that the CEEMDAN-SE-Improved PIO-GRNN model achieves the best fit and the smallest error. Specifically, compared with the single prediction model GRNN, the mean squared error (MSE), mean absolute error (MAE), and weighted mean absolute percentage error (WMAPE) decrease by 0.6222, 0.3334, and 8.5766%, respectively. Meanwhile, the Diebold-Mariano (DM) test shows that the prediction ability of the two models is significantly different. These results indicate that the proposed model considerably improves the precision of short-term wind speed prediction.
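The three error measures named above can be computed as follows. This is a minimal sketch: MSE and MAE are standard, while WMAPE is assumed here to be the common form, the sum of absolute errors divided by the sum of absolute actual values, because the abstract does not spell out the paper's exact definition. The wind speed values are made up.

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def wmape(actual, predicted):
    """Weighted MAPE in percent: sum|a - p| / sum|a| (assumed definition)."""
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WMAPE is undefined when all actual values are zero")
    return 100.0 * sum(abs(a - p) for a, p in zip(actual, predicted)) / total


# Hypothetical wind speeds in m/s, for illustration only.
actual = [4.0, 6.0, 10.0]
predicted = [5.0, 6.0, 8.0]

errors = {
    "MSE": mse(actual, predicted),      # (1 + 0 + 4) / 3
    "MAE": mae(actual, predicted),      # (1 + 0 + 2) / 3
    "WMAPE": wmape(actual, predicted),  # 100 * 3 / 20
}
```

Unlike plain MAPE, this weighted form does not divide by individual actual values, so near-zero wind speeds do not dominate the score.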


Equilibrium ◽  
2017 ◽  
Vol 12 (4) ◽  
pp. 775-791 ◽  
Author(s):  
Maria Kovacova ◽  
Tomas Kliestik

Research background: Prediction of bankruptcy has been an issue of interest to various researchers and practitioners since the first study dedicated to this topic was published in 1932. Finding a suitable bankruptcy prediction model is a task for economists and analysts from all over the world. Despite the large number of models that have been created with different methods in an effort to achieve the best results, it is still challenging to predict bankruptcy risk, as corporations have become more global and more complex. Purpose of the article: The aim of the presented study is to construct, via an empirical study of relevant literature and the application of suitably chosen mathematical statistical methods, models for bankruptcy prediction of Slovak companies and to compare the overall prediction ability of the two developed models. Methods: The research was conducted on a data set of Slovak corporations covering the year 2015, and two mathematical statistical methods were applied. The methods are logit and probit, which are both symmetric binary choice models, also known as conditional probability models. On the other hand, these methods show some significant differences in the process of model formation, as well as in the achieved results. Findings & Value added: Given that mostly discriminant analysis and logistic regression are used for the construction of bankruptcy prediction models, we have focused our attention on the development of a bankruptcy prediction model in the Slovak Republic via logistic regression and probit. The results of the study suggest that the model based on the logit function slightly outperforms the probit model in classification accuracy. Differences were also found in the most significant predictors of bankruptcy identified by the two types of models constructed for Slovak companies.
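Logit and probit differ only in the link function that maps a linear score to a probability: the logistic CDF for logit and the standard normal CDF for probit. The sketch below shows both links and their shared symmetry, F(-z) = 1 - F(z); the coefficients, intercept, and financial-ratio values are hypothetical and are not estimates from the study.

```python
import math


def logistic_cdf(z):
    """Logit link: logistic distribution CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_cdf(z):
    """Probit link: standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bankruptcy_probability(features, coefficients, intercept, link):
    """Apply a binary choice model: P(bankrupt) = link(intercept + sum(b_i * x_i))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return link(z)


# Hypothetical financial ratios and coefficients, for illustration only.
features = [0.2, 1.5]
coefficients = [-2.0, 0.8]
intercept = -0.5

p_logit = bankruptcy_probability(features, coefficients, intercept, logistic_cdf)
p_probit = bankruptcy_probability(features, coefficients, intercept, normal_cdf)
```

With identical coefficients the two links give different probabilities because the distributions have different variances; in practice each model is fitted separately, so the estimated coefficients differ in scale and the fitted probabilities tend to be close.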


2021 ◽  
Author(s):  
Serena Cabaro ◽  
Vittoria D'Esposito ◽  
Tiziana Di Matola ◽  
Silvia Sale ◽  
Michele Cennamo ◽  
...  

In Europe, two waves of infections with SARS-CoV-2 (COVID-19) have been observed to date. Here, we have investigated whether common patterns of cytokines could be detected in individuals with mild and severe forms of COVID-19 in the two pandemic waves, and whether a machine learning approach could be useful to identify the best predictors. An increasing trend of multiple cytokines was observed in patients with mild or severe/critical symptoms of COVID-19, compared with healthy volunteers. Linear Discriminant Analysis (LDA) clearly recognized the three groups based on cytokine patterns. Classification and Regression Tree (CART) further indicated that IL-6 discriminated controls and COVID-19 patients, whilst IL-8 defined disease severity. During the second wave of the pandemic, a less intense cytokine storm was observed. CART analysis revealed that IL-6 was the most robust predictor of infection and discriminated moderate COVID-19 patients from healthy controls, regardless of epidemic peak curve. Thus, serum cytokine patterns provide non-invasive biomarkers useful for COVID-19 diagnosis and prognosis. Further definition of individual cytokines may make it possible to envision novel therapeutic options and pave the way for innovative diagnostic tools.


Author(s):  
Ram B. Gurung ◽  
Tony Lindgren ◽  
Henrik Boström

Being able to accurately predict the impending failures of truck components is often associated with significant cost savings, customer satisfaction, and flexibility in maintenance service plans. However, because of the diversity in the way trucks are typically configured and their usage under different conditions, creating accurate prediction models is not an easy task. This paper describes an effort to create such a prediction model for the NOx sensor, i.e., a component measuring the emitted level of nitrogen oxide in the exhaust of the engine. This component was chosen because it is vital for the truck to function properly, while at the same time being very fragile and costly to repair. As input to the model, technical specifications of trucks and their operational data are used. The process of collecting the data and preparing it for training the model via a slightly modified Random Forest learning algorithm is described, along with various challenges encountered during this process. The operational data consist of features represented as histograms, posing an additional challenge for the data analysis task. In the study, a modified version of the random forest algorithm is employed, which exploits the fact that the individual bins in the histograms are related, in contrast to the standard approach, which would treat the bins as independent features. Experiments are conducted using the updated random forest algorithm, and they clearly show that the modified version is indeed beneficial when compared to the standard random forest algorithm. The performance of the resulting prediction model for the NOx sensor is promising, and the model may be adopted for the benefit of operators of heavy trucks.

