Kernel-Based Machine Learning Models to Predict Mitigation Time During Cloud Security Attacks

2021 ◽  
Vol 17 (4) ◽  
pp. 75-88
Author(s):  
Padmaja Kadiri ◽  
Seshadri Ravala

Security threats are unforeseen attacks on the services offered by a cloud service provider. Depending on the type of attack, the cloud service and its associated features become unavailable. Mitigation time is an integral part of attack recovery. This paper explores the parameters that help predict the mitigation time after an attack on cloud services, and presents kernel-based machine learning models that predict the average mitigation time during security attacks. Analysis of the results shows that the kernel-based models achieve 87% accuracy in predicting the mitigation time. The paper also evaluates the kernel-based models against regression-based predictive models, with the regression model serving as the benchmark for predicting mitigation time in the wake of an attack.

2021 ◽  
Author(s):  
Norberto Sánchez-Cruz ◽  
Jose L. Medina-Franco

Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large number of structure-activity relationships that have not been exploited thus far for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different design, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the models reported herein have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible and easy-to-use web application.


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
J A Ortiz ◽  
R Morales ◽  
B Lledo ◽  
E Garcia-Hernandez ◽  
A Cascales ◽  
...  

Study question: Is it possible to predict the likelihood of an IVF embryo being aneuploid and/or mosaic using a machine learning algorithm?
Summary answer: There are paternal, maternal, embryonic and IVF-cycle factors that are associated with embryonic chromosomal status and can be used as predictors in machine learning models.
What is known already: The factors associated with embryonic aneuploidy have been extensively studied. Maternal age, and to a lesser extent male factor and ovarian stimulation, have been related to the occurrence of chromosomal alterations in the embryo. On the other hand, the main factors that may increase the incidence of embryo mosaicism have not yet been established. The models obtained using classical statistical methods to predict embryonic aneuploidy and mosaicism are not highly reliable. As an alternative to traditional methods, different machine and deep learning algorithms are being used to generate predictive models in different areas of medicine, including human reproduction.
Study design, size, duration: The study design is observational and retrospective. A total of 4654 embryos from 1558 PGT-A cycles were included (January 2017 to December 2020). The trophectoderm biopsies on D5, D6 or D7 blastocysts were analysed by NGS. Embryos with ≤25% aneuploid cells were considered euploid, those with 25-50% were classified as mosaic, and those with >50% as aneuploid. The variables of the PGT-A were recorded in a database from which predictive models of embryonic aneuploidy and mosaicism were developed.
Participants/materials, setting, methods: The main indications for PGT-A were advanced maternal age, abnormal sperm FISH and recurrent miscarriage or implantation failure. Embryo analyses were performed using Veriseq-NGS (Illumina). The software used to carry out all the analyses was R (RStudio). The library used to implement the different algorithms was caret. In the machine learning models, 22 predictor variables were introduced, which can be classified into 4 categories: maternal, paternal, embryonic and those specific to the IVF cycle.
Main results and the role of chance: The different couple, embryo and stimulation cycle variables were recorded in a database (22 predictor variables). Two different predictive models were built, one for aneuploidy and the other for mosaicism. The outcome variable was multi-class, since it included the segmental and whole chromosome alteration categories. The data frame was first preprocessed and the different classes to be predicted were balanced. 80% of the data were used for training the model and 20% were reserved for further testing. The classification algorithms applied include multinomial regression, neural networks, support vector machines, neighborhood-based methods, classification trees, gradient boosting, ensemble methods, Bayesian and discriminant analysis-based methods. The algorithms were optimized by minimizing the Log_Loss, which measures accuracy while penalizing misclassifications. The best predictive models were achieved with the XG-Boost and random forest algorithms. The AUC of the predictive model for aneuploidy was 80.8% (Log_Loss: 1.028) and for mosaicism 84.1% (Log_Loss: 0.929). The best predictor variables of the models were maternal age, embryo quality, day of biopsy and whether or not the couple had a history of pregnancies with chromosomopathies. The male factor played a relevant role only in the mosaicism model, not in the aneuploidy model.
Limitations, reasons for caution: Although the predictive models obtained can be very useful for estimating the probability of achieving euploid embryos in an IVF cycle, increasing the sample size and including additional variables could improve the models and thus increase their predictive capacity.
Wider implications of the findings: Machine learning can be a very useful tool in reproductive medicine, since it can allow the determination of factors associated with embryonic aneuploidies and mosaicism in order to establish a predictive model for both. Identifying couples at risk of embryo aneuploidy/mosaicism could help them benefit from the use of PGT-A.
Trial registration number: Not Applicable
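The Log_Loss criterion mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard definition of multi-class log loss (the mean negative log-probability assigned to the true class); the class indices and probabilities below are hypothetical and are not taken from the study, which was carried out in R with caret rather than Python.

```python
import math

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip the probability so that log(0) can never occur.
        p = min(max(row[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(y_true)

# Hypothetical three-class example; 0, 1, 2 are arbitrary class indices.
y_true = [0, 1, 2]
confident_right = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
confident_wrong = [[0.05, 0.9, 0.05], [0.9, 0.05, 0.05], [0.9, 0.05, 0.05]]

loss_right = multiclass_log_loss(y_true, confident_right)
loss_wrong = multiclass_log_loss(y_true, confident_wrong)
```

Confident but wrong predictions yield a much larger loss than confident correct ones, which is the sense in which the metric penalizes misclassifications.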


2021 ◽  
Author(s):  
Abderraouf Chemmakh ◽  
Ahmed Merzoug ◽  
Habib Ouadi ◽  
Abdelhak Ladmia ◽  
Vamegh Rasouli

One of the most critical parameters of CO2 injection (for EOR purposes) is the Minimum Miscibility Pressure (MMP). The determination of this parameter is crucial for the success of the operation. Different experimental, analytical, and statistical techniques are used to predict the MMP. Nevertheless, experimental techniques are costly and tedious, while correlations are valid only for specific reservoir conditions. The purpose of this paper is therefore to build machine learning models that predict the MMP efficiently and across a broad range of reservoir conditions. Two ML models are proposed for both pure CO2 and non-pure CO2 injection. A substantial amount of data collected from the literature is used in this work. The ANN and SVR-GA models have shown enhanced performance compared with existing correlations in the literature for both the pure and non-pure models, with a coefficient of R2 0.98, 0.93 and 0.96, 0.93 respectively, which confirms that the proposed models are reliable and ready to use.


2021 ◽  
Vol 9 ◽  
Author(s):  
Daniel Lowell Weller ◽  
Tanzy M. T. Love ◽  
Martin Wiedmann

Recent studies have shown that predictive models can supplement or provide alternatives to E. coli-testing for assessing the potential presence of food safety hazards in water used for produce production. However, these studies used balanced training data and focused on enteric pathogens. As such, research is needed to determine 1) if predictive models can be used to assess Listeria contamination of agricultural water, and 2) how resampling (to deal with imbalanced data) affects performance of these models. To address these knowledge gaps, this study developed models that predict nonpathogenic Listeria spp. (excluding L. monocytogenes) and L. monocytogenes presence in agricultural water using various combinations of learner (e.g., random forest, regression), feature type, and resampling method (none, oversampling, SMOTE). Four feature types were used in model training: microbial, physicochemical, spatial, and weather. “Full models” were trained using all four feature types, while “nested models” used between one and three types. In total, 45 full (15 learners*3 resampling approaches) and 108 nested (5 learners*9 feature sets*3 resampling approaches) models were trained per outcome. Model performance was compared against baseline models where E. coli concentration was the sole predictor. Overall, the machine learning models outperformed the baseline E. coli models, with random forests outperforming models built using other learners (e.g., rule-based learners). Resampling produced more accurate models than not resampling, with SMOTE models outperforming, on average, oversampling models. Regardless of resampling method, spatial and physicochemical water quality features drove accurate predictions for the nonpathogenic Listeria spp. and L. monocytogenes models, respectively. Overall, these findings 1) illustrate the need for alternatives to existing E. coli-based monitoring programs for assessing agricultural water for the presence of potential food safety hazards, and 2) suggest that predictive models may be one such alternative. Moreover, these findings provide a conceptual framework for how such models can be developed in the future with the ultimate aim of developing models that can be integrated into on-farm risk management programs. For example, future studies should consider using random forest learners, SMOTE resampling, and spatial features to develop models to predict the presence of foodborne pathogens, such as L. monocytogenes, in agricultural water when the training data is imbalanced.
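The oversampling option named in this abstract can be sketched as follows. This is a minimal illustration of plain random oversampling, assuming a toy dataset; the function name `random_oversample` and the data are hypothetical and do not reproduce the study's pipeline. SMOTE differs in that it synthesizes new minority samples by interpolating between neighbouring minority points instead of duplicating existing rows.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows belonging to this class in the original data.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical imbalanced toy data: five negatives, one positive.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
labels = [0, 0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)
```

Resampling of this kind is normally applied to the training split only, so that the held-out data keep the original class balance.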


2022 ◽  
pp. 205-224
Author(s):  
Dhiviya Ram

One of the most distinctive forms of contracting is found in cloud computing. Unlike other conventional methods, cloud computing has adopted a different approach to forming the binding contract that governs the cloud: the clickwrap agreement. A clickwrap agreement operates on a take-it-or-leave-it basis, in which end users have limited or no say over the contract that binds them during the use of cloud services. The terms found in the contract are often friendly to the cloud service provider and less favourable to the end user. This chapter examines the terms that are often found in cloud computing agreements and studies the benefits of adopting this contracting method. It reports a qualitative study comprising interviews of cloud service providers in Malaysia. The study therefore offers a novel approach that also provides insight into the cloud service provider's perspective on the clickwrap agreement.


2021 ◽  
Author(s):  
Maxime Langevin ◽  
Rodolphe Vuilleumier ◽  
Marc Bianciotto

Despite growing interest and success in automated in-silico molecular design, doubts remain regarding the ability of goal-directed generation algorithms to perform unbiased exploration of novel chemical spaces. A specific phenomenon has recently been highlighted: goal-directed generation guided by machine learning models produces molecules with high scores according to the optimization model, but low scores according to control models, even when trained on the same data distribution and the same target. In this work, we show that this worrisome behavior is actually due to issues with the predictive models and not the goal-directed generation algorithms. We show that with appropriate predictive models, this issue can be resolved, and the molecules generated have high scores according to both the optimization and the control models.


2019 ◽  
Vol 8 (3) ◽  
pp. 6217-6225

Nowadays the cloud is very useful for providing many IT services. These services are delivered over the internet and accessed globally. The cloud service provider ensures flexibility in provisioning and scaling of resources. The cloud services are completely managed by the cloud service provider (CSP), which ensures the end-to-end availability, reliability and security of the cloud resources. The exponential growth of cloud services has provided many opportunities but has also raised severe security concerns. The popularity of cloud service based applications is rapidly increasing, and as a result many security and legal issues are arising. In this paper we describe the existing access control methods and frameworks for securing cloud services. The concept of a modified reputation and attribute based access control system is discussed. In this approach, crowd reviewing is used to compute the credit value of users. A simulation experiment shows that the approach protects consistent users and restricts the access of inconsistent users in the cloud environment. This access control mechanism helps mitigate security concerns for cloud application services by automatically restricting malicious users from accessing resources. It is also helpful in authorizing users to access the cloud resources.


2020 ◽  
Vol 31 (4) ◽  
pp. 411-424
Author(s):  
Han Lai ◽  
Huchang Liao ◽  
Zhi Wen ◽  
Edmundas Kazimieras Zavadskas ◽  
Abdullah Al-Barakati

With the rapid growth of available online cloud services and providers for customers, the selection of cloud service providers plays a crucial role in on-demand service selection on a subscription basis. Selecting a suitable cloud service provider requires a careful analysis and a reasonable ranking method. In this study, an improved combined compromise solution (CoCoSo) method is proposed to identify the ranking of cloud service providers. We analyze the defects of the final aggregation operator in the original CoCoSo method, which ignores the equal importance of the three subordinate compromise scores, and employ the operator of “Linear Sum Normalization” to normalize the three subordinate compromise scores so as to make the results reasonable. In addition, we introduce a maximum variance optimization model which can increase the discrimination degree of evaluation results and avoid inconsistent ordering. A numerical example of the trust evaluation of cloud service providers is given to demonstrate the applicability of the proposed method. Furthermore, we perform sensitivity analysis and comparative analysis to justify the accuracy of the decision outcomes derived by the proposed method. The results of a discrimination test also indicate that the proposed method is more effective than the original CoCoSo method in identifying the subtle differences among alternatives.
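The “Linear Sum Normalization” step named in this abstract can be sketched as below, assuming its usual meaning of dividing each value by the sum of its column. The three score lists are hypothetical numbers chosen only to show scores on different scales; the sketch does not reproduce the paper's aggregation operator or its variance optimization model.

```python
def linear_sum_normalize(values):
    """Divide each value by the column total so the results sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize a column that sums to zero")
    return [v / total for v in values]

# Hypothetical compromise scores for four alternatives (one list per score).
score_a = [0.20, 0.30, 0.25, 0.25]
score_b = [2.4, 3.1, 2.0, 2.9]
score_c = [0.70, 1.00, 0.80, 0.90]

normalized = [linear_sum_normalize(col) for col in (score_a, score_b, score_c)]
column_sums = [sum(col) for col in normalized]
```

After normalization each score column sums to 1, so no single score dominates a later aggregation merely because of its scale.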


Author(s):  
Andy W. Chen

Background: Adverse drug reactions (ADRs) are a drug safety issue affecting more than two million people in the U.S. annually. The Food and Drug Administration (FDA) maintains a comprehensive database of reported adverse drug reactions known as FAERS (FDA Adverse Event Reporting System), providing a valuable resource for studying factors associated with ADRs. The goal of the project is to build predictive models to predict the outcome given patient characteristics and drug usage. The results can be valuable for health care practitioners by offering new knowledge on adverse drug reactions which can be used to improve decision making related to drug prescriptions. Methods: In this paper I present and discuss results from machine learning models used to predict outcomes of ADRs. Machine learning models are a popular set of models for prediction. They have gained attention recently and have been used in a variety of fields. They can be trained on existing data and retrained when new data become available. The trained models are then used to make predictions. Results: I find that the supervised learning models perform similarly within groups, with accuracy between 65% and 75% for predicting deaths and 70% to 75% for predicting hospitalizations. Across groups the models predict hospitalizations better than deaths. Conclusions: The predictive models I built achieve good accuracy. The results can potentially be improved when more data become available in the future.


Author(s):  
Bohdan M. Pavlyshenko

In this paper, we study the usage of machine learning models for sales time series forecasting. The effect of machine learning generalization has been considered. A stacking approach for building a regression ensemble of single models has been studied. The results show that using stacking techniques, we can improve the performance of predictive models for sales time series forecasting.
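The stacking idea described in this abstract can be sketched in a few lines: base models each produce a forecast, and a second-level regression learns how to combine them. Everything below is an assumption made for illustration (the two simple base forecasters, the no-intercept linear blend, and the toy sales series); it is not the author's setup.

```python
def naive_forecast(history):
    """Base model 1: repeat the last observed value."""
    return history[-1]

def moving_average_forecast(history, window=3):
    """Base model 2: mean of the most recent `window` values."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def fit_blend_weights(preds_a, preds_b, targets):
    """Least-squares weights for targets ~ w_a * a + w_b * b (no intercept),
    obtained by solving the 2x2 normal equations."""
    saa = sum(a * a for a in preds_a)
    sbb = sum(b * b for b in preds_b)
    sab = sum(a * b for a, b in zip(preds_a, preds_b))
    say = sum(a * y for a, y in zip(preds_a, targets))
    sby = sum(b * y for b, y in zip(preds_b, targets))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        return 0.5, 0.5  # base forecasts are collinear; fall back to averaging
    return (say * sbb - sby * sab) / det, (sby * saa - say * sab) / det

def sse(preds, actual):
    """Sum of squared errors."""
    return sum((p - y) ** 2 for p, y in zip(preds, actual))

# Hypothetical sales series; one-step-ahead forecasts use past values only.
sales = [10, 12, 13, 12, 15, 16, 18, 17, 20, 21]
start = 3
preds_naive = [naive_forecast(sales[:t]) for t in range(start, len(sales))]
preds_ma = [moving_average_forecast(sales[:t]) for t in range(start, len(sales))]
targets = sales[start:]

# Second-level (meta) model: linear blend of the base forecasts.
w_naive, w_ma = fit_blend_weights(preds_naive, preds_ma, targets)
stacked = [w_naive * a + w_ma * b for a, b in zip(preds_naive, preds_ma)]
next_forecast = w_naive * naive_forecast(sales) + w_ma * moving_average_forecast(sales)
```

On the window used to fit the weights, the blend can never have a larger squared error than either base model, because using one base model alone is a special case of the blend. Whether that advantage carries over to unseen periods has to be checked on held-out data.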

