Proposed Improvements for Automated Chemical Safety Evaluations Using In-Silico Techniques

Author(s):  
Bryan Jordan

The vastness of chemical space restricts traditional drug-discovery methods to the rules of organic chemistry that guide the filtering of candidates. Using machine learning to intelligently generate compounds that meet a wide range of objectives can bring significant savings in the time and effort needed to filter through a broad range of candidates. This paper details how Generative Adversarial Networks, novel machine learning techniques for formatting the training dataset, and quantum computing offer new ways to expedite drug discovery.

2021 ◽  
Vol 1 ◽  
Author(s):  
Attayeb Mohsen ◽  
Lokesh P. Tripathi ◽  
Kenji Mizuguchi

Machine learning techniques are increasingly used in the analysis of clinical and omics data. This increase is primarily due to advancements in artificial intelligence (AI) and the build-up of health-related big data. In this paper we aim to estimate the likelihood of adverse drug reactions or events (ADRs) in the course of drug discovery using various machine learning methods. We also describe a novel machine learning-based framework for predicting the likelihood of ADRs. Our framework combines two distinct datasets, drug-induced gene expression profiles from Open TG–GATEs (Toxicogenomics Project–Genomics Assisted Toxicity Evaluation Systems) and ADR occurrence information from the FAERS (FDA [Food and Drug Administration] Adverse Events Reporting System) database, and can be applied to many different ADRs. It incorporates data filtering and cleaning as well as feature selection and hyperparameter fine-tuning. Using this framework with Deep Neural Networks (DNN), we built a total of 14 predictive models with a mean validation accuracy of 89.4%, indicating that our approach successfully and consistently predicted ADRs for a wide range of drugs. As case studies, we investigated the performance of our prediction models in the context of Duodenal ulcer and Hepatitis fulminant, highlighting mechanistic insights into those ADRs. We have generated predictive models to help assess the likelihood of ADRs in testing novel pharmaceutical compounds. We believe that our findings offer a promising approach for ADR prediction and will be useful for researchers in drug discovery.


Author(s):  
Ly Vu ◽  
Quang Uy Nguyen

Machine learning-based intrusion detection has become more popular in the research community thanks to its capability in discovering unknown attacks. To develop a good detection model for an intrusion detection system (IDS) using machine learning, a great number of attack and normal data samples are required in the learning process. While normal data can be relatively easy to collect, attack data is much rarer and harder to gather. Consequently, IDS datasets are often dominated by normal data, and machine learning models trained on those imbalanced datasets are ineffective in detecting attacks. In this paper, we propose a novel solution to this problem by using generative adversarial networks to generate synthesized attack data for IDS. The synthesized attacks are merged with the original data to form the augmented dataset. Three popular machine learning techniques are trained on the augmented dataset. The experiments conducted on three common IDS datasets and one dataset of our own show that machine learning algorithms achieve better performance when trained on the dataset augmented by the generative adversarial networks than when trained on the original dataset or on datasets produced by other sampling techniques. A visualization technique was also used to analyze the properties of the data synthesized by the generative adversarial networks and by the other techniques.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Andrew E. Blanchard ◽  
Christopher Stanley ◽  
Debsindhu Bhowmik

Abstract The process of drug discovery involves a search over the space of all possible chemical compounds. Generative Adversarial Networks (GANs) provide a valuable tool towards exploring chemical space and optimizing known compounds for a desired functionality. Standard approaches to training GANs, however, can result in mode collapse, in which the generator primarily produces samples closely related to a small subset of the training data. In contrast, the search for novel compounds necessitates exploration beyond the original data. Here, we present an approach to training GANs that promotes incremental exploration and limits the impacts of mode collapse using concepts from Genetic Algorithms. In our approach, valid samples from the generator are used to replace samples from the training data. We consider both random and guided selection along with recombination during replacement. By tracking the number of novel compounds produced during training, we show that updates to the training data drastically outperform the traditional approach, increasing potential applications for GANs in drug discovery.
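The replacement step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `update_training_set`, the `is_valid` callback, and the optional `score` function are hypothetical, the strings stand in for molecular representations, and recombination is left out. Random selection is used when no score is given; otherwise the lowest-scoring training samples are replaced first, as one possible reading of "guided selection".

```python
import random


def update_training_set(training, generated, is_valid, rng, score=None):
    """Replace training samples with valid, novel generator samples.

    Returns the updated training list and the number of novel samples used.
    """
    seen = set(training)
    fresh = []
    for sample in generated:
        # Keep only valid samples that are not already known.
        if is_valid(sample) and sample not in seen:
            seen.add(sample)
            fresh.append(sample)

    fresh = fresh[:len(training)]  # cannot replace more slots than exist
    updated = list(training)
    if score is None:
        # Random selection: overwrite randomly chosen training slots.
        slots = rng.sample(range(len(updated)), len(fresh))
    else:
        # Guided selection (assumed): overwrite the lowest-scoring slots.
        ranked = sorted(range(len(updated)), key=lambda i: score(updated[i]))
        slots = ranked[:len(fresh)]
    for slot, sample in zip(slots, fresh):
        updated[slot] = sample
    return updated, len(fresh)


# Hypothetical toy data; a real validity check would parse the molecule.
training = ["CCO", "CCN", "CCC"]
generated = ["CCCl", "", "CCO", "CCCl"]
updated, n_novel = update_training_set(
    training, generated, is_valid=bool, rng=random.Random(0)
)
print(updated, n_novel)
```

Counting `n_novel` at each update is one simple way to track how many novel compounds the generator has produced over the course of training.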


Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 467
Author(s):  
Daniel Heredia-Ductram ◽  
Miguel Nunez-del-Prado ◽  
Hugo Alatrista-Salas

In recent decades, the development of interconnectivity, pervasive systems, citizen sensors, and Big Data technologies has allowed us to gather large amounts of data from different sources worldwide. This phenomenon has raised privacy concerns around the globe, compelling states to enforce data protection laws. In parallel, privacy-enhancing techniques have emerged to meet regulatory requirements, allowing companies and researchers to exploit individual data in a privacy-aware way. Thus, data curators need to find the most suitable algorithms to meet a required trade-off between utility and privacy. This crucial task can take a lot of time since there is a lack of benchmarks on privacy techniques. To fill this gap, we compare classical privacy techniques such as Statistical Disclosure Control and Differential Privacy with more recent techniques such as Generative Adversarial Networks and Machine Learning Copies, using an entire commercial database. The obtained results allow us to show the evolution of privacy techniques and depict new uses of privacy-aware Machine Learning techniques.


2021 ◽  
Author(s):  
Raj Chaganti ◽  
Vinayakumar R ◽  
Mamoun Alazab ◽  
Tuan Pham

Malware is commonly distributed to the victim network through file attachments in phishing emails or through downloads of illegitimate files from the internet, when the victim interacts with the source of infection. To detect and prevent malware distribution on the victim machine, existing end-device security applications may leverage sophisticated techniques such as signature-based, anomaly-based, or machine learning techniques. The well-known file formats Portable Executable (PE) for Windows and Executable and Linkable Format (ELF) for Linux-based operating systems are used for malware analysis, and malware detection capabilities for these files are well advanced for real-time detection. However, detecting malware payloads hidden in multimedia, such as cover images, using steganography has been a challenge for enterprises, as these payloads are rarely seen and usually act as a stager in sophisticated attacks. In this article, to our knowledge, we are the first to try to address the knowledge gap between the current progress in academic research on image steganography and steganalysis, which focuses on data hiding, and the current status of stegomalware (malware payloads hidden in images) targeting enterprises with cyberattacks. We present the history of stegomalware, its generation tools, and a description of file format specifications. We then perform a detailed review of image steganography techniques, including recent Generative Adversarial Network (GAN) based models, and of image steganalysis methods, including Deep Learning. Based on our findings, we present the opportunities and challenges in stegomalware generation and detection.


2021 ◽  
Author(s):  
Arnabi Bej ◽  
Ujjwal Maulik ◽  
Anasua Sarkar

Abstract Probabilistic regression is a statistical technique and a crucial problem in the machine learning domain, which employs a set of machine learning methods to forecast a continuous target variable based on the value of one or multiple predictor variables. COVID-19 is a virulent virus that has brought the whole world to a standstill. The potential of the virus for human-to-human transmission makes the world a dangerous place. This thesis forecasts the upcoming course of the coronavirus to help subdue its spread. We have performed Conditional GAN regression to anticipate the subsequent COVID-19 cases of 5 countries. The GAN variant CGAN is used to design the model and predict the COVID-19 cases for three months ahead with the least error for the dataset provided. Each country is examined individually, due to their variation in population size, tradition, medical management, and preventive measures. The analysis is based on confirmed data, as provided by the World Health Organization. This paper investigates how conditional Generative Adversarial Networks (GANs) can be used to accurately represent intricate conditional distributions. GANs have achieved spectacular success in producing complex high-dimensional data, but work done on their use for regression problems is minimal. This paper shows how conditional GANs can be employed in probabilistic regression. It is shown that conditional GANs can be used to estimate a wide range of distributions and be competitive with existing probabilistic regression models.


2020 ◽  
Vol 10 (4) ◽  
pp. 1449
Author(s):  
Hansoo Lee ◽  
Jonggeun Kim ◽  
Eun Kyeong Kim ◽  
Sungshin Kim

Ground-based weather radars can observe a wide range with high spatial and temporal resolution. They benefit meteorological research and services by providing valuable information. Recent research on weather radar data has focused on applying machine learning and deep learning to solve complicated problems. It is well known that an adequate amount of data is a necessary condition in machine learning and deep learning. Generative adversarial networks (GANs) have received extensive attention for their remarkable data generation capacity since their fascinating competitive structure was proposed. Consequently, a massive number of variants have been proposed, and which model is adequate to solve a given problem is an inevitable concern. In this paper, we explore the problem of radar image synthesis and evaluate different GANs with authentic radar observation results. The experimental results showed that the improved Wasserstein GAN is more capable of generating similar radar images while achieving higher structural similarity results.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Khushnood Abbas ◽  
Alireza Abbasi ◽  
Shi Dong ◽  
Ling Niu ◽  
Laihang Yu ◽  
...  

Abstract Background Technological and research advances have produced large volumes of biomedical data. When represented as a network (graph), these data become useful for modeling entities and interactions in biological and similar complex systems. In the field of network biology and network medicine, there is a particular interest in predicting results from drug–drug, drug–disease, and protein–protein interactions to advance the speed of drug discovery. Existing data and modern computational methods make it possible to identify potentially beneficial and harmful interactions and therefore to narrow drug trials ahead of actual clinical trials. Such automated data-driven investigation relies on machine learning techniques. However, traditional machine learning approaches require extensive preprocessing of the data, which makes them impractical for large datasets. This study presents a wide range of machine learning methods for predicting outcomes from biomedical interactions and compares the performance of traditional methods with that of more recent network-based approaches. Results We applied 32 different network-based machine learning models to five commonly available biomedical datasets and evaluated their performance on three important evaluation metrics, namely AUROC, AUPR, and F1-score. We achieved this by casting the link prediction problem as a binary classification problem: existing links were treated as positive examples, and negative examples were randomly sampled from the set of non-existent links. After experimental evaluation we found that Prone, ACT and $LRW_5$ are the top three performers on all five datasets. Conclusions This work presents a comparative evaluation of network-based machine learning algorithms for predicting network links, with applications in the prediction of drug-target and drug–drug interactions, and applied well known network-based machine learning methods. Our work is helpful in guiding researchers in the appropriate selection of machine learning methods for pharmaceutical tasks.
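The evaluation setup in the Results, link prediction treated as binary classification with randomly sampled negatives, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the study's code: the toy graph is made up, all function names are hypothetical, and the common-neighbors count is a simple stand-in score, not one of the 32 evaluated models. A real evaluation would also hide the test links from the graph before scoring them, which this sketch skips for brevity.

```python
import itertools
import random


def sample_negative_edges(nodes, edges, k, rng):
    """Randomly draw k unlinked node pairs to serve as negative examples."""
    existing = {frozenset(e) for e in edges}
    candidates = [
        pair for pair in itertools.combinations(sorted(nodes), 2)
        if frozenset(pair) not in existing
    ]
    return rng.sample(candidates, k)


def common_neighbors(adjacency, u, v):
    """Stand-in link score: number of neighbors shared by u and v."""
    return len(adjacency[u] & adjacency[v])


def auroc(pos_scores, neg_scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    wins = sum(
        (p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Toy graph (hypothetical data): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
nodes = {n for edge in edges for n in edge}
adjacency = {n: set() for n in nodes}
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)

rng = random.Random(0)
negatives = sample_negative_edges(nodes, edges, len(edges), rng)

# Existing links are positives (label 1); sampled non-links are negatives (label 0).
pos_scores = [common_neighbors(adjacency, u, v) for u, v in edges]
neg_scores = [common_neighbors(adjacency, u, v) for u, v in negatives]
print("AUROC:", auroc(pos_scores, neg_scores))
```

Sampling as many negatives as positives keeps the two classes balanced, which matters for threshold-dependent metrics such as F1-score; AUPR could be computed from the same two score lists.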



