Critical Assessment of Artificial Intelligence Methods for Prediction of hERG Channel Inhibition in the ‘Big Data’ Era

2020 ◽ 
Author(s):  
Vishal Babu Siramshetty ◽  
Dac-Trung Nguyen ◽  
Natalia J. Martinez ◽  
Anton Simeonov ◽  
Noel T. Southall ◽  
...  

The rise of novel artificial intelligence methods necessitates a comparison of this wave of new approaches with classical machine learning for a typical drug discovery project. Inhibition of the potassium ion channel whose alpha subunit is encoded by the human Ether-à-go-go-Related Gene (hERG) leads to a prolonged QT interval of the cardiac action potential and is a significant safety pharmacology target for the development of new medicines. Several computational approaches have been employed to develop prediction models for assessment of hERG liabilities of small molecules, including recent work using deep learning methods. Here we perform a comprehensive comparison of prediction models based on classical (random forests and gradient boosting) and modern (deep neural networks and recurrent neural networks) artificial intelligence methods. The training set (~9000 compounds) was compiled by integrating hERG bioactivity data from the ChEMBL database with experimental data generated from an in-house, high-throughput thallium flux assay. We utilized different molecular descriptors, including latent descriptors, which are real-valued continuous vectors derived from chemical autoencoders trained on a large chemical space (>1.5 million compounds). The models were prospectively validated on ~840 in-house compounds screened in the same thallium flux assay. The deep neural networks performed significantly better than the classical methods with the latent descriptors. The recurrent neural networks that operate on SMILES provided the highest model sensitivity. The best models were merged into a consensus model that offered superior performance compared to reference models from academic and commercial domains. Further, we shed light on the potential of artificial intelligence methods to exploit big data in chemistry and to generate novel chemical representations useful in predictive modeling and in tailoring new chemical space.
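
As a rough illustration of the classical baselines named above (random forests and gradient boosting), the sketch below trains both on a precomputed descriptor matrix and reports AUC and sensitivity. The descriptor matrix `X` and binary hERG labels `y` are random placeholders, not the paper's ~9000-compound training set or its latent descriptors.

```python
# Hedged sketch: classical baselines for a binary hERG inhibition classifier,
# assuming a precomputed descriptor matrix. X and y are placeholders only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, recall_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 256))      # placeholder descriptors (e.g. latent vectors)
y = rng.integers(0, 2, size=1000)     # placeholder binary hERG labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for name, model in [("random forest", RandomForestClassifier(n_estimators=500, random_state=0)),
                    ("gradient boosting", GradientBoostingClassifier(random_state=0))]:
    model.fit(X_tr, y_tr)
    prob = model.predict_proba(X_te)[:, 1]
    print(name,
          "AUC:", roc_auc_score(y_te, prob),
          "sensitivity:", recall_score(y_te, (prob > 0.5).astype(int)))
```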


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Dipendra Jha ◽  
Vishu Gupta ◽  
Logan Ward ◽  
Zijiang Yang ◽  
Christopher Wolverton ◽  
...  

The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to use deeper neural networks in a bid to boost model performance, but in practice this leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data are available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet), composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
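
A minimal sketch of the individual-residual idea described above (an identity shortcut around each fully connected layer, so gradients can bypass it in very deep stacks) is given below. It is not the published IRNet code; the layer sizes, depth, and the 145-dimensional input vector are illustrative assumptions.

```python
# Hedged sketch of "individual residual" learning for vector-based materials
# representations: a skip connection around each fully connected layer.
import torch
import torch.nn as nn

class ResidualFCBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.bn = nn.BatchNorm1d(dim)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.act(self.bn(self.fc(x)))  # identity shortcut around one layer

class IRNetSketch(nn.Module):
    def __init__(self, in_dim, hidden_dim=256, depth=17):
        super().__init__()
        self.stem = nn.Linear(in_dim, hidden_dim)
        self.blocks = nn.Sequential(*[ResidualFCBlock(hidden_dim) for _ in range(depth)])
        self.head = nn.Linear(hidden_dim, 1)      # scalar property prediction

    def forward(self, x):
        return self.head(self.blocks(self.stem(x)))

model = IRNetSketch(in_dim=145)                   # e.g. a composition-based feature vector
y_hat = model(torch.randn(8, 145))                # batch of 8 hypothetical materials
```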


With the evolution of artificial intelligence toward deep learning, machines have emerged that can convincingly mimic human conversation. Conversational software agents, commonly known as chatbots and driven by natural language processing, are among the best examples of such systems. The paper lists some existing popular chatbots along with their details, technical specifications, and functionalities. Research shows that many customers have experienced poor service from such agents, and generating meaningful, informative responses remains a demanding task because most chatbots are built on templates and hand-written rules. Current chatbot models therefore often fail to generate the required responses, which undermines the quality of conversation. Incorporating deep learning into these models can overcome this shortcoming with deep neural networks. The deep neural networks used for this purpose so far include stacked autoencoders, sparse autoencoders, and predictive sparse and denoising autoencoders; however, these DNNs are unable to handle big data involving large amounts of heterogeneous data, while the tensor autoencoder, which overcomes this drawback, is time-consuming. This paper proposes a chatbot that can handle big data in a manageable time.
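
For orientation, the sketch below shows one of the autoencoder variants mentioned above, a denoising autoencoder: the input is corrupted with noise and the network is trained to reconstruct the clean vector. The dimensions, noise level, and the placeholder utterance embeddings are assumptions for illustration, not the paper's model.

```python
# Hedged sketch of a denoising autoencoder; all sizes are illustrative.
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    def __init__(self, in_dim=512, code_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(), nn.Linear(128, in_dim))

    def forward(self, x):
        noisy = x + 0.1 * torch.randn_like(x)     # corrupt the input
        return self.decoder(self.encoder(noisy))  # reconstruct the clean input

model = DenoisingAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(32, 512)                          # placeholder utterance embeddings
loss = nn.functional.mse_loss(model(x), x)        # reconstruction loss against clean x
loss.backward()
opt.step()
```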


2022 ◽  
pp. 30-57
Author(s):  
Richard S. Segall

The purpose of this chapter is to illustrate how artificial intelligence (AI) technologies have been used for COVID-19 detection and analysis. Specifically, the use of neural networks (NN) and machine learning (ML) are described along with which countries are creating these techniques and how these are being used for COVID-19 diagnosis and detection. Illustrations of multi-layer convolutional neural networks (CNN), recurrent neural networks (RNN), and deep neural networks (DNN) are provided to show how these are used for COVID-19 detection and prediction. A summary of big data analytics for COVID-19 and some available COVID-19 open-source data sets and repositories and their characteristics for research and analysis are also provided. An example is also shown for artificial intelligence (AI) and neural network (NN) applications using real-time COVID-19 data.


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5657
Author(s):  
Iam Palatnik de Sousa ◽  
Marley M. B. R. Vellasco ◽  
Eduardo Costa da Silva

Problem: An application of explainable artificial intelligence methods to COVID CT-scan classifiers is presented. Motivation: It is possible that classifiers are using spurious artifacts in dataset images to achieve high performance, and such explainable techniques can help identify this issue. Aim: For this purpose, several approaches were used in tandem in order to create a complete overview of the classifications. Methodology: The techniques used included GradCAM, LIME, RISE, Squaregrid, and direct gradient approaches (Vanilla, Smooth, Integrated). Main results: Among the deep neural network architectures evaluated for this image classification task, VGG16 was shown to be most affected by biases towards spurious artifacts, while DenseNet was notably more robust against them. Further impacts: Results further show that small differences in validation accuracy can cause drastic changes in the explanation heatmaps for DenseNet architectures, indicating that small changes in validation accuracy may have large impacts on the biases learned by the networks. Notably, the strong performance metrics achieved by all these networks (accuracy, F1 score, and AUC all in the 80 to 90% range) could give users the erroneous impression that there is no bias; however, the analysis of the explanation heatmaps highlights the bias.
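
The sketch below illustrates only the simplest of the explanation methods listed above, a "vanilla" gradient saliency map: the gradient of the predicted class score with respect to the input pixels indicates which regions most influence the decision. The untrained VGG16 weights and the random image tensor are placeholders, not the study's fine-tuned classifiers or CT data.

```python
# Hedged sketch of a vanilla gradient saliency map for an image classifier.
import torch
from torchvision.models import vgg16

model = vgg16()        # untrained placeholder; the study used CT-scan classifiers
model.eval()

image = torch.randn(1, 3, 224, 224, requires_grad=True)   # placeholder CT slice
scores = model(image)
top_class = scores.argmax().item()
scores[0, top_class].backward()                            # gradient of the top class score

saliency = image.grad.abs().max(dim=1).values[0]           # (224, 224) per-pixel importance map
```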


Author(s):  
László Róbert Kolozsvári ◽  
Tamás Bérczes ◽  
András Hajdu ◽  
Rudolf Gesztelyi ◽  
Attila Tiba ◽  
...  

Objectives: The current form of severe acute respiratory syndrome called coronavirus disease 2019 (COVID-19), caused by a coronavirus (SARS-CoV-2), is a major global health problem. The aim of our study was to use the official epidemiological data to predict the possible outcomes of the COVID-19 pandemic using artificial intelligence (AI)-based recurrent neural networks (RNNs), and then to compare and validate the predicted and observed data. Materials and Methods: We used the publicly available datasets of the World Health Organization and Johns Hopkins University to create the training dataset, then used recurrent neural networks with gated recurrent units (Long Short-Term Memory, LSTM, units) to create two prediction models. Information collected in the first t time-steps was aggregated with a fully connected (dense) neural network layer and a subsequent regression output layer to determine the next predicted value. We used the root mean squared logarithmic error (RMSLE) to compare the predicted and observed data, then recalculated the predictions. Results: The results of our study underscore that the COVID-19 pandemic is probably a propagated-source epidemic, and therefore repeated peaks on the epidemic curve (rises in the daily number of newly diagnosed infections) are to be anticipated. The errors between the predicted and validated data and trends appear to be low. Conclusions: The influence of this pandemic is great worldwide and impacts our everyday lives. Decision makers in particular must be aware that even if strict public health measures are executed and sustained, future peaks of infections are possible. The AI-based predictions may be useful tools, and the models can be recalculated according to newly observed data to obtain a more precise forecast of the pandemic.
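
A minimal sketch of the described architecture follows: an LSTM over the first t daily values, a fully connected (dense) layer over the final hidden state, a regression output for the next value, and an RMSLE score. The layer sizes, window length, and training series are placeholders, not the WHO / Johns Hopkins data used in the study.

```python
# Hedged sketch of an LSTM + dense + regression-output forecaster scored with RMSLE.
import torch
import torch.nn as nn

class CaseForecaster(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.dense = nn.Linear(hidden, 16)
        self.out = nn.Linear(16, 1)             # next-day case count (regression)

    def forward(self, x):                       # x: (batch, t, 1) past daily counts
        _, (h_last, _) = self.lstm(x)
        return self.out(torch.relu(self.dense(h_last[-1])))

def rmsle(pred, target):
    # root mean squared logarithmic error between predicted and observed values
    return torch.sqrt(torch.mean((torch.log1p(pred.clamp(min=0)) - torch.log1p(target)) ** 2))

model = CaseForecaster()
x = torch.rand(4, 14, 1) * 1000                 # 4 placeholder 14-day windows
y = torch.rand(4, 1) * 1000                     # placeholder next-day observations
print(rmsle(model(x), y))
```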


2019 ◽  
Vol 73 (12) ◽  
pp. 1018-1023 ◽  
Author(s):  
Josep Arús-Pous ◽  
Mahendra Awale ◽  
Daniel Probst ◽  
Jean-Louis Reymond

Chemical space is a concept to organize molecular diversity by postulating that different molecules occupy different regions of a mathematical space where the position of each molecule is defined by its properties. Our aim is to develop methods to explicitly explore chemical space in the area of drug discovery. Here we review our implementations of machine learning in this project, including our use of deep neural networks to enumerate the GDB13 database from a small sample set, to generate analogs of drugs and natural products after training with fragment-size molecules, and to predict the polypharmacology of molecules after training with known bioactive compounds from ChEMBL. We also discuss visualization methods for big data as means to keep track and learn from machine learning results. Computational tools discussed in this review are freely available at http://gdb.unibe.ch and https://github.com/reymond-group.
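
For orientation, the sketch below shows the general shape of a character-level SMILES language model of the kind used to enumerate and generate molecules: an embedding, an LSTM, and a softmax over the next character. The tiny vocabulary and the single training string are placeholders, not the GDB13 or ChEMBL training sets; the released tools are at http://gdb.unibe.ch.

```python
# Hedged sketch of a character-level SMILES language model (generative RNN).
import torch
import torch.nn as nn

vocab = ["^", "$", "C", "N", "O", "(", ")", "=", "1", "2"]   # placeholder SMILES alphabet
stoi = {c: i for i, c in enumerate(vocab)}

class SmilesLM(nn.Module):
    def __init__(self, vocab_size, emb=32, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):                  # tokens: (batch, seq_len) of char indices
        out, _ = self.lstm(self.emb(tokens))
        return self.head(out)                   # logits for the next character at each step

model = SmilesLM(len(vocab))
seq = torch.tensor([[stoi[c] for c in "^CC(=O)N$"]])   # one encoded placeholder SMILES
logits = model(seq[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, len(vocab)), seq[:, 1:].reshape(-1))
```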


2021 ◽  
Author(s):  
Kanimozhi V ◽  
T. Prem Jacob

Although numerous deep learning models have been proposed, this research article investigates artificial deep learning models for smart IoT devices that provide cybersecurity over IoT network traffic, using the realistic IoT-23 dataset. This dataset is a recent network traffic dataset generated from real-time network traffic data of IoT appliances. IoT products are utilized in various applications such as home and commercial automation and various forms of wearable technology. IoT security is more critical than conventional network security because of its massive attack surface and the multiplied weak spots of IoT devices. Globally, the total number of IoT devices deployed by 2025 is forecast to reach 41.6 billion. Hence, IoT anomaly detection systems based on the realistic IoT-23 big data, for detecting IoT-based attacks with the artificial neural network architectures Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and Multilayer Perceptrons (MLP), have been implemented and evaluated in this research article. As a result, the Convolutional Neural Network produces outstanding performance, with an accuracy score of 0.998234 and a minimal loss of 0.008842, compared to the Multilayer Perceptron and Recurrent Neural Network for IoT anomaly detection. The article also presents model-accuracy and learning-curve plots for the deep learning algorithms MLP, CNN, and RNN.
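
As a rough illustration of the simplest of the three models compared above, the sketch below trains a multilayer perceptron anomaly classifier on tabular network-flow features. The feature matrix and labels are random placeholders, not the IoT-23 captures, and the layer sizes are assumptions.

```python
# Hedged sketch of an MLP anomaly classifier over tabular flow features.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 20))            # placeholder flow features (duration, bytes, ...)
y = rng.integers(0, 2, size=5000)          # 1 = malicious flow, 0 = benign (placeholder)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=200, random_state=42)
mlp.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, mlp.predict(X_te)))
```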


Author(s):  
Chen Qi ◽  
Shibo Shen ◽  
Rongpeng Li ◽  
Zhifeng Zhao ◽  
Qing Liu ◽  
...  

Nowadays, deep neural networks (DNNs) have been rapidly deployed to realize a number of functionalities such as sensing, imaging, classification, and recognition. However, the computation-intensive requirements of DNNs make them difficult to apply on resource-limited Internet of Things (IoT) devices. In this paper, we propose a novel pruning-based paradigm that aims to reduce the computational cost of DNNs by uncovering a more compact structure and learning the effective weights therein, without compromising the expressive capability of the DNN. In particular, our algorithm achieves efficient end-to-end training that directly transfers a redundant neural network into a compact one with a specifically targeted compression rate. We comprehensively evaluate our approach on various representative benchmark datasets and compare it with typical advanced convolutional neural network (CNN) architectures. The experimental results verify the superior performance and robust effectiveness of our scheme. For example, when pruning VGG on CIFAR-10, our proposed scheme is able to significantly reduce its FLOPs (floating-point operations) and number of parameters by 76.2% and 94.1%, respectively, while still maintaining a satisfactory accuracy. To sum up, our scheme could facilitate the integration of DNNs into the common machine-learning-based IoT framework and establish distributed training of neural networks in both cloud and edge.
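
For orientation only, the sketch below shows the basic operation of weight pruning, one-shot L1-magnitude pruning with torch.nn.utils.prune, which zeroes the smallest weights of a layer. This is not the paper's algorithm (which learns the compact structure end-to-end for a targeted compression rate); the layer and the 90% amount are illustrative assumptions.

```python
# Hedged sketch of generic magnitude pruning of a single layer.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(512, 512)
prune.l1_unstructured(layer, name="weight", amount=0.9)   # zero the 90% smallest weights
prune.remove(layer, "weight")                             # make the pruning mask permanent

sparsity = (layer.weight == 0).float().mean().item()
print(f"fraction of zeroed weights: {sparsity:.2%}")      # ~90%, i.e. far fewer effective parameters
```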


Author(s):  
Wael H. Awad ◽  
Bruce N. Janson

Three different modeling approaches were applied to explain truck accidents at interchanges in Washington State during a 27-month period. Three models were developed for each ramp type: linear regression, neural networks, and a hybrid system using fuzzy logic and neural networks. The study showed that linear regression was able to predict accident frequencies that fell within one standard deviation of the overall mean of the dependent variable; however, the coefficient of determination was very low in all cases. The other two artificial intelligence (AI) approaches showed a high level of performance in identifying different patterns of accidents in the training data and presented a better fit than the regression model. However, the ability of these AI models to predict test data that were not included in the training process showed unsatisfactory results.
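
The sketch below illustrates the kind of comparison described above: a linear regression and a small neural network fitted to the same accident-frequency data and compared by the coefficient of determination (R^2) on held-out ramps. The data are random placeholders, not the Washington State interchange records, and the feature set is an assumption.

```python
# Hedged sketch comparing linear regression and an MLP on accident-frequency data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 6))                        # placeholder ramp features (traffic volume, grade, ...)
y = rng.poisson(lam=3, size=300).astype(float)       # placeholder accident counts per ramp

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=7)
for name, model in [("linear regression", LinearRegression()),
                    ("neural network", MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=7))]:
    model.fit(X_tr, y_tr)
    print(name, "test R^2:", round(r2_score(y_te, model.predict(X_te)), 3))
```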

