Approximate Large-scale Multiple Kernel k-means Using Deep Neural Network

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/419 ◽

2017 ◽

Cited By ~ 1

Author(s):

Yueqing Wang ◽

Xinwang Liu ◽

Yong Dou ◽

Rongchun Li

Keyword(s):

Neural Network ◽

Large Scale ◽

Deep Neural Network ◽

Computational Cost ◽

Approximate Algorithm ◽

Small Subset ◽

Great Success ◽

Data Set ◽

Multiple Kernel ◽

Multiple Kernel Clustering

Multiple kernel clustering (MKC) algorithms have been extensively studied and applied to various applications. Although they demonstrate great success in both the theoretical aspects and applications, existing MKC algorithms cannot be applied to large-scale clustering tasks due to: i) the heavy computational cost to calculate the base kernels; and ii) insufficient memory to load the kernel matrices. In this paper, we propose an approximate algorithm to overcome these issues, and to make it be applicable to large-scale applications. Specifically, our algorithm trains a deep neural network to regress the indicating matrix generated by MKC algorithms on a small subset, and then obtains the approximate indicating matrix of the whole data set using the trained network, and finally performs the $k$-means on the output of our network. By mapping features into indicating matrix directly, our algorithm avoids computing the full kernel matrices, which dramatically decreases the memory requirement. Extensive experiments show that our algorithm consumes less time than most comparatively similar algorithms, while it achieves comparable performance with MKC algorithms.

OnionNet-2: A Convolutional Neural Network Model for Predicting Protein-Ligand Binding Affinity Based on Residue-Atom Contacting Shells

Frontiers in Chemistry ◽

10.3389/fchem.2021.753002 ◽

2021 ◽

Vol 9 ◽

Author(s):

Zechen Wang ◽

Liangzhen Zheng ◽

Yang Liu ◽

Yuanyuan Qu ◽

Yong-Qiang Li ◽

...

Keyword(s):

Neural Network ◽

Ligand Binding ◽

Convolutional Neural Network ◽

Binding Affinity ◽

Binding Free Energy ◽

Computational Cost ◽

Scoring Function ◽

Quality Data ◽

Great Success ◽

Data Set

One key task in virtual screening is to accurately predict the binding affinity (△G) of protein-ligand complexes. Recently, deep learning (DL) has significantly increased the predicting accuracy of scoring functions due to the extraordinary ability of DL to extract useful features from raw data. Nevertheless, more efforts still need to be paid in many aspects, for the aim of increasing prediction accuracy and decreasing computational cost. In this study, we proposed a simple scoring function (called OnionNet-2) based on convolutional neural network to predict △G. The protein-ligand interactions are characterized by the number of contacts between protein residues and ligand atoms in multiple distance shells. Compared to published models, the efficacy of OnionNet-2 is demonstrated to be the best for two widely used datasets CASF-2016 and CASF-2013 benchmarks. The OnionNet-2 model was further verified by non-experimental decoy structures from docking program and the CSAR NRC-HiQ data set (a high-quality data set provided by CSAR), which showed great success. Thus, our study provides a simple but efficient scoring function for predicting protein-ligand binding free energy.

ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost

Chemical Science ◽

10.1039/c6sc05720a ◽

2017 ◽

Vol 8 (4) ◽

pp. 3192-3203 ◽

Cited By ~ 429

Author(s):

J. S. Smith ◽

O. Isayev ◽

A. E. Roitberg

Keyword(s):

Neural Network ◽

Force Field ◽

Deep Neural Network ◽

Organic Molecules ◽

Computational Cost ◽

Quantum Mechanical ◽

Data Set

We demonstrate how a deep neural network (NN) trained on a data set of quantum mechanical (QM) DFT calculated energies can learn an accurate and transferable atomistic potential for organic molecules containing H, C, N, and O atoms.

Deep Neural Network Based Predictions of Protein Interactions Using Primary Sequences

Molecules ◽

10.3390/molecules23081923 ◽

2018 ◽

Vol 23 (8) ◽

pp. 1923 ◽

Cited By ~ 20

Author(s):

Hang Li ◽

Xiu-Jun Gong ◽

Hua Yu ◽

Chang Zhou

Keyword(s):

Neural Network ◽

Protein Interactions ◽

Large Scale ◽

Deep Neural Network ◽

Short Term Memory ◽

Feature Engineering ◽

Great Success ◽

Learning Technology ◽

Short Term ◽

Related Sequence

Machine learning based predictions of protein–protein interactions (PPIs) could provide valuable insights into protein functions, disease occurrence, and therapy design on a large scale. The intensive feature engineering in most of these methods makes the prediction task more tedious and trivial. The emerging deep learning technology enabling automatic feature engineering is gaining great success in various fields. However, the over-fitting and generalization of its models are not yet well investigated in most scenarios. Here, we present a deep neural network framework (DNN-PPI) for predicting PPIs using features learned automatically only from protein primary sequences. Within the framework, the sequences of two interacting proteins are sequentially fed into the encoding, embedding, convolution neural network (CNN), and long short-term memory (LSTM) neural network layers. Then, a concatenated vector of the two outputs from the previous layer is wired as the input of the fully connected neural network. Finally, the Adam optimizer is applied to learn the network weights in a back-propagation fashion. The different types of features, including semantic associations between amino acids, position-related sequence segments (motif), and their long- and short-term dependencies, are captured in the embedding, CNN and LSTM layers, respectively. When the model was trained on Pan’s human PPI dataset, it achieved a prediction accuracy of 98.78% at the Matthew’s correlation coefficient (MCC) of 97.57%. The prediction accuracies for six external datasets ranged from 92.80% to 97.89%, making them superior to those achieved with previous methods. When performed on Escherichia coli, Drosophila, and Caenorhabditis elegans datasets, DNN-PPI obtained prediction accuracies of 95.949%, 98.389%, and 98.669%, respectively. The performances in cross-species testing among the four species above coincided in their evolutionary distances. However, when testing Mus Musculus using the models from those species, they all obtained prediction accuracies of over 92.43%, which is difficult to achieve and worthy of note for further study. These results suggest that DNN-PPI has remarkable generalization and is a promising tool for identifying protein interactions.

Adsorption Isotherm Predictions for Multiple Molecules in MOFs Using the Same Deep Learning Model

10.26434/chemrxiv.9894224.v1 ◽

2019 ◽

Author(s):

Ryther Anderson ◽

Achay Biong ◽

Diego Gómez-Gualdrón

Keyword(s):

Neural Network ◽

Machine Learning ◽

Molecular Simulation ◽

Large Scale ◽

Learning Model ◽

Operating Conditions ◽

Small Subset ◽

Screening Methods ◽

Large Set ◽

Metal Organic

<div>Tailoring the structure and chemistry of metal-organic frameworks (MOFs) enables the manipulation of their adsorption properties to suit specific energy and environmental applications. As there are millions of possible MOFs (with tens of thousands already synthesized), molecular simulation, such as grand canonical Monte Carlo (GCMC), has frequently been used to rapidly evaluate the adsorption performance of a large set of MOFs. This allows subsequent experiments to focus only on a small subset of the most promising MOFs. In many instances, however, even molecular simulation becomes prohibitively time consuming, underscoring the need for alternative screening methods, such as machine learning, to precede molecular simulation efforts. In this study, as a proof of concept, we trained a neural network as the first example of a machine learning model capable of predicting full adsorption isotherms of different molecules not included in the training of the model. To achieve this, we trained our neural network only on alchemical species, represented only by their geometry and force field parameters, and used this neural network to predict the loadings of real adsorbates. We focused on predicting room temperature adsorption of small (one- and two-atom) molecules relevant to chemical separations. Namely, argon, krypton, xenon, methane, ethane, and nitrogen. However, we also observed surprisingly promising predictions for more complex molecules, whose properties are outside the range spanned by the alchemical adsorbates. Prediction accuracies suitable for large-scale screening were achieved using simple MOF (e.g. geometric properties and chemical moieties), and adsorbate (e.g. forcefield parameters and geometry) descriptors. Our results illustrate a new philosophy of training that opens the path towards development of machine learning models that can predict the adsorption loading of any new adsorbate at any new operating conditions in any new MOF.</div>

An efficient technique for CT scan images classification of COVID-19

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201985 ◽

2020 ◽

pp. 1-14

Author(s):

Esraa Hassan ◽

Noha A. Hikal ◽

Samir Elmuogy

Keyword(s):

Neural Network ◽

Diagnostic Tool ◽

Deep Neural Network ◽

Data Augmentation ◽

Performance Metrics ◽

Classification Model ◽

Data Set ◽

Training Models ◽

The Earth ◽

Diagnostic Time

Nowadays, Coronavirus (COVID-19) considered one of the most critical pandemics in the earth. This is due its ability to spread rapidly between humans as well as animals. COVID_19 expected to outbreak around the world, around 70 % of the earth population might infected with COVID-19 in the incoming years. Therefore, an accurate and efficient diagnostic tool is highly required, which the main objective of our study. Manual classification was mainly used to detect different diseases, but it took too much time in addition to the probability of human errors. Automatic image classification reduces doctors diagnostic time, which could save human’s life. We propose an automatic classification architecture based on deep neural network called Worried Deep Neural Network (WDNN) model with transfer learning. Comparative analysis reveals that the proposed WDNN model outperforms by using three pre-training models: InceptionV3, ResNet50, and VGG19 in terms of various performance metrics. Due to the shortage of COVID-19 data set, data augmentation was used to increase the number of images in the positive class, then normalization used to make all images have the same size. Experimentation is done on COVID-19 dataset collected from different cases with total 2623 where (1573 training,524 validation,524 test). Our proposed model achieved 99,046, 98,684, 99,119, 98,90 In terms of Accuracy, precision, Recall, F-score, respectively. The results are compared with both the traditional machine learning methods and those using Convolutional Neural Networks (CNNs). The results demonstrate the ability of our classification model to use as an alternative of the current diagnostic tool.

Deep Neural Network for Gender-Based Violence Detection on Twitter Messages

Mathematics ◽

10.3390/math9080807 ◽

2021 ◽

Vol 9 (8) ◽

pp. 807

Author(s):

Carlos M. Castorena ◽

Itzel M. Abundez ◽

Roberto Alejo ◽

Everardo E. Granda-Gutiérrez ◽

Eréndira Rendón ◽

...

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

Gender Violence ◽

Strategic Decision ◽

Learning Approaches ◽

Gender Based Violence ◽

Data Set ◽

Violence Detection ◽

Governmental Institutions ◽

Gender Based

The problem of gender-based violence in Mexico has been increased considerably. Many social associations and governmental institutions have addressed this problem in different ways. In the context of computer science, some effort has been developed to deal with this problem through the use of machine learning approaches to strengthen the strategic decision making. In this work, a deep learning neural network application to identify gender-based violence on Twitter messages is presented. A total of 1,857,450 messages (generated in Mexico) were downloaded from Twitter: 61,604 of them were manually tagged by human volunteers as negative, positive or neutral messages, to serve as training and test data sets. Results presented in this paper show the effectiveness of deep neural network (about 80% of the area under the receiver operating characteristic) in detection of gender violence on Twitter messages. The main contribution of this investigation is that the data set was minimally pre-processed (as a difference versus most state-of-the-art approaches). Thus, the original messages were converted into a numerical vector in accordance to the frequency of word’s appearance and only adverbs, conjunctions and prepositions were deleted (which occur very frequently in text and we think that these words do not contribute to discriminatory messages on Twitter). Finally, this work contributes to dealing with gender violence in Mexico, which is an issue that needs to be faced immediately.

Large-Scale Water Quality Prediction with Integrated Deep Neural Network

Information Sciences ◽

10.1016/j.ins.2021.04.057 ◽

2021 ◽

Author(s):

Jing Bi ◽

Yongze Lin ◽

Quanxi Dong ◽

Haitao Yuan ◽

MengChu Zhou

Keyword(s):

Neural Network ◽

Water Quality ◽

Large Scale ◽

Deep Neural Network ◽

Quality Prediction ◽

Water Quality Prediction

RISC-V Virtual Platform-Based Convolutional Neural Network Accelerator Implemented in SystemC

Electronics ◽

10.3390/electronics10131514 ◽

2021 ◽

Vol 10 (13) ◽

pp. 1514

Author(s):

Seung-Ho Lim ◽

WoonSik William Suh ◽

Jin-Young Kim ◽

Sang-Young Cho

Keyword(s):

Neural Network ◽

Deep Learning ◽

Network Model ◽

Neural Network Model ◽

Deep Neural Network ◽

System Level ◽

Neural Network Models ◽

Data Set ◽

Embedded Device ◽

Virtual Platform

The optimization for hardware processor and system for performing deep learning operations such as Convolutional Neural Networks (CNN) in resource limited embedded devices are recent active research area. In order to perform an optimized deep neural network model using the limited computational unit and memory of an embedded device, it is necessary to quickly apply various configurations of hardware modules to various deep neural network models and find the optimal combination. The Electronic System Level (ESL) Simulator based on SystemC is very useful for rapid hardware modeling and verification. In this paper, we designed and implemented a Deep Learning Accelerator (DLA) that performs Deep Neural Network (DNN) operation based on the RISC-V Virtual Platform implemented in SystemC in order to enable rapid and diverse analysis of deep learning operations in an embedded device based on the RISC-V processor, which is a recently emerging embedded processor. The developed RISC-V based DLA prototype can analyze the hardware requirements according to the CNN data set through the configuration of the CNN DLA architecture, and it is possible to run RISC-V compiled software on the platform, can perform a real neural network model like Darknet. We performed the Darknet CNN model on the developed DLA prototype, and confirmed that computational overhead and inference errors can be analyzed with the DLA prototype developed by analyzing the DLA architecture for various data sets.

Image Classification Using Transfer Learning and Deep Learning

International Journal Of Engineering And Computer Science ◽

10.18535/ijecs/v10i9.4622 ◽

2021 ◽

Vol 10 (9) ◽

pp. 25394-25398

Author(s):

Chitra Desai

Keyword(s):

Neural Network ◽

Deep Learning ◽

Image Classification ◽

Transfer Learning ◽

Network Model ◽

Neural Network Model ◽

Large Scale ◽

Deep Neural Network ◽

Visual Recognition ◽

Classification Of Images

Deep learning models have demonstrated improved efficacy in image classification since the ImageNet Large Scale Visual Recognition Challenge started since 2010. Classification of images has further augmented in the field of computer vision with the dawn of transfer learning. To train a model on huge dataset demands huge computational resources and add a lot of cost to learning. Transfer learning allows to reduce on cost of learning and also help avoid reinventing the wheel. There are several pretrained models like VGG16, VGG19, ResNet50, Inceptionv3, EfficientNet etc which are widely used. This paper demonstrates image classification using pretrained deep neural network model VGG16 which is trained on images from ImageNet dataset. After obtaining the convolutional base model, a new deep neural network model is built on top of it for image classification based on fully connected network. This classifier will use features extracted from the convolutional base model.

Reduction of Computational Cost Using Two-Stage Deep Neural Network for Training for Denoising and Sound Source Identification

Trends in Applied Knowledge-Based Systems and Data Science - Lecture Notes in Computer Science ◽

10.1007/978-3-319-42007-3_49 ◽

2016 ◽

pp. 562-573 ◽

Cited By ~ 2

Author(s):

Takayuki Morito ◽

Osamu Sugiyama ◽

Satoshi Uemura ◽

Ryosuke Kojima ◽

Kazuhiro Nakadai

Keyword(s):

Neural Network ◽

Sound Source ◽

Source Identification ◽

Deep Neural Network ◽

Computational Cost ◽

Two Stage ◽

Sound Source Identification