Separable Fully Connected Layers Improve Deep Learning Models For Genomics

2017 ◽  
Author(s):  
Amr Mohamed Alexandari ◽  
Avanti Shrikumar ◽  
Anshul Kundaje

Abstract: Convolutional neural networks are rapidly gaining popularity in regulatory genomics. Typically, these networks have a stack of convolutional and pooling layers, followed by one or more fully connected layers. In genomics, the same positional patterns are often present across multiple convolutional channels. Therefore, in current state-of-the-art networks, there exists significant redundancy in the representations learned by standard fully connected layers. We present a new separable fully connected layer that learns a weights tensor that is the outer product of positional weights and cross-channel weights, thereby allowing the same positional patterns to be applied across multiple convolutional channels. Decomposing positional and cross-channel weights further enables us to readily impose biologically-inspired constraints on positional weights, such as symmetry. We also propose a novel regularizer and constraint that act on curvature in the positional weights. Using experiments on simulated and in vivo datasets, we show that networks that incorporate our separable fully connected layer outperform conventional models with analogous architectures and the same number of parameters. Additionally, our networks are more robust to hyperparameter tuning, have more informative gradients, and produce importance scores that are more consistent with known biology than conventional deep neural networks.
Availability: Implementation: https://github.com/kundajelab/keras/tree/keras_1. A gist illustrating model setup is at goo.gl/gYooaa
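To make the layer's structure concrete, here is a minimal numpy sketch of the forward pass of a separable fully connected layer, in which the effective weight tensor is the outer product of positional and cross-channel weights. The function name, shapes, and toy dimensions are illustrative assumptions, not the authors' Keras implementation (which is in the linked repository).

```python
import numpy as np

def separable_fc_forward(x, w_pos, w_chan, bias):
    """Sketch of a separable fully connected layer.

    x      : (batch, positions, channels) output of the conv/pool stack
    w_pos  : (positions, units)  positional weights, shared across channels
    w_chan : (channels, units)   cross-channel weights, shared across positions
    bias   : (units,)

    The implicit full weight tensor is w_pos[p, u] * w_chan[c, u] (an outer
    product), so the layer stores positions*units + channels*units parameters
    instead of positions*channels*units.
    """
    return np.einsum("bpc,pu,cu->bu", x, w_pos, w_chan) + bias

# Toy usage: 100 positions, 32 conv channels, 16 output units (made-up sizes).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 100, 32))
out = separable_fc_forward(
    x,
    w_pos=rng.normal(size=(100, 16)),
    w_chan=rng.normal(size=(32, 16)),
    bias=np.zeros(16),
)
print(out.shape)  # (4, 16)
```

Because the positional weights live in their own factor, constraints such as symmetry can be imposed directly on w_pos, for example by averaging it with its positional reverse.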

2017 ◽  
Author(s):  
Žiga Avsec ◽  
Mohammadamin Barekatain ◽  
Jun Cheng ◽  
Julien Gagneur

Abstract:
Motivation: Regulatory sequences are not solely defined by their nucleic acid sequence but also by their relative distances to genomic landmarks such as the transcription start site, exon boundaries, or polyadenylation site. Deep learning has become the approach of choice for modeling regulatory sequences because of its strength in learning complex sequence features. However, modeling relative distances to genomic landmarks in deep neural networks has not been addressed.
Results: Here we developed spline transformation, a neural network module based on splines to flexibly and robustly model distances. Modeling distances to various genomic landmarks with spline transformations significantly increased the state-of-the-art prediction accuracy of in vivo RNA-binding protein binding sites for 114 out of 123 proteins. We also developed a deep neural network for human splice branchpoint prediction based on spline transformations that outperformed the current best, already distance-based, machine learning model. Compared to piecewise linear transformation, as obtained by composition of rectified linear units, spline transformation yields higher prediction accuracy as well as faster and more robust training. As spline transformation can be applied to further quantities beyond distances, such as methylation or conservation, we foresee it as a versatile component in the genomics deep learning toolbox.
Availability: Spline transformation is implemented as a Keras layer in the CONCISE python package: https://github.com/gagneurlab/concise. Analysis code is available at goo.gl/…
Contact: [email protected]; [email protected]
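As an illustration of the idea (not the CONCISE implementation itself), the sketch below builds a B-spline basis over a distance range with the Cox-de Boor recursion and takes a learned linear combination of the basis functions; the knot placement, spline degree, and toy distances are assumptions.

```python
import numpy as np

def bspline_basis(x, knots, degree=3):
    """B-spline basis matrix of shape (len(x), len(knots) - degree - 1),
    built with the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    # Degree-0 bases: indicators of the knot intervals.
    B = np.array([(x >= knots[i]) & (x < knots[i + 1])
                  for i in range(len(knots) - 1)], dtype=float).T
    for d in range(1, degree + 1):
        B_new = np.zeros((len(x), len(knots) - d - 1))
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * B[:, i] if left_den > 0 else 0.0
            right = (knots[i + d + 1] - x) / right_den * B[:, i + 1] if right_den > 0 else 0.0
            B_new[:, i] = left + right
        B = B_new
    return B

def spline_transform(distances, knots, weights, degree=3):
    """Smooth learned transform f(d) = basis(d) @ weights of a raw distance."""
    return bspline_basis(distances, knots, degree) @ weights

# Toy usage: distances up to 1000 bp, 12 equally spaced knots, cubic splines.
knots = np.linspace(0.0, 1000.0, 12)
w = np.random.default_rng(1).normal(size=len(knots) - 4)  # len(knots) - degree - 1 bases
print(spline_transform(np.array([5.0, 120.0, 640.0]), knots, w))
```

In a network, the basis matrix would be computed once for the observed distances and the weights learned jointly with the other parameters, which is what keeps training smooth compared with compositions of rectified linear units.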


2021 ◽  
Vol 18 (2) ◽  
pp. 40-55
Author(s):  
Lídio Mauro Lima Campos ◽  
◽  
Jherson Haryson Almeida Pereira ◽  
Danilo Souza Duarte ◽  
Roberto Célio Limão Oliveira ◽  
...  

The aim of this paper is to introduce a biologically inspired approach that can automatically generate deep neural networks with good prediction capacity, smaller error, and high tolerance to noise. To do this, three biological paradigms are used: Genetic Algorithms (GA), Lindenmayer Systems, and Deep Neural Networks (DNNs). The final sections of the paper present experiments investigating the method on forecasting energy prices in the Brazilian market. The proposed model performs multi-step-ahead price prediction (12, 24, and 36 weeks ahead). The results for MLP and LSTM networks show a good ability to predict peaks and satisfactory accuracy according to error measures when compared with other methods.
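The abstract does not give the encoding details, so the following is only a hypothetical sketch of how an L-system, whose rewriting rules a genetic algorithm would evolve, can expand into a string that is decoded as a sequence of hidden-layer widths; the symbols, rules, and widths are invented for illustration.

```python
# Hypothetical illustration only: an L-system whose rewriting rules (the
# "genome" a genetic algorithm would evolve) expand an axiom into a string of
# symbols, which is then decoded into a feed-forward architecture.
UNITS_PER_SYMBOL = {"A": 64, "B": 32, "C": 16}   # assumed symbol -> layer width

def expand(axiom, rules, iterations):
    """Apply the L-system rewriting rules to the axiom a fixed number of times."""
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

def decode(symbols):
    """Map each known symbol to a hidden-layer width, ignoring the rest."""
    return [UNITS_PER_SYMBOL[ch] for ch in symbols if ch in UNITS_PER_SYMBOL]

rules = {"A": "AB", "B": "AC"}                   # one candidate genome
print(decode(expand("A", rules, 3)))             # [64, 32, 64, 16, 64, 32, 16]
```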


Information ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 98 ◽  
Author(s):  
Tariq Ahmad ◽  
Allan Ramsay ◽  
Hanady Ahmed

Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, and the current state-of-the-art solutions use deep neural networks (DNNs), so it seems natural to expect such general-purpose machine learning algorithms to provide an effective approach. We describe an alternative approach, involving the use of probabilities to construct a weighted lexicon of sentiment terms, then modifying the lexicon and calculating optimal thresholds for each class. We show that this approach outperforms the use of DNNs and other standard algorithms. We believe that DNNs are not a panacea, and that paying attention to the nature of the data you are trying to learn from can be more important than trying out ever more powerful general-purpose machine learning algorithms.
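The abstract does not spell out the exact procedure, but the general recipe it describes can be sketched as follows: estimate per-term label probabilities from labelled documents, score a new document by summing its terms' weights for each class, and assign every label whose score clears a per-class threshold tuned on held-out data. All names, counts, and thresholds below are illustrative.

```python
import numpy as np
from collections import defaultdict

def build_lexicon(docs, labels, n_labels):
    """Probability-weighted lexicon: for each term, the distribution of labels
    of the documents it occurs in (a simple stand-in for the paper's weights)."""
    counts = defaultdict(lambda: np.zeros(n_labels))
    for tokens, doc_labels in zip(docs, labels):
        for term in set(tokens):
            for label in doc_labels:
                counts[term][label] += 1
    return {term: c / c.sum() for term, c in counts.items()}

def score(tokens, lexicon, n_labels):
    """Sum the lexicon weights of the document's terms, per label."""
    s = np.zeros(n_labels)
    for term in tokens:
        s += lexicon.get(term, 0.0)
    return s

def predict(tokens, lexicon, thresholds):
    """Multi-label decision: keep every label whose score clears its threshold."""
    s = score(tokens, lexicon, len(thresholds))
    return [label for label, (v, t) in enumerate(zip(s, thresholds)) if v >= t]

# Toy corpus: label 0 = joy, label 1 = anger (invented data).
docs = [["great", "fun"], ["sad", "awful"], ["great", "awful"]]
labels = [[0], [1], [0, 1]]
lex = build_lexicon(docs, labels, n_labels=2)
print(predict(["great", "fun", "awful"], lex, thresholds=[0.8, 0.8]))  # [0, 1]
```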


Electronics ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 78 ◽  
Author(s):  
Zidi Qin ◽  
Di Zhu ◽  
Xingwei Zhu ◽  
Xuan Chen ◽  
Yinghuan Shi ◽  
...  

As a key ingredient of deep neural networks (DNNs), fully-connected (FC) layers are widely used in various artificial intelligence applications. However, FC layers contain many parameters, so their efficient processing is restricted by memory bandwidth. In this paper, we propose a compression approach combining block-circulant matrix-based weight representation and power-of-two quantization. Applying block-circulant matrices in FC layers reduces the storage complexity from O(k²) to O(k). By quantizing the weights into integer powers of two, the multiplications in inference can be replaced by shift and add operations. The memory usage of models for MNIST, CIFAR-10 and ImageNet can be compressed by 171×, 2731×, and 128×, respectively, with minimal accuracy loss. A configurable parallel hardware architecture is then proposed for processing the compressed FC layers efficiently. Without multipliers, a block matrix-vector multiplication module (B-MV) is used as the computing kernel. The architecture is flexible enough to support FC layers of various compression ratios with a small footprint. At the same time, memory accesses can be significantly reduced by the configurable architecture. Measurement results show that the accelerator has a processing power of 409.6 GOPS and achieves 5.3 TOPS/W energy efficiency at 800 MHz.
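As a rough illustration of the two ingredients (not the proposed B-MV hardware, which replaces multiplications with shifts and adds), the numpy sketch below quantizes weights to signed powers of two and evaluates a block-circulant FC layer while storing only the first row of each k×k block, which is where the O(k²) to O(k) storage reduction comes from. The block size and exponent range are assumptions.

```python
import numpy as np

def quantize_pow2(w, min_exp=-7, max_exp=0):
    """Round each weight to a signed power of two (zeros stay zero), so that a
    hardware multiply can become a shift; the exponent range is assumed."""
    exp = np.clip(np.round(np.log2(np.abs(w) + 1e-12)), min_exp, max_exp)
    q = np.sign(w) * 2.0 ** exp
    q[w == 0] = 0.0
    return q

def circulant_matvec(first_row, x):
    """y = C @ x for the circulant matrix C defined by its first row.
    Only k values are stored per k x k block: the O(k^2) -> O(k) saving."""
    first_col = np.concatenate(([first_row[0]], first_row[:0:-1]))
    return np.real(np.fft.ifft(np.fft.fft(first_col) * np.fft.fft(x)))

def block_circulant_fc(x, blocks):
    """FC layer whose weight matrix is a grid of circulant blocks.
    blocks[i][j] holds the first row of block (i, j); x is split into k-chunks."""
    k = len(blocks[0][0])
    xs = x.reshape(-1, k)
    return np.concatenate([
        sum(circulant_matvec(blocks[i][j], xs[j]) for j in range(xs.shape[0]))
        for i in range(len(blocks))
    ])

# Toy usage: a 6 -> 6 layer made of 2 x 2 circulant blocks with k = 3.
rng = np.random.default_rng(0)
x = rng.normal(size=6)
blocks = [[quantize_pow2(rng.normal(size=3)) for _ in range(2)] for _ in range(2)]
print(block_circulant_fc(x, blocks))
```

The FFT here is only a convenient way to evaluate the circulant product in software; the accelerator described in the paper performs the equivalent computation with shift and add operations.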


2019 ◽  
Vol 35 (14) ◽  
pp. i501-i509 ◽  
Author(s):  
Hossein Sharifi-Noghabi ◽  
Olga Zolotareva ◽  
Colin C Collins ◽  
Martin Ester

Abstract:
Motivation: Historically, gene expression has been shown to be the most informative data for drug response prediction. Recent evidence suggests that integrating additional omics can improve prediction accuracy, which raises the question of how to integrate the additional omics. Regardless of the integration strategy, clinical utility and translatability are crucial. Thus, we reasoned that a multi-omics approach combined with clinical datasets would improve drug response prediction and clinical relevance.
Results: We propose MOLI, a multi-omics late integration method based on deep neural networks. MOLI takes somatic mutation, copy number aberration and gene expression data as input, and integrates them for drug response prediction. MOLI uses type-specific encoding sub-networks to learn features for each omics type, concatenates them into one representation and optimizes this representation via a combined cost function consisting of a triplet loss and a binary cross-entropy loss. The former makes the representations of responder samples more similar to each other and different from the non-responders, and the latter makes this representation predictive of the response values. We validate MOLI on in vitro and in vivo datasets for five chemotherapy agents and two targeted therapeutics. Compared to state-of-the-art single-omics and early integration multi-omics methods, MOLI achieves higher prediction accuracy in external validations. Moreover, a significant improvement in MOLI's performance is observed for targeted drugs when training on a pan-drug input, i.e. using all the drugs with the same target, compared to training only on drug-specific inputs. MOLI's high predictive power suggests it may have utility in precision oncology.
Availability and implementation: https://github.com/hosseinshn/MOLI.
Supplementary information: Supplementary data are available at Bioinformatics online.
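A compact numpy sketch of the structure described above: each omics type passes through its own encoder, the outputs are concatenated, and training would minimize a binary cross-entropy term plus a triplet term on the concatenated representation. The single hidden layer per encoder, the layer sizes, and the weighting lam are simplifying assumptions; the actual MOLI code is in the linked repository.

```python
import numpy as np

def encode(x, W, b):
    """Type-specific encoder, reduced here to a single ReLU layer."""
    return np.maximum(x @ W + b, 0.0)

def moli_like_forward(expr, mut, cna, params):
    """Encode each omics matrix separately and concatenate the representations."""
    h = np.concatenate([
        encode(expr, *params["expr"]),
        encode(mut, *params["mut"]),
        encode(cna, *params["cna"]),
    ], axis=1)
    logits = (h @ params["clf"][0] + params["clf"][1]).ravel()
    return h, logits

def combined_loss(h, logits, y, anchor, pos, neg, margin=1.0, lam=0.5):
    """Binary cross-entropy on the drug response plus a triplet term that pulls
    responders together and pushes them away from non-responders."""
    p = 1.0 / (1.0 + np.exp(-logits))
    bce = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    d_pos = np.sum((h[anchor] - h[pos]) ** 2, axis=1)
    d_neg = np.sum((h[anchor] - h[neg]) ** 2, axis=1)
    triplet = np.mean(np.maximum(d_pos - d_neg + margin, 0.0))
    return bce + lam * triplet
```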


2021 ◽  
Vol 37 (2) ◽  
pp. 123-143
Author(s):  
Tuan Minh Luu ◽  
Huong Thanh Le ◽  
Tan Minh Hoang

Deep neural networks have been applied successfully to extractive text summarization tasks when large training datasets are available. However, when the training dataset is not large enough, these models reveal certain limitations that affect the quality of the system's summary. In this paper, we propose an extractive summarization system based on a convolutional neural network and a fully connected network for sentence selection. The pretrained multilingual BERT model is used to generate embedding vectors from the input text. These vectors are combined with TF-IDF values to produce the input of the text summarization system. Redundant sentences are eliminated from the output summary by the Maximal Marginal Relevance method. Our system is evaluated on both English and Vietnamese using the CNN and Baomoi datasets, respectively. Experimental results show that our system achieves better results compared to existing works using the same datasets, confirming that our approach can be effectively applied to summarize both English and Vietnamese texts.
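The Maximal Marginal Relevance step can be sketched as below: sentences are greedily selected by trading off similarity to the document against similarity to the sentences already chosen. The sentence vectors here are random stand-ins for the BERT+TF-IDF representations, and lam and k are assumed hyperparameters.

```python
import numpy as np

def mmr_select(sent_vecs, doc_vec, k=3, lam=0.7):
    """Maximal Marginal Relevance: greedily pick sentences that are relevant to
    the document while penalising redundancy with already selected sentences."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    relevance = [cos(v, doc_vec) for v in sent_vecs]
    selected, candidates = [], list(range(len(sent_vecs)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max((cos(sent_vecs[i], sent_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected  # indices of the chosen sentences, in selection order

# Toy usage with random stand-in sentence/document vectors.
rng = np.random.default_rng(0)
sentences = rng.normal(size=(8, 16))
print(mmr_select(sentences, sentences.mean(axis=0), k=3))
```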


Author(s):  
Shuqin Gu ◽  
Yuexian Hou ◽  
Lipeng Zhang ◽  
Yazhou Zhang

Although Deep Neural Networks (DNNs) have achieved excellent performance in many tasks, improving the generalization capacity of DNNs remains a challenge. In this work, we propose a novel regularizer named the Ensemble-based Decorrelation Method (EDM), which is motivated by the idea of ensemble learning to improve the generalization capacity of DNNs. EDM can be applied to hidden layers in fully connected neural networks or convolutional neural networks. We treat each hidden layer as an ensemble of several base learners by dividing all the hidden units into several non-overlapping groups, and each group is viewed as a base learner. EDM encourages DNNs to learn more diverse representations by minimizing the covariance between all base learners during training. Experimental results on the MNIST and CIFAR datasets demonstrate that EDM can effectively reduce overfitting and improve the generalization capacity of DNNs.
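A minimal numpy sketch of the decorrelation idea, under the assumption that each "base learner" is summarised by its mean activation and that the penalty is the squared off-diagonal covariance between those summaries over a mini-batch; the exact per-group statistic and weighting in EDM may differ.

```python
import numpy as np

def edm_penalty(h, n_groups):
    """Split a hidden layer's activations into equal groups ("base learners"),
    summarise each group per example, and penalise the off-diagonal covariance
    between group summaries over the batch. h has shape (batch, units) and
    units must be divisible by n_groups."""
    batch, units = h.shape
    summary = h.reshape(batch, n_groups, units // n_groups).mean(axis=2)
    centred = summary - summary.mean(axis=0, keepdims=True)
    cov = centred.T @ centred / (batch - 1)
    off_diag = cov - np.diag(np.diag(cov))
    return np.sum(off_diag ** 2)

# During training this would be added to the task loss, e.g.
#   total_loss = cross_entropy + beta * edm_penalty(hidden_activations, n_groups=4)
rng = np.random.default_rng(0)
print(edm_penalty(rng.normal(size=(32, 128)), n_groups=4))
```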


2020 ◽  
Vol 8 (5) ◽  
pp. 3292-3296

Android is susceptible to malware attacks due to its open architecture, large user base, and open access to its source code. Mobile and Android malware attacks have been increasing year over year and are a common threat for every internet-accessible device; researchers report roughly a 50% increase in cyber-attacks targeting Android mobile phones over the past year. Malware authors are increasingly turning their attention to attacking smartphones with credential theft, surveillance, and malicious advertising. Security investigation of the Android mobile system has relied on analysis of binary samples or system calls, from which a behavior profile of malicious applications is generated and then analyzed, with the resulting report used to detect Android malware or threats via manually engineered features. To dispose of malicious applications on the mobile device, we propose an Android malware detection system using deep learning techniques to provide security for mobile and Android devices. Extensive experiments on a real-world dataset using an FNN (fully connected feed-forward deep neural network) and an autoencoder reach an accuracy of 95%. This paper explains the deep learning FNN and autoencoder approach for Android malware detection.
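Under the assumption that the pipeline extracts a fixed-length feature vector per app (for example, permission and API-call indicators), compresses it with the encoder half of a trained autoencoder, and classifies the code with a fully connected feed-forward network, a forward-pass sketch might look like the following; all shapes and names are illustrative and training is omitted.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def autoencoder_encode(x, W_enc, b_enc):
    """Encoder half of a trained autoencoder: compress raw app features
    (e.g. permission / API-call indicators) into a dense representation."""
    return relu(x @ W_enc + b_enc)

def fnn_classify(z, layers):
    """Fully connected feed-forward classifier on the encoded features;
    returns the probability that the app is malware."""
    h = z
    for W, b in layers[:-1]:
        h = relu(h @ W + b)
    W, b = layers[-1]
    return 1.0 / (1.0 + np.exp(-(h @ W + b)))

# Toy shapes: 200 raw features -> 32-dim code -> 16 hidden units -> 1 output.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(5, 200)).astype(float)
z = autoencoder_encode(x, rng.normal(size=(200, 32)) * 0.1, np.zeros(32))
layers = [(rng.normal(size=(32, 16)) * 0.1, np.zeros(16)),
          (rng.normal(size=(16, 1)) * 0.1, np.zeros(1))]
print(fnn_classify(z, layers).ravel())
```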


2020 ◽  
Author(s):  
Nitin Chandrachoodan ◽  
Basava Naga Girish Koneru ◽  
Vinita Vasudevan

Deep Neural Networks (DNNs) are increasingly being used in a variety of applications. However, DNNs have huge computational and memory requirements. One way to reduce these requirements is to sparsify DNNs by using smoothed LASSO (Least Absolute Shrinkage and Selection Operator) functions. In this paper, we show that for the same maximum error with respect to the LASSO function, the sparsity values obtained using various smoothed LASSO functions are similar. We also propose a layer-wise DNN pruning algorithm, where the layers are pruned based on their individually allocated accuracy-loss budgets, determined by estimates of the reduction in the number of multiply-accumulate operations (in convolutional layers) and weights (in fully connected layers). Further, structured LASSO variants in both convolutional and fully connected layers are explored within the smoothed LASSO framework and the tradeoffs involved are discussed. The efficacy of the proposed algorithm in enhancing sparsity within the allowed degradation in DNN accuracy, along with the results on structured LASSO variants, is shown on the MNIST, SVHN, CIFAR-10, and Imagenette datasets.
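As one concrete example of the idea (the paper compares several smoothed LASSO functions; this particular smoothing and the pruning threshold are assumptions), a differentiable surrogate for |w| can be added to each layer's loss, after which near-zero weights are pruned:

```python
import numpy as np

def smoothed_lasso(w, eps=1e-3):
    """A differentiable surrogate for the LASSO penalty sum(|w|):
    sqrt(w^2 + eps^2) - eps deviates from |w| by at most eps."""
    return np.sum(np.sqrt(w ** 2 + eps ** 2) - eps)

def prune(w, threshold=1e-2):
    """After training with the smoothed penalty, zero out near-zero weights."""
    w = w.copy()
    w[np.abs(w) < threshold] = 0.0
    return w

# Per-layer training objective (schematic): the regularisation strength lam
# would be set from the layer's allocated accuracy-loss budget.
#   layer_objective = task_loss + lam * smoothed_lasso(W_layer)
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)) * 0.05
print(smoothed_lasso(W), np.mean(prune(W) == 0.0))
```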

