A Method for Gradient Differentiable Network Architecture Search by Selecting and Clustering Candidate Operations

The current evolution of deep learning requires further optimization in terms of accuracy and time. From the perspective of new requirements, AutoML is an area that could provide possible solutions. AutoML has a neural architecture search (NAS) field. DARTS is a widely used approach in NAS and is based on gradient descent; however, it has some drawbacks. In this study, we attempted to overcome some of the drawbacks of DARTS by improving the accuracy and decreasing the search cost. The DARTS algorithm uses a mixed operation that combines all operations in the search space. The architecture parameter of each operation comprising a mixed operation is trained using gradient descent, and the operation with the largest architecture parameter is selected. The use of a mixed operation causes a problem called vote dispersion: similar operations share architecture parameters during gradient descent; thus, there are cases where the most important operation is disregarded. In this selection process, vote dispersion causes DARTS performance to degrade. To cope with this problem, we propose a new algorithm based on DARTS called DG-DARTS. Two search stages are introduced, and the clustering of operations is applied in DG-DARTS. In summary, DG-DARTS achieves an error rate of 2.51% on the CIFAR10 dataset, and its search cost is 0.2 GPU days because the search space of the second stage is reduced by half. The speed-up factor of DG-DARTS to DARTS is 6.82, which indicates that the search cost of DG-DARTS is only 13% that of DARTS.

Download Full-text

Designing deep neural networks for continual learning in an open world

10.21248/gups.62487 ◽

2021 ◽

Author(s):

◽

Martin Mundt

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Network Architecture ◽

Neural Network Training ◽

Neural Network Architecture ◽

Neural Architecture ◽

Network Training ◽

Classification Tasks ◽

Continual Learning

Deep learning with neural networks seems to have largely replaced traditional design of computer vision systems. Automated methods to learn a plethora of parameters are now used in favor of previously practiced selection of explicit mathematical operators for a specific task. The entailed promise is that practitioners no longer need to take care of every individual step, but rather focus on gathering big amounts of data for neural network training. As a consequence, both a shift in mindset towards a focus on big datasets, as well as a wave of conceivable applications based exclusively on deep learning can be observed. This PhD dissertation aims to uncover some of the only implicitly mentioned or overlooked deep learning aspects, highlight unmentioned assumptions, and finally introduce methods to address respective immediate weaknesses. In the author’s humble opinion, these prevalent shortcomings can be tied to the fact that the involved steps in the machine learning workflow are frequently decoupled. Success is predominantly measured based on accuracy measures designed for evaluation with static benchmark test sets. Individual machine learning workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in context of a particular application. Correspondingly, in this dissertation, three key challenges have been identified: 1. Choice and flexibility of a neural network architecture. 2. Identification and rejection of unseen unknown data to avoid false predictions. 3. Continual learning without forgetting of already learned information. These latter challenges have already been crucial topics in older literature, alas, seem to require a renaissance in modern deep learning literature. Initially, it may appear that they pose independent research questions, however, the thesis posits that the aspects are intertwined and require a joint perspective in machine learning based systems. In summary, the essential question is thus how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context, which ones originate from potential other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. Thus, the central emphasis of this dissertation is to build on top of existing deep learning strengths, yet also acknowledge mentioned weaknesses, in an effort to establish a deeper understanding of interdependencies and synergies towards the development of unified solution mechanisms. For this purpose, the main portion of the thesis is in cumulative form. The respective publications can be grouped according to the three challenges outlined above. Correspondingly, chapter 1 is focused on choice and extendability of neural network architectures, analyzed in context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and is first contrasted with static architectures found in the literature. The importance of neural architecture design is then further showcased on a real-world application of defect detection in concrete bridges. Chapter 2 is comprised of the complementary ensuing questions of how to identify unknown concepts and subsequently incorporate them into continual learning. A joint central mechanism to distinguish unseen concepts from what is known in classification tasks, while enabling consecutive training without forgetting or revisiting older classes, is proposed. Once more, the role of the chosen neural network architecture is quantitatively reassessed. Finally, chapter 3 culminates in an overarching view, where developed parts are connected. Here, an extensive survey further serves the purpose to embed the gained insights in the broader literature landscape and emphasizes the importance of a common frame of thought. The ultimately presented approach thus reflects the overall thesis’ contribution to advance neural network based machine learning towards a unified solution that ties together choice of neural architecture with the ability to learn continually and the capability to automatically separate known from unknown data.

Download Full-text

TextNAS: A Neural Architecture Search Space Tailored for Text Representation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6462 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9242-9249

Author(s):

Yujing Wang ◽

Yaming Yang ◽

Yiren Chen ◽

Jing Bai ◽

Ce Zhang ◽

...

Keyword(s):

Text Classification ◽

Network Architecture ◽

State Of The Art ◽

Search Space ◽

Search Algorithms ◽

Text Representation ◽

Neural Architecture ◽

Good Potential ◽

Automatic Search ◽

Public Datasets

Learning text representation is crucial for text classification and other language related tasks. There are a diverse set of text representation networks in the literature, and how to find the optimal one is a non-trivial problem. Recently, the emerging Neural Architecture Search (NAS) techniques have demonstrated good potential to solve the problem. Nevertheless, most of the existing works of NAS focus on the search algorithms and pay little attention to the search space. In this paper, we argue that the search space is also an important human prior to the success of NAS in different applications. Thus, we propose a novel search space tailored for text representation. Through automatic search, the discovered network architecture outperforms state-of-the-art models on various public datasets on text classification and natural language inference tasks. Furthermore, some of the design principles found in the automatic network agree well with human intuition.

Download Full-text

AMBIENT: Accelerated Convolutional Neural Network Architecture Search for Regulatory Genomics

10.1101/2021.02.25.432960 ◽

2021 ◽

Author(s):

Zijun Zhang ◽

Evan M. Cofer ◽

Olga G. Troyanskaya

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Network Architecture ◽

Environmental Issue ◽

Biological Sequences ◽

Neural Network Architecture ◽

Computing Power ◽

Neural Architecture

Convolutional neural networks (CNN) have become a standard approach for modeling genomic sequences. CNNs can be effectively built by Neural Architecture Search (NAS) by trading computing power for accurate neural architectures. Yet, the consumption of immense computing power is a major practical, financial, and environmental issue for deep learning. Here, we present a novel NAS framework, AMBIENT, that generates highly accurate CNN architectures for biological sequences of diverse functions, while substantially reducing the computing cost of conventional NAS.

Download Full-text

Generative Adversarial Neural Architecture Search

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/307 ◽

2021 ◽

Author(s):

Seyed Saeed Changiz Rezaei ◽

Fred X. Han ◽

Di Niu ◽

Mohammad Salameh ◽

Keith Mills ◽

...

Keyword(s):

Deep Learning ◽

Reinforcement Learning ◽

Importance Sampling ◽

Ad Hoc ◽

Search Space ◽

Learning Approach ◽

Adversarial Learning ◽

Search Spaces ◽

Neural Architecture ◽

Empirical Success

Despite the empirical success of neural architecture search (NAS) in deep learning applications, the optimality, reproducibility and cost of NAS schemes remain hard to assess. In this paper, we propose Generative Adversarial NAS (GA-NAS) with theoretically provable convergence guarantees, promoting stability and reproducibility in neural architecture search. Inspired by importance sampling, GA-NAS iteratively fits a generator to previously discovered top architectures, thus increasingly focusing on important parts of a large search space. Furthermore, we propose an efficient adversarial learning approach, where the generator is trained by reinforcement learning based on rewards provided by a discriminator, thus being able to explore the search space without evaluating a large number of architectures. Extensive experiments show that GA-NAS beats the best published results under several cases on three public NAS benchmarks. In the meantime, GA-NAS can handle ad-hoc search constraints and search spaces. We show that GA-NAS can be used to improve already optimized baselines found by other NAS methods, including EfficientNet and ProxylessNAS, in terms of ImageNet accuracy or the number of parameters, in their original search space.

Download Full-text

Genetic Algorithm Based Deep Learning Neural Network Structure and Hyperparameter Optimization

Applied Sciences ◽

10.3390/app11020744 ◽

2021 ◽

Vol 11 (2) ◽

pp. 744

Author(s):

Sanghyeop Lee ◽

Junyeob Kim ◽

Hyeon Kang ◽

Do-Young Kang ◽

Jangsik Park

Keyword(s):

Alzheimer’S Disease ◽

Genetic Algorithm ◽

Alzheimer's Disease ◽

Deep Learning ◽

Network Structure ◽

Network Architecture ◽

High Performance ◽

Search Space ◽

Disease Diagnosis ◽

Research Area

Alzheimer’s disease is one of the major challenges of population ageing, and diagnosis and prediction of the disease through various biomarkers is the key. While the application of deep learning as imaging technologies has recently expanded across the medical industry, empirical design of these technologies is very difficult. The main reason for this problem is that the performance of the Convolutional Neural Networks (CNN) differ greatly depending on the statistical distribution of the input dataset. Different hyperparameters also greatly affect the convergence of the CNN models. With this amount of information, selecting appropriate parameters for the network structure has became a large research area. Genetic Algorithm (GA), is a very popular technique to automatically select a high-performance network architecture. In this paper, we show the possibility of optimising the network architecture using GA, where its search space includes both network structure configuration and hyperparameters. To verify the performance of our Algorithm, we used an amyloid brain image dataset that is used for Alzheimer’s disease diagnosis. As a result, our algorithm outperforms Genetic CNN by 11.73% on a given classification task.

Download Full-text

MergeNAS: Merge Operations into One for Differentiable Architecture Search

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/424 ◽

2020 ◽

Author(s):

Xiaoxing Wang ◽

Chao Xue ◽

Junchi Yan ◽

Xiaokang Yang ◽

Yonggang Hu ◽

...

Keyword(s):

Mathematical Formulation ◽

Search Space ◽

Search Method ◽

Test Accuracy ◽

Search Cost ◽

Neural Architecture ◽

Different Types ◽

Search Approach ◽

Memory Utilization

Differentiable architecture search (DARTS) has been a promising one-shot architecture search approach for its mathematical formulation and competitive results. However, besides its caused high memory utilization and a large computation requirement, many research works have shown that DARTS also often suffers notable over-fitting and thus does not work robustly for some new tasks. In this paper, we propose a one-shot neural architecture search method referred to as MergeNAS by merging different types of operations e.g. convolutions into one operation. This merge-based approach not only reduces the search cost (about half a GPU day), but also alleviates over-fitting by reducing the redundant parameters. Extensive experiments on different search space and various datasets have been conducted to verify our approach, showing that MergeNAS can converge to a stable architecture and achieve better performance with fewer parameters and search cost. For test accuracy and its stability, MergeNAS outperforms all NAS baseline methods implemented on NAS-Bench-201, including DARTS, ENAS, RS, BOHB, GDAS and hand-crafted ResNet.

Download Full-text

One-Shot Neural Architecture Search by Dynamically Pruning Supernet in Hierarchical Order

International Journal of Neural Systems ◽

10.1142/s0129065721500295 ◽

2021 ◽

pp. 2150029

Author(s):

Jianwei Zhang ◽

Dong Li ◽

Lituan Wang ◽

Lei Zhang

Keyword(s):

State Of The Art ◽

Search Space ◽

Experimental Results ◽

Research Interest ◽

Search Process ◽

Search Efficiency ◽

Neural Architecture ◽

Speed Up ◽

The One

Neural Architecture Search (NAS), which aims at automatically designing neural architectures, recently draw a growing research interest. Different from conventional NAS methods, in which a large number of neural architectures need to be trained for evaluation, the one-shot NAS methods only have to train one supernet which synthesizes all the possible candidate architectures. As a result, the search efficiency could be significantly improved by sharing the supernet’s weights during the candidate architectures’ evaluation. This strategy could greatly speed up the search process but suffer a challenge that the evaluation based on sharing weights is not predictive enough. Recently, pruning the supernet during the search has been proven to be an efficient way to alleviate this problem. However, the pruning direction in complex-structured search space remains unexplored. In this paper, we revisited the role of path dropout strategy, which drops the neural operations instead of the neurons, in supernet training, and several interesting characters of the supernet trained with dropout are found. Based on the observations, a Hierarchically-Ordered Pruning Neural Architecture Search (HOPNAS) algorithm is proposed by dynamically pruning the supernet with a proper pruning direction. Experimental results indicate that our method is competitive with state-of-the-art approaches on CIFAR10 and ImageNet.

Download Full-text

Cross-Domain Scene Classification Based on a Spatial Generalized Neural Architecture Search for High Spatial Resolution Remote Sensing Images

Remote Sensing ◽

10.3390/rs13173460 ◽

2021 ◽

Vol 13 (17) ◽

pp. 3460

Author(s):

Yuling Chen ◽

Wentao Teng ◽

Zhen Li ◽

Qiqi Zhu ◽

Qingfeng Guan

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

Network Design ◽

Spatial Resolution ◽

High Spatial Resolution ◽

Search Space ◽

Scene Classification ◽

Neural Architecture ◽

Cross Domain

By labelling high spatial resolution (HSR) images with specific semantic classes according to geographical properties, scene classification has been proven to be an effective method for HSR remote sensing image semantic interpretation. Deep learning is widely applied in HSR remote sensing scene classification. Most of the scene classification methods based on deep learning assume that the training datasets and the test datasets come from the same datasets or obey similar feature distributions. However, in practical application scenarios, it is difficult to guarantee this assumption. For new datasets, it is time-consuming and labor-intensive to repeat data annotation and network design. The neural architecture search (NAS) can automate the process of redesigning the baseline network. However, traditional NAS lacks the generalization ability to different settings and tasks. In this paper, a novel neural network search architecture framework—the spatial generalization neural architecture search (SGNAS) framework—is proposed. This model applies the NAS of spatial generalization to cross-domain scene classification of HSR images to bridge the domain gap. The proposed SGNAS can automatically search the architecture suitable for HSR image scene classification and possesses network design principles similar to the manually designed networks, which can make the obtained network migrate to different tasks. To obtain a simple and low-dimensional search space, the traditional NAS search space was optimized and the human-the-loop method was used. To extend the optimized search space to different tasks, the search space was generalized. The experimental results demonstrate that the network searched by the SGNAS framework with good generalization ability displays its effectiveness for cross-domain scene classification of HSR images, both in accuracy and time efficiency.

Download Full-text

Graph Neural Architecture Search

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/195 ◽

2020 ◽

Cited By ~ 2

Author(s):

Yang Gao ◽

Hong Yang ◽

Peng Zhang ◽

Chuan Zhou ◽

Yue Hu

Keyword(s):

Neural Networks ◽

Network Architecture ◽

Domain Knowledge ◽

Search Space ◽

Recurrent Network ◽

Validation Data ◽

Data Set ◽

Neural Architecture ◽

Real World Datasets ◽

Graph Neural Networks

Graph neural networks (GNNs) emerged recently as a powerful tool for analyzing non-Euclidean data such as social network data. Despite their success, the design of graph neural networks requires heavy manual work and domain knowledge. In this paper, we present a graph neural architecture search method (GraphNAS) that enables automatic design of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS uses a recurrent network to generate variable-length strings that describe the architectures of graph neural networks, and trains the recurrent network with policy gradient to maximize the expected accuracy of the generated architectures on a validation data set. Furthermore, to improve the search efficiency of GraphNAS on big networks, GraphNAS restricts the search space from an entire architecture space to a sequential concatenation of the best search results built on each single architecture layer. Experiments on real-world datasets demonstrate that GraphNAS can design a novel network architecture that rivals the best human-invented architecture in terms of validation set accuracy. Moreover, in a transfer learning task we observe that graph neural architectures designed by GraphNAS, when transferred to new datasets, still gain improvement in terms of prediction accuracy.

Download Full-text

Drug-Drug Interaction Predicting by Neural Network Using Integrated Similarity

Scientific Reports ◽

10.1038/s41598-019-50121-3 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 10

Author(s):

Narjes Rohani ◽

Changiz Eslahchi

Keyword(s):

Neural Network ◽

Drug Interaction ◽

Side Effect ◽

Network Architecture ◽

Selection Process ◽

Superior Performance ◽

Multiple Drug ◽

Interaction Prediction ◽

Benchmark Datasets ◽

Drug Drug Interaction

Abstract Drug-Drug Interaction (DDI) prediction is one of the most critical issues in drug development and health. Proposing appropriate computational methods for predicting unknown DDI with high precision is challenging. We proposed "NDD: Neural network-based method for drug-drug interaction prediction" for predicting unknown DDIs using various information about drugs. Multiple drug similarities based on drug substructure, target, side effect, off-label side effect, pathway, transporter, and indication data are calculated. At first, NDD uses a heuristic similarity selection process and then integrates the selected similarities with a nonlinear similarity fusion method to achieve high-level features. Afterward, it uses a neural network for interaction prediction. The similarity selection and similarity integration parts of NDD have been proposed in previous studies of other problems. Our novelty is to combine these parts with new neural network architecture and apply these approaches in the context of DDI prediction. We compared NDD with six machine learning classifiers and six state-of-the-art graph-based methods on three benchmark datasets. NDD achieved superior performance in cross-validation with AUPR ranging from 0.830 to 0.947, AUC from 0.954 to 0.994 and F-measure from 0.772 to 0.902. Moreover, cumulative evidence in case studies on numerous drug pairs, further confirm the ability of NDD to predict unknown DDIs. The evaluations corroborate that NDD is an efficient method for predicting unknown DDIs. The data and implementation of NDD are available at https://github.com/nrohani/NDD.

Download Full-text