Black Box Search Space Profiling for Accelerator-Aware Neural Architecture Search

Author(s):  
Shulin Zeng ◽  
Hanbo Sun ◽  
Yu Xing ◽  
Xuefei Ning ◽  
Yi Shan ◽  
...  

2021 ◽  
Vol 54 (4) ◽  
pp. 1-34
Author(s):  
Pengzhen Ren ◽  
Yun Xiao ◽  
Xiaojun Chang ◽  
Po-yao Huang ◽  
Zhihui Li ◽  
...  

Deep learning has made substantial breakthroughs in many fields due to its powerful automatic representation capabilities. It has been proven that neural architecture design is crucial to the feature representation of data and to the final performance. However, the design of a neural architecture relies heavily on researchers' prior knowledge and experience, and because of the limits of that knowledge it is difficult to break out of the original thinking paradigm and design an optimal model. An intuitive idea, therefore, is to reduce human intervention as much as possible and let an algorithm design the neural architecture automatically. Neural Architecture Search (NAS) is just such a revolutionary approach, and the related research is rich and varied, so a comprehensive and systematic survey of NAS is essential. Previous surveys have mainly classified existing work according to the key components of NAS: search space, search strategy, and evaluation strategy. While this classification is intuitive, it makes it difficult for readers to grasp the challenges involved and the landmark work that addressed them. In this survey we therefore take a new perspective: we begin with an overview of the characteristics of the earliest NAS algorithms, summarize the problems in these early algorithms, and then present the solutions proposed by subsequent work. In addition, we provide a detailed and comprehensive analysis, comparison, and summary of these works. Finally, we outline some possible directions for future research.
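
To make the three components concrete, the following is a minimal, self-contained sketch of how a search space, a search strategy, and an evaluation strategy interact in a NAS loop. The toy search space, the random-search strategy, and the synthetic scoring function are illustrative assumptions for this sketch, not the survey's method or any specific published system.

```python
import random

# Toy search space: each architecture is a choice of depth, width, and operation type.
SEARCH_SPACE = {
    "depth": [2, 4, 8],
    "width": [16, 32, 64],
    "op": ["conv3x3", "conv5x5", "sep_conv"],
}

def sample_architecture(space):
    """Search strategy (here: plain random search over the space)."""
    return {key: random.choice(options) for key, options in space.items()}

def evaluate(arch):
    """Evaluation strategy stand-in: a synthetic score instead of training a network."""
    score = 0.1 * arch["depth"] + 0.01 * arch["width"]
    score += {"conv3x3": 0.5, "conv5x5": 0.4, "sep_conv": 0.6}[arch["op"]]
    return score + random.gauss(0.0, 0.05)  # noisy proxy for validation accuracy

best_arch, best_score = None, float("-inf")
for _ in range(50):  # the NAS loop: sample, evaluate, keep the best
    candidate = sample_architecture(SEARCH_SPACE)
    score = evaluate(candidate)
    if score > best_score:
        best_arch, best_score = candidate, score

print(best_arch, round(best_score, 3))
```

In a real NAS system the evaluation step trains and validates each candidate network, which is what makes the search expensive and motivates the faster search and evaluation strategies discussed in such surveys.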


2020 ◽  
Vol 10 (11) ◽  
pp. 3712
Author(s):  
Dongjing Shan ◽  
Xiongwei Zhang ◽  
Wenhua Shi ◽  
Li Li

In sequence learning with neural networks, a key problem is how to capture long-term dependencies and alleviate the vanishing-gradient phenomenon. To address this problem, we proposed a neural network with random connections generated via a neural architecture search scheme. First, a dense network was designed and trained to construct a search space; another network was then generated by random sampling in that space, whose skip connections can transmit information directly across multiple time steps and thus capture long-term dependencies more efficiently. Moreover, we devised a novel cell structure that requires less memory and computation than long short-term memory (LSTM) cells, and we applied a special initialization scheme to the cell parameters that permits unhindered gradient propagation along the time axis at the beginning of training. In the experiments, we evaluated four sequential tasks: adding, copying, frequency discrimination, and image classification, and we compared against several state-of-the-art methods. The experimental results demonstrate that the proposed model achieves the best performance.
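
The following is a minimal NumPy sketch of the general idea of a recurrent cell with randomly sampled skip connections and a near-identity recurrent initialization. The cell form, the skip offsets, and the initialization scales are assumptions made for illustration and do not reproduce the paper's searched architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

T, d_in, d_h = 20, 8, 16  # sequence length, input size, hidden size
skip_offsets = sorted(int(s) for s in rng.choice(np.arange(2, 10), size=3, replace=False))

# Parameters: the recurrent matrix starts near the identity so that gradients
# can propagate along the time axis without attenuation early in training.
W_x = rng.normal(0.0, 0.1, (d_h, d_in))
W_h = np.eye(d_h) + rng.normal(0.0, 0.01, (d_h, d_h))
W_skip = {s: rng.normal(0.0, 0.01, (d_h, d_h)) for s in skip_offsets}

def forward(x_seq):
    """Run the cell over a sequence; skip connections read hidden states
    several time steps back, giving a direct path for long-term information."""
    hs = [np.zeros(d_h)]
    for t in range(T):
        pre = W_x @ x_seq[t] + W_h @ hs[-1]
        for s, W in W_skip.items():
            if len(hs) > s:
                pre += W @ hs[-1 - s]  # direct connection over multiple time steps
        hs.append(np.tanh(pre))
    return hs[-1]

x = rng.normal(size=(T, d_in))
print(skip_offsets, forward(x)[:4])
```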


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Keith G. Mills ◽  
Mohammad Salameh ◽  
Di Niu ◽  
Fred X. Han ◽  
Seyed Saeed Changiz Rezaei ◽  
...  

Author(s):  
Liqun Wang ◽  
Songqing Shan ◽  
G. Gary Wang

The presence of black-box functions in engineering design, which are usually computation-intensive, demands efficient global optimization methods. This work proposes a new global optimization method for black-box functions based on a novel mode-pursuing sampling (MPS) method, which systematically generates more sample points in the neighborhood of the function mode while statistically covering the entire search space. Quadratic regression is performed to detect the region containing the global optimum. The sampling and detection process iterates until the global optimum is obtained. Through intensive testing, the method is found to be effective, efficient, robust, and applicable to both continuous and discontinuous functions. It supports simultaneous computation and applies to both unconstrained and constrained optimization problems. Because it does not call any existing global optimization tool, it can also be used as a standalone global optimization method for inexpensive problems. Limitations of the method are also identified and discussed.
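
The following is a rough one-dimensional sketch, under simplifying assumptions, of the two ingredients described above: sampling biased toward the function mode while still covering the space, and a quadratic fit over the best points to suggest the region of the optimum. The surrogate construction and sample sizes are illustrative choices, not the authors' MPS algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    """Black-box objective; a cheap stand-in is used here so the sketch runs."""
    return 0.05 * x**2 + np.sin(x)

lo, hi = -10.0, 10.0
X = rng.uniform(lo, hi, 20)  # initial space-filling samples
Y = f(X)
x_star = X[np.argmin(Y)]

for _ in range(15):
    # Mode-pursuing step: draw many uniform candidates, then keep a few with
    # probability weighted toward low surrogate values, so new samples
    # concentrate near the current mode while still covering the whole space.
    cand = rng.uniform(lo, hi, 500)
    order = np.argsort(X)
    g = np.interp(cand, X[order], Y[order])  # crude piecewise-linear surrogate of f
    w = (g.max() - g) + 1e-9
    picks = rng.choice(cand, size=5, replace=False, p=w / w.sum())
    X = np.concatenate([X, picks])
    Y = np.concatenate([Y, f(picks)])

    # Detection step: fit a quadratic to the current best points; its minimum
    # suggests the region that contains the global optimum.
    best = np.argsort(Y)[:10]
    a, b, c = np.polyfit(X[best], Y[best], 2)
    if a > 0:
        x_star = -b / (2 * a)

print("best sample:", X[np.argmin(Y)], "quadratic estimate:", x_star)
```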


2020 ◽  
Vol 34 (05) ◽  
pp. 9242-9249
Author(s):  
Yujing Wang ◽  
Yaming Yang ◽  
Yiren Chen ◽  
Jing Bai ◽  
Ce Zhang ◽  
...  

Learning text representations is crucial for text classification and other language-related tasks. There is a diverse set of text representation networks in the literature, and finding the optimal one is a non-trivial problem. Recently, emerging Neural Architecture Search (NAS) techniques have shown good potential to solve this problem. Nevertheless, most existing NAS work focuses on the search algorithm and pays little attention to the search space. In this paper, we argue that the search space is also an important human prior for the success of NAS in different applications, and we propose a novel search space tailored for text representation. Through automatic search, the discovered network architecture outperforms state-of-the-art models on various public datasets for text classification and natural language inference tasks. Furthermore, some of the design principles found in the automatically discovered network agree well with human intuition.
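
As an illustration of what a text-tailored search space can look like, the sketch below defines a small set of candidate operations (convolutions, recurrent layers, self-attention, pooling) and samples a layered architecture with skip inputs. The operation list and layout are assumptions made for illustration and are not the search space proposed in the paper.

```python
import random

random.seed(0)

# A toy text-representation search space: each layer picks one operation and
# one input, which may be any earlier layer or the embedding layer (-1).
CANDIDATE_OPS = [
    "conv1d_k3", "conv1d_k5", "conv1d_k7",  # convolutional ops
    "gru", "lstm",                          # recurrent ops
    "self_attention",                       # attention op
    "max_pool_k3", "avg_pool_k3",           # pooling ops
]

def sample_text_architecture(num_layers=4):
    """Search strategy stand-in: uniform random sampling from the space."""
    layers = []
    for i in range(num_layers):
        layers.append({
            "op": random.choice(CANDIDATE_OPS),
            "input": random.randrange(-1, i),  # -1 denotes the embedding layer
        })
    return layers

for layer in sample_text_architecture():
    print(layer)
```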


2015 ◽  
Vol 23 (4) ◽  
pp. 641-670 ◽  
Author(s):  
Benjamin Doerr ◽  
Carola Doerr ◽  
Timo Kötzing

We analyze the unbiased black-box complexities of jump functions with small, medium, and large sizes of the fitness plateau surrounding the optimal solution. Among other results, we show that when the jump size is (1/2 − ε)n, that is, when only a small constant fraction of the fitness values is visible, then the unbiased black-box complexities for arities 3 and higher are of the same order as those for the simple OneMax function. Even for the extreme jump function, in which all but the two fitness values n/2 and n are blanked out, polynomial-time mutation-based (i.e., unary unbiased) black-box optimization algorithms exist. This is quite surprising, given that for the extreme jump function almost the whole search space (all but a Θ(n^(−1/2)) fraction) is a plateau of constant fitness. To prove these results, we introduce new tools for the analysis of unbiased black-box complexities, for example, selecting the new parent individual not only by comparing the fitnesses of the competing search points but also by taking into account the (empirical) expected fitnesses of their offspring.
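
A short calculation behind the plateau-fraction claim, assuming the extreme jump function reveals the fitness only of strings with exactly n/2 or n one-bits (n even): by Stirling's approximation, the visible points form a vanishing fraction of {0,1}^n.

```latex
% Fraction of non-plateau points for the extreme jump function on \{0,1\}^n (n even),
% assuming only strings with exactly n/2 or n one-bits have a visible fitness value:
\[
  \frac{\binom{n}{n/2} + 1}{2^n}
  \;=\; \Theta\!\left(\frac{2^n/\sqrt{n}}{2^n}\right)
  \;=\; \Theta\!\left(n^{-1/2}\right),
\]
% using Stirling's approximation \binom{n}{n/2} = \Theta\!\left(2^n/\sqrt{n}\right).
% Hence all but a \Theta(n^{-1/2}) fraction of the search space lies on the
% constant-fitness plateau, as stated above.
```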


2021 ◽  
Vol 11 (23) ◽  
pp. 11436
Author(s):  
Ha Yoon Song

The current evolution of deep learning requires further optimization in terms of accuracy and time. AutoML, and in particular its neural architecture search (NAS) subfield, is an area that could provide solutions to these requirements. DARTS is a widely used gradient-descent-based approach in NAS; however, it has some drawbacks. In this study, we attempted to overcome some of these drawbacks by improving accuracy and decreasing the search cost. The DARTS algorithm uses a mixed operation that combines all operations in the search space. The architecture parameter of each operation composing a mixed operation is trained by gradient descent, and the operation with the largest architecture parameter is selected. The use of a mixed operation causes a problem called vote dispersion: similar operations split the architecture weight among themselves during gradient descent, so in some cases the most important operation is disregarded during selection, which degrades DARTS performance. To cope with this problem, we propose a new DARTS-based algorithm called DG-DARTS, which introduces two search stages and applies clustering of operations. DG-DARTS achieves an error rate of 2.51% on the CIFAR10 dataset with a search cost of 0.2 GPU days, because the search space of the second stage is reduced by half. The speed-up factor of DG-DARTS over DARTS is 6.82, meaning the search cost of DG-DARTS is only 13% of that of DARTS.
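
The sketch below illustrates the mixed operation referred to above: a softmax over learnable architecture parameters weights every candidate operation on an edge, and after search the operation with the largest parameter is kept. It shows plain DARTS-style relaxation only; the candidate operation list is an illustrative assumption, and DG-DARTS' operation clustering and two-stage search are not implemented here.

```python
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """DARTS-style mixed operation: a softmax over learnable architecture
    parameters weights every candidate operation on the same edge."""

    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # conv 3x3
            nn.Conv2d(channels, channels, 5, padding=2),  # conv 5x5
            nn.MaxPool2d(3, stride=1, padding=1),         # max pooling
            nn.Identity(),                                # skip connection
        ])
        # One architecture parameter per candidate op, trained by gradient descent.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = torch.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

edge = MixedOp(channels=8)
x = torch.randn(2, 8, 16, 16)
print(edge(x).shape)  # torch.Size([2, 8, 16, 16])

# After search, plain DARTS keeps only the op with the largest alpha. Similar
# operations (e.g., the two convolutions here) can split the weight between
# them, which is the "vote dispersion" effect that DG-DARTS counteracts by
# clustering operations and searching in two stages.
print(edge.alpha.argmax().item())
```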

