NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization

Deep learning with neural networks seems to have largely replaced traditional design of computer vision systems. Automated methods to learn a plethora of parameters are now used in favor of previously practiced selection of explicit mathematical operators for a specific task. The entailed promise is that practitioners no longer need to take care of every individual step, but rather focus on gathering big amounts of data for neural network training. As a consequence, both a shift in mindset towards a focus on big datasets, as well as a wave of conceivable applications based exclusively on deep learning can be observed. This PhD dissertation aims to uncover some of the only implicitly mentioned or overlooked deep learning aspects, highlight unmentioned assumptions, and finally introduce methods to address respective immediate weaknesses. In the author’s humble opinion, these prevalent shortcomings can be tied to the fact that the involved steps in the machine learning workflow are frequently decoupled. Success is predominantly measured based on accuracy measures designed for evaluation with static benchmark test sets. Individual machine learning workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in context of a particular application. Correspondingly, in this dissertation, three key challenges have been identified: 1. Choice and flexibility of a neural network architecture. 2. Identification and rejection of unseen unknown data to avoid false predictions. 3. Continual learning without forgetting of already learned information. These latter challenges have already been crucial topics in older literature, alas, seem to require a renaissance in modern deep learning literature. Initially, it may appear that they pose independent research questions, however, the thesis posits that the aspects are intertwined and require a joint perspective in machine learning based systems. In summary, the essential question is thus how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context, which ones originate from potential other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. Thus, the central emphasis of this dissertation is to build on top of existing deep learning strengths, yet also acknowledge mentioned weaknesses, in an effort to establish a deeper understanding of interdependencies and synergies towards the development of unified solution mechanisms. For this purpose, the main portion of the thesis is in cumulative form. The respective publications can be grouped according to the three challenges outlined above. Correspondingly, chapter 1 is focused on choice and extendability of neural network architectures, analyzed in context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and is first contrasted with static architectures found in the literature. The importance of neural architecture design is then further showcased on a real-world application of defect detection in concrete bridges. Chapter 2 is comprised of the complementary ensuing questions of how to identify unknown concepts and subsequently incorporate them into continual learning. A joint central mechanism to distinguish unseen concepts from what is known in classification tasks, while enabling consecutive training without forgetting or revisiting older classes, is proposed. Once more, the role of the chosen neural network architecture is quantitatively reassessed. Finally, chapter 3 culminates in an overarching view, where developed parts are connected. Here, an extensive survey further serves the purpose to embed the gained insights in the broader literature landscape and emphasizes the importance of a common frame of thought. The ultimately presented approach thus reflects the overall thesis’ contribution to advance neural network based machine learning towards a unified solution that ties together choice of neural architecture with the ability to learn continually and the capability to automatically separate known from unknown data.

Download Full-text

An adaptive neural architecture optimization model for retinal disorder diagnosis on 3D medical images

Applied Soft Computing ◽

10.1016/j.asoc.2021.107686 ◽

2021 ◽

pp. 107686

Author(s):

Haifeng Wang ◽

Daehan Won ◽

Sang Won Yoon

Keyword(s):

Optimization Model ◽

Medical Images ◽

Neural Architecture ◽

Architecture Optimization ◽

Retinal Disorder

Download Full-text

An Effective Neural Architecture Optimization Algorithm for CNN based on Search Economics

10.1145/3491396.3506505 ◽

2021 ◽

Author(s):

Yi-Ping Chen ◽

Chun-Wei Tsai

Keyword(s):

Optimization Algorithm ◽

Neural Architecture ◽

Architecture Optimization

Download Full-text

Design of silicon brains in the nano-CMOS era: Spiking neurons, learning synapses and neural architecture optimization

Neural Networks ◽

10.1016/j.neunet.2013.05.011 ◽

2013 ◽

Vol 45 ◽

pp. 4-26 ◽

Cited By ~ 65

Author(s):

Andrew S. Cassidy ◽

Julius Georgiou ◽

Andreas G. Andreou

Keyword(s):

Spiking Neurons ◽

Neural Architecture ◽

Architecture Optimization

Download Full-text

Network training and architecture optimization by a recursive approach and a modified genetic algorithm

Journal of Chemometrics ◽

10.1002/(sici)1099-128x(199605)10:3<253::aid-cem420>3.0.co;2-z ◽

1996 ◽

Vol 10 (3) ◽

pp. 253-267 ◽

Cited By ~ 19

Author(s):

Jian-Hui Jiang ◽

Ji-Hong Wang ◽

Xin-Hua Song ◽

Ru-Qin Yu

Keyword(s):

Genetic Algorithm ◽

Network Training ◽

Modified Genetic Algorithm ◽

Architecture Optimization

Download Full-text

A Compression-Compilation Framework for On-mobile Real-time BERT Applications

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/712 ◽

2021 ◽

Author(s):

Wei Niu ◽

Zhenglun Kong ◽

Geng Yuan ◽

Weiwen Jiang ◽

Jiexiong Guan ◽

...

Keyword(s):

Real Time ◽

Mobile Devices ◽

Language Processing ◽

Question Answering ◽

Optimization Method ◽

High Accuracy ◽

Text Generation ◽

Learning Models ◽

Neural Architecture ◽

Architecture Optimization

Transformer-based deep learning models have increasingly demonstrated high accuracy on many natural language processing (NLP) tasks. In this paper, we propose a compression-compilation co-design framework that can guarantee the identified model meets both resource and real-time specifications of mobile devices. Our framework applies a compiler-aware neural architecture optimization method (CANAO), which can generate the optimal compressed model that balances both accuracy and latency. We are able to achieve up to 7.8x speedup compared with TensorFlow-Lite with only minor accuracy loss. We present two types of BERT applications on mobile devices: Question Answering (QA) and Text Generation. Both can be executed in real-time with latency as low as 45ms. Videos for demonstrating the framework can be found on https://www.youtube.com/watch?v=_WIRvK_2PZI

Download Full-text

Abstract Semantic Associative Network Training: A Replication and Update of an Abstract Word Retrieval Therapy Program

American Journal of Speech-Language Pathology ◽

10.1044/2020_ajslp-19-00066 ◽

2020 ◽

Vol 29 (3) ◽

pp. 1574-1595

Author(s):

Chaleece W. Sandberg ◽

Teresa Gray

Keyword(s):

Concrete Word ◽

Abstract Word ◽

Distinct Pattern ◽

Word Retrieval ◽

Effect Sizes ◽

Associative Network ◽

Direct Training ◽

Network Training ◽

Abstract Words ◽

Concrete Words

Purpose We report on a study that replicates previous treatment studies using Abstract Semantic Associative Network Training (AbSANT), which was developed to help persons with aphasia improve their ability to retrieve abstract words, as well as thematically related concrete words. We hypothesized that previous results would be replicated; that is, when abstract words are trained using this protocol, improvement would be observed for both abstract and concrete words in the same context-category, but when concrete words are trained, no improvement for abstract words would be observed. We then frame the results of this study with the results of previous studies that used AbSANT to provide better evidence for the utility of this therapeutic technique. We also discuss proposed mechanisms of AbSANT. Method Four persons with aphasia completed one phase of concrete word training and one phase of abstract word training using the AbSANT protocol. Effect sizes were calculated for each word type for each phase. Effect sizes for this study are compared with the effect sizes from previous studies. Results As predicted, training abstract words resulted in both direct training and generalization effects, whereas training concrete words resulted in only direct training effects. The reported results are consistent across studies. Furthermore, when the data are compared across studies, there is a distinct pattern of the added benefit of training abstract words using AbSANT. Conclusion Treatment for word retrieval in aphasia is most often aimed at concrete words, despite the usefulness and pervasiveness of abstract words in everyday conversation. We show the utility of AbSANT as a means of improving not only abstract word retrieval but also concrete word retrieval and hope this evidence will help foster its application in clinical practice.

Download Full-text

A Neural Architecture for Visual Search

PsycEXTRA Dataset ◽

10.1037/e665412011-160 ◽

1992 ◽

Author(s):

William Ross ◽

Ennio Mingolla

Keyword(s):

Visual Search ◽

Neural Architecture

Download Full-text

Dynamic learning rate neural network training and composite structural damage detection

AIAA Journal ◽

10.2514/3.13701 ◽

1997 ◽

Vol 35 ◽

pp. 1522-1527

Author(s):

H. Luo ◽

S. Hanagud

Keyword(s):

Neural Network ◽

Damage Detection ◽

Structural Damage ◽

Learning Rate ◽

Neural Network Training ◽

Structural Damage Detection ◽

Dynamic Learning ◽

Network Training

Download Full-text

NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization

HAO: Hardware-aware Neural Architecture Optimization for Efficient Inference

Designing deep neural networks for continual learning in an open world

An adaptive neural architecture optimization model for retinal disorder diagnosis on 3D medical images

An Effective Neural Architecture Optimization Algorithm for CNN based on Search Economics

Design of silicon brains in the nano-CMOS era: Spiking neurons, learning synapses and neural architecture optimization

Network training and architecture optimization by a recursive approach and a modified genetic algorithm

A Compression-Compilation Framework for On-mobile Real-time BERT Applications

Abstract Semantic Associative Network Training: A Replication and Update of an Abstract Word Retrieval Therapy Program

A Neural Architecture for Visual Search

Dynamic learning rate neural network training and composite structural damage detection

Export Citation Format