Accelerating multi-objective neural architecture search by random-weight evaluation

Abstract. Neural architecture search (NAS), which automates the design of high-performance deep convolutional neural networks (CNNs), is becoming increasingly important in both academia and industry. Because evaluating each candidate CNN requires costly stochastic gradient descent training, most existing NAS methods are too computationally expensive for real-world deployment. To address this issue, we first introduce a new performance estimation metric, random-weight evaluation (RWE), which quantifies the quality of a CNN in a cost-efficient manner. Instead of fully training the entire CNN, RWE trains only its last layer and leaves the remaining layers with randomly initialized weights, so a single network evaluation takes only seconds. Second, a complexity metric is adopted for multi-objective NAS to balance model size and performance. Overall, the proposed method obtains a set of efficient models with state-of-the-art performance in two real-world search spaces. The results obtained on the CIFAR-10 dataset are then transferred to the ImageNet dataset to validate the practicality of the proposed algorithm. Moreover, ablation studies on NAS-Bench-301 show that the proposed RWE estimates performance effectively compared with existing methods.
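The idea of training only the last layer on top of frozen random weights can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single random dense ReLU layer in place of a CNN body, a logistic-regression head, and synthetic two-class 2-D data. All names (`make_frozen_body`, `train_last_layer`, `random_weight_score`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_frozen_body(in_dim, hidden_dim, rng):
    # Stand-in for the untrained network body: random weights are drawn
    # once and never updated (assumption: one dense ReLU layer, not a CNN).
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
               for _ in range(hidden_dim)]

    def body(x):
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

    return body


def train_last_layer(features, labels, epochs=30, lr=0.01):
    # Only this logistic-regression head is trained, with plain SGD.
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - y
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b


def make_blobs(n_per_class, rng):
    # Synthetic data: two Gaussian blobs centred at (-2, -2) and (2, 2).
    data = []
    for label, center in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(center, 0.5), rng.gauss(center, 0.5)], label))
    rng.shuffle(data)
    return data


def random_weight_score(hidden_dim, seed=0):
    # Validation accuracy of the trained head serves as the cheap proxy score.
    rng = random.Random(seed)
    body = make_frozen_body(2, hidden_dim, rng)
    train, valid = make_blobs(50, rng), make_blobs(50, rng)
    w, b = train_last_layer([body(x) for x, _ in train], [y for _, y in train])
    hits = 0
    for x, y in valid:
        z = sum(wi * fi for wi, fi in zip(w, body(x))) + b
        hits += int((z > 0) == (y == 1))
    return hits / len(valid)
```

In a search loop, a score like `random_weight_score(16)` would be computed for each candidate in place of full training; here `hidden_dim` merely stands in for an architecture description.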


DEVELOPMENT OF A HIGH PERFORMANCE MULTI-PHYSICS FINITE DIFFERENCE MODEL FOR USE IN A MONTE CARLO SIMULATION WITH REAL WORLD DISTRIBUTIONS

10.1615/tfec2017.cfd.017687 ◽ 2017 ◽ Cited By ~ 1

Author(s): Joseph R. VanderVeer

Keyword(s): Monte Carlo Simulation ◽ Monte Carlo ◽ Finite Difference ◽ Real World ◽ High Performance ◽ Difference Model ◽ Finite Difference Model


Multimodal biometric system using deep learning based on face and finger vein fusion

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-189762 ◽ 2021 ◽ pp. 1-13

Author(s): Shikhar Tyagi ◽ Bhavya Chawla ◽ Rupav Jain ◽ Smriti Srivastava

Keyword(s): High Performance ◽ Recognition Accuracy ◽ Error Rates ◽ Facial Features ◽ Biometric System ◽ Deep Convolutional Neural Networks ◽ Finger Vein ◽ Biometric Systems ◽ Overall Performance ◽ Recognition Systems

Single biometric modalities such as facial features and vein patterns, despite being reliable characteristics, show limitations that prevent them from offering high performance and robustness. Multimodal biometric systems have gained interest because they can overcome the inherent limitations of the underlying single modalities, and they have generally been shown to improve overall performance for identification and recognition. This paper proposes highly accurate and robust multimodal biometric identification and recognition systems based on the fusion of face and finger vein modalities. Feature extraction for both face and finger vein is carried out with deep convolutional neural networks. The fusion process combines the relevant features extracted from the two modalities at score level. The experimental results over all considered public databases show a significant improvement in identification and recognition accuracy as well as in equal error rates.
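Score-level fusion can be sketched as follows. The abstract does not say which normalization or combination rule is used, so this example assumes min-max normalization followed by a weighted sum, one common choice. The function names and the `face_weight` parameter are hypothetical.

```python
def min_max_normalize(scores):
    # Map raw matcher scores onto [0, 1] so the two modalities are comparable.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(face_scores, vein_scores, face_weight=0.5):
    # Assumed rule: weighted sum of the normalized per-identity scores.
    face = min_max_normalize(face_scores)
    vein = min_max_normalize(vein_scores)
    return [face_weight * f + (1.0 - face_weight) * v
            for f, v in zip(face, vein)]


def identify(face_scores, vein_scores, face_weight=0.5):
    # Return the index of the enrolled identity with the highest fused score.
    fused = fuse_scores(face_scores, vein_scores, face_weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

For example, with face scores `[0.2, 0.9, 0.4]` and finger vein scores `[10, 30, 50]` for three enrolled identities, equal weighting picks identity 1, whereas relying on the vein scores alone (`face_weight=0.0`) picks identity 2.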


APENAS: An Asynchronous Parallel Evolution Based Multi-objective Neural Architecture Search

2020 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom) ◽ 10.1109/ispa-bdcloud-socialcom-sustaincom51426.2020.00045 ◽ 2020

Author(s): Mengtao Hu ◽ Li Liu ◽ Wei Wang ◽ Yao Liu

Keyword(s): Parallel Evolution ◽ Multi Objective ◽ Neural Architecture ◽ Asynchronous Parallel


High Performance Parallel Stochastic Gradient Descent in Shared Memory

2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS) ◽ 10.1109/ipdps.2016.107 ◽ 2016 ◽ Cited By ~ 8

Author(s): Scott Sallinen ◽ Nadathur Satish ◽ Mikhail Smelyanskiy ◽ Samantika S. Sury ◽ Christopher Re

Keyword(s): Shared Memory ◽ Gradient Descent ◽ High Performance ◽ Stochastic Gradient ◽ Stochastic Gradient Descent ◽ Parallel Stochastic Gradient Descent


CFD Simulation of a Real World High-Performance Two Stroke Engine with Use of a Multidimensional Coupling Methodology

10.4271/2008-32-0042 ◽ 2008 ◽ Cited By ~ 3

Author(s): Dalibor Jajcevic ◽ Raimund A. Almbauer ◽ Stephan P. Schmidt ◽ Karl Glinsner

Keyword(s): Real World ◽ High Performance ◽ Cfd Simulation ◽ Stroke Engine


Multi-objective optimization of an Interior PM motor for a high-performance drive

2012 XXth International Conference on Electrical Machines ◽ 10.1109/icelmach.2012.6349894 ◽ 2012 ◽ Cited By ~ 11

Author(s): Nicola Bianchi ◽ Dario Durello ◽ Emanuele Fornasiero

Keyword(s): High Performance ◽ Multi Objective Optimization ◽ Multi Objective


Revisiting the CompCars Dataset for Hierarchical Car Classification: New Annotations, Experiments, and Results

Sensors ◽ 10.3390/s21020596 ◽ 2021 ◽ Vol 21 (2) ◽ pp. 596

Author(s): Marco Buzzelli ◽ Luca Segantin

Keyword(s): Real World ◽ High Performance ◽ Ad Hoc ◽ Future Research ◽ Levels Of Detail ◽ Excellent Starting Point ◽ Starting Point ◽ Hierarchical Nature ◽ Multiple Levels ◽ Entire Dataset

We address the task of classifying car images at multiple levels of detail, ranging from the top-level car type down to the specific car make, model, and year. We analyze existing datasets for car classification and identify CompCars as an excellent starting point for our task. We show that convolutional neural networks achieve an accuracy above 90% on the finest-level classification task. This high performance, however, is scarcely representative of real-world situations, as it is evaluated on a biased training/test split. In this work, we revisit the CompCars dataset by first defining a new training/test split, which better represents real-world scenarios and sets a more realistic baseline of 61% accuracy on the new test set. We also propagate the existing (but limited) type-level annotation to the entire dataset, and we finally provide a car-tight bounding box for each image, automatically defined through an ad hoc car detector. To evaluate this revisited dataset, we design and implement three different approaches to car classification, two of which exploit the hierarchical nature of car annotations. Our experiments show that higher-level classification in terms of car type positively impacts classification at a finer grain, which now reaches 70% accuracy. The achieved performance constitutes a baseline benchmark for future research, and our enriched set of annotations is made available for public download.
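One way a type-level prediction could inform fine-grained classification is sketched below. The abstract does not describe how the hierarchical approaches work, so this is purely an illustrative assumption: each fine-grained class probability is weighted by the probability of its parent car type and the result is renormalized. The function name, class labels, and numbers are hypothetical.

```python
def rescore_with_type(fine_probs, type_probs, parent_of):
    # Weight each fine-grained class by the probability of its parent type,
    # then renormalize so the result is again a probability distribution.
    weighted = {c: p * type_probs[parent_of[c]] for c, p in fine_probs.items()}
    total = sum(weighted.values())
    if total == 0.0:
        return dict(fine_probs)  # fall back to the unweighted prediction
    return {c: w / total for c, w in weighted.items()}


# Hypothetical outputs of a fine-grained and a type-level classifier.
fine = {"hatch_a": 0.40, "suv_a": 0.35, "suv_b": 0.25}
types = {"hatchback": 0.2, "suv": 0.8}
parents = {"hatch_a": "hatchback", "suv_a": "suv", "suv_b": "suv"}

rescored = rescore_with_type(fine, types, parents)
best_before = max(fine, key=fine.get)
best_after = max(rescored, key=rescored.get)
```

In this made-up example the fine-grained classifier alone prefers `hatch_a`, but a confident "suv" type prediction shifts the decision to `suv_a`.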


Multi-Task Learning for Multi-Objective Evolutionary Neural Architecture Search

2021 IEEE Congress on Evolutionary Computation (CEC) ◽ 10.1109/cec45853.2021.9504721 ◽ 2021

Author(s): Ronghong Cai ◽ Jianping Luo

Keyword(s): Multi Objective ◽ Neural Architecture ◽ Task Learning


A JavaScript API for the Ice Sheet System Model: towards an online interactive model for the Cryosphere Community

10.5194/gmd-2016-179 ◽ 2016

Author(s): Eric Larour ◽ Daniel Cheng ◽ Gilberto Perez ◽ Justin Quinn ◽ Mathieu Morlighem ◽ ...

Keyword(s): High Performance ◽ Ice Sheet ◽ System Model ◽ Earth System ◽ Efficient Manner ◽ Post Process ◽ Software Model ◽ Science Community ◽ Deep Integration ◽ Client Side

Abstract. Earth System Models (ESMs) are becoming increasingly complex, requiring extensive knowledge and experience to deploy and use in an efficient manner. They run on high-performance architectures that differ significantly from the everyday environments that scientists use to pre- and post-process results (e.g. MATLAB, Python). This results in models that are hard for non-specialists to use and that are increasingly specific in their application. It also makes them relatively inaccessible to the wider science community, not to mention the general public. Here, we present a new software/model paradigm that attempts to bridge the gap between the science community and the complexity of ESMs by developing a new JavaScript Application Program Interface (API) for the Ice Sheet System Model (ISSM). This API allows cryosphere scientists to run ISSM on the client side of a webpage, within the JavaScript environment. When combined with a Web server running ISSM (using a Python API), it enables ISSM computations to be served in an easy and straightforward way. The deep integration of, and similarities between, all the APIs in ISSM (MATLAB, Python, and now JavaScript) significantly shorten and simplify the turnaround of state-of-the-art science runs and their use by the larger community. We demonstrate our approach via a new Virtual Earth System Laboratory (VESL) Web site.


One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Proceedings of the ACM on Measurement and Analysis of Computing Systems ◽ 10.1145/3491046 ◽ 2021 ◽ Vol 5 (3) ◽ pp. 1-34

Author(s): Bingqian Lu ◽ Jianyi Yang ◽ Weiwen Jiang ◽ Yiyu Shi ◽ Shaolei Ren

Keyword(s): State Of The Art ◽ Autonomous Driving ◽ Pareto Optimal ◽ Video Content ◽ Fast Evaluation ◽ Video Content Analysis ◽ Search Spaces ◽ Neural Architecture ◽ Real World Applications ◽ Prohibitive Cost

Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device is common in state-of-the-art approaches, it is a very time-consuming process that scales poorly in the presence of extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity: the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can re-use architectures searched for one proxy device on new target devices without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique to significantly boost the latency monotonicity. Finally, we validate our approach and conduct experiments with devices of different platforms on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as the existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device.
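Agreement between latency rankings on two devices can be quantified with a rank correlation. The sketch below uses Spearman's coefficient as one plausible measure; the abstract does not state which statistic the authors use. The function names and latency values are hypothetical, and the formula assumes there are no tied latencies.

```python
def ranks(values):
    # Rank 0 goes to the smallest value; assumes all values are distinct.
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(a, b):
    # Spearman's rho via the classic no-ties formula:
    # rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical latencies (ms) of four architectures on two devices.
proxy_latency = [10.0, 20.0, 30.0, 40.0]
target_latency = [15.0, 44.0, 31.0, 62.0]
rho = spearman(proxy_latency, target_latency)
```

A value of `rho` close to 1 means the proxy device orders architectures almost exactly as the target device does, which is the situation in which architectures found for the proxy can be re-used. In this made-up example one pair of architectures swaps order, so the coefficient is high but below 1.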
