scholarly journals One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Author(s):  
Bingqian Lu ◽  
Jianyi Yang ◽  
Weiwen Jiang ◽  
Yiyu Shi ◽  
Shaolei Ren

Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device has been commonly used in state of the art, this is a very time-consuming process, lacking scalability in the presence of extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity --- the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can re-use architectures searched for one proxy device on new target devices, without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique to significantly boost the latency monotonicity. Finally, we validate our approach and conduct experiments with devices of different platforms on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as the existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device.

Author(s):  
Zhenhua Zhang ◽  
Leon Stenneth ◽  
Xiyuan Liu

The state-of-the-art traffic sign recognition (TSR) algorithms are designed to recognize the textual information of a traffic sign at over 95% accuracy. Even though, they are still not ready for complex roadworks near ramps. In real-world applications, when the vehicles are running on the freeway, they may misdetect the traffic signs for the ramp, which will become inaccurate feedback to the autonomous driving applications and result in unexpected speed reduction. The misdetection problems have drawn minimal attention in recent TSR studies. In this paper, it is proposed that the existing TSR studies should transform from the point-based sign recognition to path-based sign learning. In the proposed pipeline, the confidence of the TSR observations from normal vehicles can be increased by clustering and location adjustment. A supervised learning model is employed to classify the clustered learned signs and complement their path information. Test drives are conducted in 12 European countries to calibrate the models and validate the path information of the learned sign. After model implementation, the path accuracy over 1,000 learned signs can be increased from 75.04% to 89.80%. This study proves the necessity of the path-based TSR studies near freeway ramps and the proposed pipeline demonstrates a good utility and broad applicability for sensor-based autonomous vehicle applications.


Author(s):  
Yuqi Huo ◽  
Mingyu Ding ◽  
Haoyu Lu ◽  
Ziyuan Huang ◽  
Mingqian Tang ◽  
...  

This paper proposes a novel pretext task for self-supervised video representation learning by exploiting spatiotemporal continuity in videos. It is motivated by the fact that videos are spatiotemporal by nature and a representation learned by detecting spatiotemporal continuity/discontinuity is thus beneficial for downstream video content analysis tasks. A natural choice of such a pretext task is to construct spatiotemporal (3D) jigsaw puzzles and learn to solve them. However, as we demonstrate in the experiments, this task turns out to be intractable. We thus propose Constrained Spatiotemporal Jigsaw (CSJ) whereby the 3D jigsaws are formed in a constrained manner to ensure that large continuous spatiotemporal cuboids exist. This provides sufficient cues for the model to reason about the continuity. Instead of solving them directly, which could still be extremely hard, we carefully design four surrogate tasks that are more solvable. The four tasks aim to learn representations sensitive to spatiotemporal continuity at both the local and global levels. Extensive experiments show that our CSJ achieves state-of-the-art on various benchmarks.


2021 ◽  
Vol 54 (6) ◽  
pp. 1-35
Author(s):  
Ninareh Mehrabi ◽  
Fred Morstatter ◽  
Nripsuta Saxena ◽  
Kristina Lerman ◽  
Aram Galstyan

With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in designing and engineering of such systems. AI systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that these decisions do not reflect discriminatory behavior toward certain groups or populations. More recently some work has been developed in traditional machine learning and deep learning that address such challenges in different subdomains. With the commercialization of these systems, researchers are becoming more aware of the biases that these applications can contain and are attempting to address them. In this survey, we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems. In addition to that, we examined different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and ways they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We are hoping that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.


2006 ◽  
Vol 52 (3) ◽  
pp. 870-878 ◽  
Author(s):  
Jungong Han ◽  
D. Farin ◽  
P.H.N. de With ◽  
Weilun Lao

2020 ◽  
Vol 34 (05) ◽  
pp. 7700-7707
Author(s):  
G P Shrivatsa Bhargav ◽  
Michael Glass ◽  
Dinesh Garg ◽  
Shirish Shevade ◽  
Saswati Dana ◽  
...  

Research on the task of Reading Comprehension style Question Answering (RCQA) has gained momentum in recent years due to the emergence of human annotated datasets and associated leaderboards, for example CoQA, HotpotQA, SQuAD, TriviaQA, etc. While state-of-the-art has advanced considerably, there is still ample opportunity to advance it further on some important variants of the RCQA task. In this paper, we propose a novel deep neural architecture, called TAP (Translucent Answer Prediction), to identify answers and evidence (in the form of supporting facts) in an RCQA task requiring multi-hop reasoning. TAP comprises two loosely coupled networks – Local and Global Interaction eXtractor (LoGIX) and Answer Predictor (AP). LoGIX predicts supporting facts, whereas AP consumes these predicted supporting facts to predict the answer span. The novel design of LoGIX is inspired by two key design desiderata – local context and global interaction– that we identified by analyzing examples of multi-hop RCQA task. The loose coupling between LoGIX and the AP reveals the set of sentences used by the AP in predicting an answer. Therefore, answer predictions of TAP can be interpreted in a translucent manner. TAP offers state-of-the-art performance on the HotpotQA (Yang et al. 2018) dataset – an apt dataset for multi-hop RCQA task – as it occupies Rank-1 on its leaderboard (https://hotpotqa.github.io/) at the time of submission.


2019 ◽  
Author(s):  
Mehrdad Shoeiby ◽  
Mohammad Ali Armin ◽  
Sadegh Aliakbarian ◽  
Saeed Anwar ◽  
Lars petersson

<div>Advances in the design of multi-spectral cameras have</div><div>led to great interests in a wide range of applications, from</div><div>astronomy to autonomous driving. However, such cameras</div><div>inherently suffer from a trade-off between the spatial and</div><div>spectral resolution. In this paper, we propose to address</div><div>this limitation by introducing a novel method to carry out</div><div>super-resolution on raw mosaic images, multi-spectral or</div><div>RGB Bayer, captured by modern real-time single-shot mo-</div><div>saic sensors. To this end, we design a deep super-resolution</div><div>architecture that benefits from a sequential feature pyramid</div><div>along the depth of the network. This, in fact, is achieved</div><div>by utilizing a convolutional LSTM (ConvLSTM) to learn the</div><div>inter-dependencies between features at different receptive</div><div>fields. Additionally, by investigating the effect of different</div><div>attention mechanisms in our framework, we show that a</div><div>ConvLSTM inspired module is able to provide superior at-</div><div>tention in our context. Our extensive experiments and anal-</div><div>yses evidence that our approach yields significant super-</div><div>resolution quality, outperforming current state-of-the-art</div><div>mosaic super-resolution methods on both Bayer and multi-</div><div>spectral images. Additionally, to the best of our knowledge,</div><div>our method is the first specialized method to super-resolve</div><div>mosaic images, whether it be multi-spectral or Bayer.</div><div><br></div>


Sign in / Sign up

Export Citation Format

Share Document