Deep Learning and Modelling of Audio-, Visual-, and Multimodal Audio-Visual Data in Brain-Inspired SNN

Author(s):  
Nikola K. Kasabov
Keyword(s):  
Author(s):  
Asma Zahra ◽  
Mubeen Ghafoor ◽  
Kamran Munir ◽  
Ata Ullah ◽  
Zain Ul Abideen

AbstractSmart video surveillance helps to build more robust smart city environment. The varied angle cameras act as smart sensors and collect visual data from smart city environment and transmit it for further visual analysis. The transmitted visual data is required to be in high quality for efficient analysis which is a challenging task while transmitting videos on low capacity bandwidth communication channels. In latest smart surveillance cameras, high quality of video transmission is maintained through various video encoding techniques such as high efficiency video coding. However, these video coding techniques still provide limited capabilities and the demand of high-quality based encoding for salient regions such as pedestrians, vehicles, cyclist/motorcyclist and road in video surveillance systems is still not met. This work is a contribution towards building an efficient salient region-based surveillance framework for smart cities. The proposed framework integrates a deep learning-based video surveillance technique that extracts salient regions from a video frame without information loss, and then encodes it in reduced size. We have applied this approach in diverse case studies environments of smart city to test the applicability of the framework. The successful result in terms of bitrate 56.92%, peak signal to noise ratio 5.35 bd and SR based segmentation accuracy of 92% and 96% for two different benchmark datasets is the outcome of proposed work. Consequently, the generation of less computational region-based video data makes it adaptable to improve surveillance solution in Smart Cities.


2020 ◽  
Author(s):  
Yu Tian ◽  
Gaofeng pan ◽  
Mohamed-Slim Alouini

<div>Deep learning (DL) has seen great success in the computer vision (CV) field, and related techniques have been used in security, healthcare, remote sensing, and many other fields. As a parallel development, visual data has become universal in daily life, easily generated by ubiquitous low-cost cameras. Therefore, exploring DL-based CV may yield useful information about objects, such as their number, locations, distribution, motion, etc. Intuitively, DL-based CV can also facilitate and improve the designs of wireless communications, especially in dynamic network scenarios. However, so far, such work is rare in the literature. The primary purpose of this article, then, is to introduce ideas about applying DL-based CV in wireless communications to bring some novel degrees of freedom to both theoretical research and engineering applications. To illustrate how DL-based CV can be applied in wireless communications, an example of using a DL-based CV with a millimeter-wave (mmWave) system is given to realize optimal mmWave multiple-input and multiple-output (MIMO) beamforming in mobile scenarios. In this example, we propose a framework to predict future beam indices from previously observed beam indices and images of street views using ResNet, 3-dimensional ResNext, and a long short-term memory network. The experimental results show that our frameworks achieve much higher accuracy than the baseline method, and that visual data can significantly improve the performance of the MIMO beamforming system. Finally, we discuss the opportunities and challenges of applying DL-based CV in wireless communications.</div>


2020 ◽  
Author(s):  
Yu Tian ◽  
Gaofeng pan ◽  
Mohamed-Slim Alouini

<div>Deep learning (DL) has seen great success in the computer vision (CV) field, and related techniques have been used in security, healthcare, remote sensing, and many other fields. As a parallel development, visual data has become universal in daily life, easily generated by ubiquitous low-cost cameras. Therefore, exploring DL-based CV may yield useful information about objects, such as their number, locations, distribution, motion, etc. Intuitively, DL-based CV can also facilitate and improve the designs of wireless communications, especially in dynamic network scenarios. However, so far, such work is rare in the literature. The primary purpose of this article, then, is to introduce ideas about applying DL-based CV in wireless communications to bring some novel degrees of freedom to both theoretical research and engineering applications. To illustrate how DL-based CV can be applied in wireless communications, an example of using a DL-based CV with a millimeter-wave (mmWave) system is given to realize optimal mmWave multiple-input and multiple-output (MIMO) beamforming in mobile scenarios. In this example, we propose a framework to predict future beam indices from previously observed beam indices and images of street views using ResNet, 3-dimensional ResNext, and a long short-term memory network. The experimental results show that our frameworks achieve much higher accuracy than the baseline method, and that visual data can significantly improve the performance of the MIMO beamforming system. Finally, we discuss the opportunities and challenges of applying DL-based CV in wireless communications.</div>


Sign in / Sign up

Export Citation Format

Share Document