Applying Deep-Learning-Based Computer Vision to Wireless Communications: Methodologies, Opportunities, and Challenges

2020 ◽  
Author(s):  
Yu Tian ◽  
Gaofeng Pan ◽  
Mohamed-Slim Alouini

<div>Deep learning (DL) has seen great success in the computer vision (CV) field, and related techniques have been used in security, healthcare, remote sensing, and many other fields. As a parallel development, visual data has become universal in daily life, easily generated by ubiquitous low-cost cameras. Therefore, exploring DL-based CV may yield useful information about objects, such as their number, locations, distribution, motion, etc. Intuitively, DL-based CV can also facilitate and improve the designs of wireless communications, especially in dynamic network scenarios. However, so far, such work is rare in the literature. The primary purpose of this article, then, is to introduce ideas about applying DL-based CV in wireless communications to bring some novel degrees of freedom to both theoretical research and engineering applications. To illustrate how DL-based CV can be applied in wireless communications, an example of using a DL-based CV with a millimeter-wave (mmWave) system is given to realize optimal mmWave multiple-input and multiple-output (MIMO) beamforming in mobile scenarios. In this example, we propose a framework to predict future beam indices from previously observed beam indices and images of street views using ResNet, 3-dimensional ResNext, and a long short-term memory network. The experimental results show that our frameworks achieve much higher accuracy than the baseline method, and that visual data can significantly improve the performance of the MIMO beamforming system. Finally, we discuss the opportunities and challenges of applying DL-based CV in wireless communications.</div>
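The quantity this framework learns to predict is the index of the best beamforming vector in a fixed codebook. The underlying selection task can be sketched as follows; this is a minimal illustration with a hypothetical DFT codebook and exhaustive search, not the authors' code.

```python
import numpy as np

def dft_codebook(n_antennas: int, n_beams: int) -> np.ndarray:
    """Columns are candidate beamforming vectors (a hypothetical DFT codebook)."""
    angles = np.arange(n_beams) / n_beams          # normalized steering directions
    k = np.arange(n_antennas)[:, None]             # antenna index
    return np.exp(2j * np.pi * k * angles[None, :]) / np.sqrt(n_antennas)

def best_beam_index(h: np.ndarray, codebook: np.ndarray) -> int:
    """Exhaustive search over the codebook: this index is what the
    DL model is trained to predict from past indices and images."""
    gains = np.abs(codebook.conj().T @ h) ** 2     # received power per beam
    return int(np.argmax(gains))
```

A prediction framework replaces the exhaustive sweep with a forecast of this index, which is what makes it attractive in mobile scenarios where sweeping is costly.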




Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 343
Author(s):  
Kim Bjerge ◽  
Jakob Bonde Nielsen ◽  
Martin Videbæk Sepstrup ◽  
Flemming Helsing-Nielsen ◽  
Toke Thomas Høye

Insect monitoring methods are typically very time-consuming and involve substantial investment in species identification following manual trapping in the field. Insect traps are often only serviced weekly, resulting in low temporal resolution of the monitoring data, which hampers the ecological interpretation. This paper presents a portable computer vision system capable of attracting and detecting live insects. More specifically, the paper proposes detection and classification of species by recording images of live individuals attracted to a light trap. An Automated Moth Trap (AMT) with multiple light sources and a camera was designed to attract and monitor live insects during twilight and night hours. A computer vision algorithm referred to as Moth Classification and Counting (MCC), based on deep learning analysis of the captured images, tracked and counted the number of insects and identified moth species. Observations over 48 nights resulted in the capture of more than 250,000 images with an average of 5675 images per night. A customized convolutional neural network was trained on 2000 labeled images of live moths represented by eight different classes, achieving a high validation F1-score of 0.93. The algorithm measured an average classification and tracking F1-score of 0.71 and a tracking detection rate of 0.79. Overall, the proposed computer vision system and algorithm showed promising results as a low-cost solution for non-destructive and automatic monitoring of moths.
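The reported classification and tracking F1-scores combine precision and recall from raw detection counts. A minimal sketch of the standard metric, assuming the conventional definition rather than the authors' implementation:

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from true-positive,
    false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```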


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2475
Author(s):  
Liang Wang ◽  
Jianliang Ai ◽  
Li Zhang ◽  
Zhenlin Xing

In recent years, a rising number of incidents between Unmanned Aerial Vehicles (UAVs) and manned aircraft have been reported at airports and airfields. This paper proposes a design scheme for an airport obstacle-free-zone monitoring UAV system based on computer vision. The system integrates identification, tracking, and expelling functions and is mainly intended for low-cost control of balloon-borne objects and small aircraft. First, a quadcopter dynamic model and a 2-Degrees-of-Freedom (2-DOF) Pan/Tilt/Zoom (PTZ) model are analyzed, and an attitude back-stepping controller based on disturbance compensation is designed. Second, a self-identification and tracking technology for low, slow, and small targets in complex environments is constructed. Based on the You Only Look Once (YOLO) and Kernel Correlation Filter (KCF) algorithms, an autonomous target-recognition and high-speed tracking scheme with strong robustness and high reliability is designed. Third, a PTZ controller and an automatic aiming strategy based on an Anti-Windup Proportional-Integral-Derivative (PID) algorithm are designed, and a simplified automatic-aiming expelling device, an environmentally friendly gel-ball blaster featuring high speed and high accuracy, is built. The feasibility and stability of the design were verified through prototype experiments.
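The anti-windup PID idea used for the aiming strategy can be sketched in a few lines. This is a generic controller using integral clamping (the integrator is frozen while the output saturates), one common anti-windup variant; it is not the authors' exact controller.

```python
class AntiWindupPID:
    """PID controller with conditional integration: the integral term
    only accumulates while the output is inside its saturation limits."""

    def __init__(self, kp, ki, kd, out_min, out_max):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_err = None

    def update(self, err: float, dt: float) -> float:
        # Derivative on error; zero on the first call.
        d = 0.0 if self.prev_err is None else (err - self.prev_err) / dt
        self.prev_err = err
        u = self.kp * err + self.ki * (self.integral + err * dt) + self.kd * d
        if self.out_min < u < self.out_max:
            self.integral += err * dt    # integrate only when unsaturated
            return u
        return max(self.out_min, min(self.out_max, u))
```

Without the clamp, a large aiming error would keep charging the integrator during saturation and cause overshoot once the target is nearly centered.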


2021 ◽  
Vol 11 (24) ◽  
pp. 12059
Author(s):  
Giulio Siracusano ◽  
Francesca Garescì ◽  
Giovanni Finocchio ◽  
Riccardo Tomasello ◽  
Francesco Lamonaca ◽  
...  

In modern building infrastructure, adaptive and unsupervised data-driven structural health monitoring (SHM) systems are gaining popularity. This is due to the wide availability of big data from low-cost sensors with communication capabilities and of advanced modeling tools such as deep learning. A promising method suitable for smart SHM is the analysis of acoustic emissions (AEs), i.e., ultrasonic waves generated by internal ruptures of the concrete when it is stressed. The advantages with respect to traditional ultrasonic measurement methods are the absence of an emitter and the suitability for continuous monitoring. The main purpose of this paper is to combine deep neural networks with bidirectional long short-term memory and advanced statistical analysis involving instantaneous frequency and spectral kurtosis to develop an accurate classification tool for tensile, shear, and mixed modes originating from AE events (cracks). We investigated effective event descriptors to capture the unique characteristics of the different types of modes. Tests on experimental results confirm that this method achieves promising classification among different crack events and can influence the design of future SHM technologies. The approach classifies incipient damage with 92% accuracy, which is advantageous for maintenance planning.
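Spectral kurtosis, one of the descriptors named above, measures how impulsive a signal is in each frequency band. A sketch of one common estimator based on a non-overlapping STFT follows; the paper does not publish its implementation, so window choice and normalization here are assumptions.

```python
import numpy as np

def spectral_kurtosis(x: np.ndarray, win: int = 64) -> np.ndarray:
    """Spectral kurtosis per frequency bin: SK(f) = E|X|^4 / (E|X|^2)^2 - 2.
    A stationary sinusoid gives -1, stationary Gaussian noise gives ~0,
    and impulsive transients (such as AE bursts) give large positive values."""
    n_frames = len(x) // win
    frames = x[: n_frames * win].reshape(n_frames, win)
    X = np.fft.rfft(frames * np.hanning(win), axis=1)   # windowed STFT
    p2 = np.mean(np.abs(X) ** 2, axis=0)                # 2nd spectral moment
    p4 = np.mean(np.abs(X) ** 4, axis=0)                # 4th spectral moment
    return p4 / p2 ** 2 - 2.0
```

This is why the descriptor is useful for AE events: a short crack burst concentrates its energy in a few frames, driving SK well above the near-zero value of background noise.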


Author(s):  
Frederick Mun ◽  
Ahnryul Choi

Abstract

Background: Foot pressure distribution can be used as a quantitative parameter for evaluating anatomical deformity of the foot and for diagnosing and treating pathological gait, falling, and pressure sores in diabetes. The objective of this study was to propose a deep learning model that could predict the pressure distribution of the whole foot based on information obtained from a small number of pressure sensors in an insole.

Methods: Twenty young and twenty older adults walked a straight pathway at a preferred speed with a Pedar-X system in anti-skid socks. A long short-term memory (LSTM) model was used to predict foot pressure distribution. Pressure values of nine major sensors and the remaining 90 sensors in the Pedar-X system were used as the input and output of the model, respectively. The performance of the proposed LSTM structure was compared with that of a traditionally used adaptive neuro-fuzzy inference system (ANFIS). A low-cost insole system consisting of a small number of pressure sensors was fabricated. A gait experiment was additionally performed with five young and five older adults, excluding the subjects used to construct the models. The Pedar-X system, placed in parallel on top of the insole prototype developed in this study, was worn inside anti-skid socks. Sensor values from the low-cost insole prototype were used as the input of the LSTM model. The accuracy of the model was evaluated by applying leave-one-out cross-validation.

Results: The correlation coefficient and relative root mean square error (RMSE) of the LSTM model were 0.98 (0.92 ~ 0.99) and 7.9 ± 2.3%, respectively, outperforming the ANFIS model. Additionally, the usefulness of the proposed LSTM model for a low-cost insole prototype with a small number of sensors was confirmed, with a correlation coefficient of 0.63 to 0.97 and a relative RMSE of 12.7 ± 7.4%.

Conclusions: This model can be used as an algorithm to develop a low-cost portable smart insole system to monitor age-related physiological and anatomical alterations in the foot. It also has the potential to evaluate the clinical rehabilitation status of patients with pathological gait, falling, and various foot pathologies once more data from patients with various diseases are accumulated for training.
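The evaluation protocol above (leave-one-subject-out cross-validation scored by relative RMSE) can be sketched generically. The range-normalized RMSE definition and the linear least-squares model standing in for the LSTM are both illustrative assumptions, not the authors' choices.

```python
import numpy as np

def relative_rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """RMSE normalized by the range of the reference signal, in percent
    (one common definition; the paper does not spell out its formula)."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmse / (y_true.max() - y_true.min())

def fit_linear(A, B):
    """Least-squares mapping from few-sensor inputs to full distribution;
    a stand-in for the paper's LSTM, for illustration only."""
    return np.linalg.lstsq(A, B, rcond=None)[0]

def predict_linear(W, A):
    return A @ W

def loocv_scores(X, Y, fit, predict):
    """Leave-one-subject-out: train on all subjects but one, test on that one.
    X[i], Y[i] hold the i-th subject's input and target samples."""
    scores = []
    for i in range(len(X)):
        train = [j for j in range(len(X)) if j != i]
        model = fit(np.concatenate([X[j] for j in train]),
                    np.concatenate([Y[j] for j in train]))
        scores.append(relative_rmse(Y[i], predict(model, X[i])))
    return scores
```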


2019 ◽  
Vol 133 ◽  
pp. 1158-1166 ◽  
Author(s):  
Jose A. Carballo ◽  
Javier Bonilla ◽  
Manuel Berenguel ◽  
Jesús Fernández-Reche ◽  
Ginés García

2020 ◽  
Author(s):  
Cedar Warman ◽  
Christopher M. Sullivan ◽  
Justin Preece ◽  
Michaela E. Buchanan ◽  
Zuzana Vejlupkova ◽  
...  

Abstract: High-throughput phenotyping systems are powerful, dramatically changing our ability to document, measure, and detect biological phenomena. Here, we describe a cost-effective combination of a custom-built imaging platform and a deep-learning-based computer vision pipeline. A minimal version of the maize ear scanner was built with low-cost and readily available parts. The scanner rotates a maize ear while a cellphone or digital camera captures a video of the surface of the ear. Videos are then digitally flattened into two-dimensional ear projections. Segregating GFP and anthocyanin kernel phenotypes are clearly distinguishable in ear projections and can be manually annotated using image analysis software. Increased throughput was attained by designing and implementing an automated kernel counting system using transfer learning and a deep learning object detection model. The computer vision model was able to rapidly assess over 390,000 kernels, identifying male-specific transmission defects across a wide range of GFP-marked mutant alleles. This includes a previously undescribed defect putatively associated with mutation of Zm00001d002824, a gene predicted to encode a vacuolar processing enzyme (VPE). We show that by using this system, the quantification of transmission data and other ear phenotypes can be accelerated and scaled to generate large datasets for robust analyses.

One-sentence summary: A maize ear phenotyping system built from commonly available parts creates images of the surface of ears and identifies kernel phenotypes with a deep-learning-based computer vision pipeline.
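A transmission defect shows up as a marker ratio deviating from the expected 1:1 Mendelian segregation among kernels. A generic sketch of how such a deviation can be scored, using the normal approximation to the binomial (an illustration, not the statistical test used in the paper):

```python
import math

def transmission_z(n_marked: int, n_total: int, p0: float = 0.5) -> float:
    """z-statistic for deviation of the observed marker transmission
    rate from the expected Mendelian proportion p0; strongly negative
    values indicate under-transmission of the marked allele."""
    p_hat = n_marked / n_total
    se = math.sqrt(p0 * (1 - p0) / n_total)   # standard error under H0
    return (p_hat - p0) / se
```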


2020 ◽  
Vol 7 (1) ◽  
pp. 2-3
Author(s):  
Shadi Saleh

Deep learning and machine learning innovations are at the core of the ongoing revolution in Artificial Intelligence for the interpretation and analysis of multimedia data. The convergence of large-scale datasets and more affordable Graphics Processing Unit (GPU) hardware has enabled the development of neural networks for data analysis problems that were previously handled with traditional handcrafted features. Several deep learning architectures, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM)/Gated Recurrent Unit (GRU) networks, Deep Belief Networks (DBNs), and Deep Stacking Networks (DSNs), have been used together with new open-source software and library options to shape an entirely new scenario in computer vision processing.

