Applying Deep-Learning-Based Computer Vision to Wireless Communications: Methodologies, Opportunities, and Challenges

2020 ◽
Author(s):  
Yu Tian ◽  
Gaofeng Pan ◽  
Mohamed-Slim Alouini

Deep learning (DL) has seen great success in the computer vision (CV) field, and related techniques have been used in security, healthcare, remote sensing, and many other fields. As a parallel development, visual data has become universal in daily life, easily generated by ubiquitous low-cost cameras. Therefore, exploring DL-based CV may yield useful information about objects, such as their number, locations, distribution, and motion. Intuitively, DL-based CV can also facilitate and improve the designs of wireless communications, especially in dynamic network scenarios. However, so far, such work is rare in the literature. The primary purpose of this article, then, is to introduce ideas about applying DL-based CV in wireless communications to bring some novel degrees of freedom to both theoretical research and engineering applications. To illustrate how DL-based CV can be applied in wireless communications, an example of applying it to a millimeter-wave (mmWave) system is given to realize optimal mmWave multiple-input multiple-output (MIMO) beamforming in mobile scenarios. In this example, we propose a framework to predict future beam indices from previously observed beam indices and images of street views using ResNet, 3-dimensional ResNeXt, and a long short-term memory network. The experimental results show that our framework achieves much higher accuracy than the baseline method, and that visual data can significantly improve the performance of the MIMO beamforming system. Finally, we discuss the opportunities and challenges of applying DL-based CV in wireless communications.
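As a rough illustration of the framework this abstract describes, the sketch below (a minimal, assumption-laden sketch, not the authors' published code) fuses a per-frame ResNet encoder and a 3-D video encoder with an LSTM over past beam indices to produce next-beam logits. Here torchvision's r3d_18 stands in for the 3-dimensional ResNeXt, and the codebook size, feature dimensions, and sequence length are illustrative assumptions:

```python
# Hedged sketch of the vision-aided beam-prediction idea (not the authors' code).
import torch
import torch.nn as nn
from torchvision.models import resnet18
from torchvision.models.video import r3d_18  # stand-in for 3-D ResNeXt (assumption)

class BeamPredictor(nn.Module):
    def __init__(self, num_beams=64, hidden=256):
        super().__init__()
        self.frame_enc = resnet18(weights=None)
        self.frame_enc.fc = nn.Identity()          # 512-d feature per street-view frame
        self.clip_enc = r3d_18(weights=None)
        self.clip_enc.fc = nn.Identity()           # 512-d feature per video clip
        self.beam_emb = nn.Embedding(num_beams, 64)
        self.lstm = nn.LSTM(512 + 64, hidden, batch_first=True)
        self.head = nn.Linear(hidden + 512, num_beams)

    def forward(self, frames, clip, past_beams):
        # frames: (B, T, 3, H, W); clip: (B, 3, T, H, W); past_beams: (B, T)
        B, T = frames.shape[:2]
        f = self.frame_enc(frames.flatten(0, 1)).view(B, T, -1)
        seq = torch.cat([f, self.beam_emb(past_beams)], dim=-1)
        out, _ = self.lstm(seq)                    # temporal fusion of image + beam history
        c = self.clip_enc(clip)                    # global motion context from the clip
        return self.head(torch.cat([out[:, -1], c], dim=-1))  # next-beam logits

model = BeamPredictor()
logits = model(torch.randn(2, 8, 3, 112, 112),
               torch.randn(2, 3, 8, 112, 112),
               torch.randint(0, 64, (2, 8)))
print(logits.shape)  # torch.Size([2, 64])
```

In a setup like this, the head would be trained with cross-entropy over the beam codebook, and the visual encoders would typically be pretrained before fine-tuning.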


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 343
Author(s):  
Kim Bjerge ◽  
Jakob Bonde Nielsen ◽  
Martin Videbæk Sepstrup ◽  
Flemming Helsing-Nielsen ◽  
Toke Thomas Høye

Insect monitoring methods are typically very time-consuming and involve substantial investment in species identification following manual trapping in the field. Insect traps are often only serviced weekly, resulting in low temporal resolution of the monitoring data, which hampers ecological interpretation. This paper presents a portable computer vision system capable of attracting and detecting live insects. More specifically, the paper proposes detection and classification of species by recording images of live individuals attracted to a light trap. An Automated Moth Trap (AMT) with multiple light sources and a camera was designed to attract and monitor live insects during twilight and night hours. A computer vision algorithm referred to as Moth Classification and Counting (MCC), based on deep learning analysis of the captured images, tracked and counted the number of insects and identified moth species. Observations over 48 nights resulted in the capture of more than 250,000 images, with an average of 5675 images per night. A customized convolutional neural network was trained on 2000 labeled images of live moths representing eight different classes, achieving a high validation F1-score of 0.93. The algorithm achieved an average classification and tracking F1-score of 0.71 and a tracking detection rate of 0.79. Overall, the proposed computer vision system and algorithm showed promising results as a low-cost solution for non-destructive and automatic monitoring of moths.
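As a concrete, hedged starting point for the classification stage, a compact CNN over cropped insect images with eight output classes might look like the sketch below; the layer widths and input resolution are illustrative assumptions, not the published MCC architecture:

```python
# Minimal sketch of a small custom CNN for eight moth classes (assumed sizes).
import torch
import torch.nn as nn

class MothCNN(nn.Module):
    def __init__(self, num_classes=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),               # pool to one feature vector
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):                          # x: (B, 3, H, W) insect crops
        return self.classifier(self.features(x).flatten(1))

logits = MothCNN()(torch.randn(4, 3, 128, 128))
print(logits.argmax(1))                            # predicted class per crop
```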


2021 ◽  
Vol 11 (24) ◽  
pp. 12059
Author(s):  
Giulio Siracusano ◽  
Francesca Garescì ◽  
Giovanni Finocchio ◽  
Riccardo Tomasello ◽  
Francesco Lamonaca ◽  
...  

In modern building infrastructures, interest in devising adaptive and unsupervised data-driven structural health monitoring (SHM) systems is growing. This is due to the wide availability of big data from low-cost sensors with communication capabilities and of advanced modeling tools such as deep learning. A promising method suitable for smart SHM is the analysis of acoustic emissions (AEs), i.e., ultrasonic waves generated by internal ruptures of the concrete when it is stressed. The advantage with respect to traditional ultrasonic measurement methods is the absence of an emitter and the suitability for continuous monitoring. The main purpose of this paper is to combine deep neural networks with bidirectional long short-term memory and advanced statistical analysis involving instantaneous frequency and spectral kurtosis to develop an accurate classification tool for tensile, shear, and mixed modes originating from AE events (cracks). We investigated effective event descriptors that capture the unique characteristics of the different types of modes. Tests on experimental data confirm that this method achieves promising classification among the different crack events and can shape the design of future SHM technologies. The approach classifies incipient damage with 92% accuracy, which is advantageous for maintenance planning.
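A hedged sketch of the signal-processing side described above: the two descriptors named in the abstract (instantaneous frequency via the analytic signal, spectral kurtosis via an STFT) can be extracted with SciPy and fed, as a feature sequence, to a bidirectional LSTM with three output modes (tensile, shear, mixed). The window size, feature layout, and assumed 1 MHz sampling rate are illustrative, not the authors' exact pipeline:

```python
# Illustrative AE descriptor extraction + BiLSTM classifier (assumptions noted inline).
import numpy as np
import torch
import torch.nn as nn
from scipy.signal import hilbert, stft
from scipy.stats import kurtosis

def ae_features(x, fs):
    analytic = hilbert(x)                               # analytic signal
    phase = np.unwrap(np.angle(analytic))
    inst_freq = np.diff(phase) * fs / (2 * np.pi)       # instantaneous frequency
    f, t, Z = stft(x, fs=fs, nperseg=256)
    spec_kurt = kurtosis(np.abs(Z), axis=1)             # kurtosis per frequency bin
    return inst_freq, spec_kurt

class CrackClassifier(nn.Module):
    def __init__(self, n_feat=2, hidden=64, n_modes=3):  # tensile, shear, mixed
        super().__init__()
        self.lstm = nn.LSTM(n_feat, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_modes)

    def forward(self, seq):                              # seq: (B, T, n_feat) descriptors
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])

fs = 1_000_000                                           # 1 MHz AE sampling rate (assumption)
wave = np.random.randn(4096)
inst_freq, spec_kurt = ae_features(wave, fs)
print(inst_freq.shape, spec_kurt.shape)
print(CrackClassifier()(torch.randn(1, 120, 2)).shape)   # (1, 3) mode logits
```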


Author(s):  
Frederick Mun ◽  
Ahnryul Choi

Background: Foot pressure distribution can be used as a quantitative parameter for evaluating anatomical deformity of the foot and for diagnosing and treating pathological gait, falls, and pressure sores in diabetes. The objective of this study was to propose a deep learning model that could predict the pressure distribution of the whole foot based on information obtained from a small number of pressure sensors in an insole.

Methods: Twenty young and twenty older adults walked a straight pathway at a preferred speed with a Pedar-X system in anti-skid socks. A long short-term memory (LSTM) model was used to predict foot pressure distribution. Pressure values of nine major sensors and the remaining 90 sensors in the Pedar-X system were used as input and output for the model, respectively. The performance of the proposed LSTM structure was compared with that of a traditionally used adaptive neuro-fuzzy inference system (ANFIS). A low-cost insole system consisting of a small number of pressure sensors was fabricated. A gait experiment was additionally performed with five young and five older adults, excluding the subjects used to construct the models. The Pedar-X system, placed in parallel on top of the insole prototype developed in this study, was worn in anti-skid socks. Sensor values from the low-cost insole prototype were used as input to the LSTM model. The accuracy of the model was evaluated by applying leave-one-out cross-validation.

Results: The correlation coefficient and relative root mean square error (RMSE) of the LSTM model were 0.98 (0.92 ~ 0.99) and 7.9 ± 2.3%, respectively, outperforming the ANFIS model. Additionally, the usefulness of the proposed LSTM model for fabricating a low-cost insole prototype with a small number of sensors was confirmed, with a correlation coefficient of 0.63 to 0.97 and a relative RMSE of 12.7 ± 7.4%.

Conclusions: This model can be used as an algorithm to develop a low-cost portable smart insole system to monitor age-related physiological and anatomical alterations in the foot. It also has the potential to evaluate the clinical rehabilitation status of patients with pathological gait, falls, and various foot pathologies once more data from patients with various diseases are accumulated for training.
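The mapping the abstract describes (nine measured insole pressures in, the remaining 90 Pedar-X sensor values out, per gait sample) is straightforward to express as an LSTM regressor. The sketch below is a minimal illustration with assumed hidden size and depth, not the authors' trained model:

```python
# Hedged sketch: LSTM regression from 9 measured pressures to 90 predicted ones.
import torch
import torch.nn as nn

class PressureLSTM(nn.Module):
    def __init__(self, n_in=9, n_out=90, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(n_in, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_out)

    def forward(self, x):              # x: (B, T, 9) pressures from the 9 sensors
        out, _ = self.lstm(x)
        return self.head(out)          # (B, T, 90) predicted full distribution

pred = PressureLSTM()(torch.randn(1, 200, 9))   # 200 gait samples (assumption)
print(pred.shape)  # torch.Size([1, 200, 90])
```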


2019 ◽  
Vol 133 ◽  
pp. 1158-1166 ◽  
Author(s):  
Jose A. Carballo ◽  
Javier Bonilla ◽  
Manuel Berenguel ◽  
Jesús Fernández-Reche ◽  
Ginés García

2020 ◽  
Author(s):  
Cedar Warman ◽  
Christopher M. Sullivan ◽  
Justin Preece ◽  
Michaela E. Buchanan ◽  
Zuzana Vejlupkova ◽  
...  

High-throughput phenotyping systems are powerful, dramatically changing our ability to document, measure, and detect biological phenomena. Here, we describe a cost-effective combination of a custom-built imaging platform and deep-learning-based computer vision pipeline. A minimal version of the maize ear scanner was built with low-cost and readily available parts. The scanner rotates a maize ear while a cellphone or digital camera captures a video of the surface of the ear. Videos are then digitally flattened into two-dimensional ear projections. Segregating GFP and anthocyanin kernel phenotypes are clearly distinguishable in ear projections, and can be manually annotated using image analysis software. Increased throughput was attained by designing and implementing an automated kernel counting system using transfer learning and a deep learning object detection model. The computer vision model was able to rapidly assess over 390,000 kernels, identifying male-specific transmission defects across a wide range of GFP-marked mutant alleles. This includes a previously undescribed defect putatively associated with mutation of Zm00001d002824, a gene predicted to encode a vacuolar processing enzyme (VPE). We show that by using this system, the quantification of transmission data and other ear phenotypes can be accelerated and scaled to generate large datasets for robust analyses.

One-sentence summary: A maize ear phenotyping system built from commonly available parts creates images of the surface of ears and identifies kernel phenotypes with a deep-learning-based computer vision pipeline.
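The video-flattening step lends itself to a short illustration: as the ear rotates, the central strip of each frame is concatenated into a two-dimensional projection of the ear surface. The sketch below assumes OpenCV, a hypothetical input file name, and an arbitrary strip width; it is not the published pipeline:

```python
# Hedged sketch of flattening a rotating-ear video into a 2-D surface projection.
import cv2
import numpy as np

def flatten_ear_video(path, strip_px=2):
    cap = cv2.VideoCapture(path)
    strips = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mid = frame.shape[1] // 2                 # horizontal center of the frame
        strips.append(frame[:, mid - strip_px: mid + strip_px])
    cap.release()
    if not strips:
        raise IOError(f"could not read frames from {path}")
    return np.concatenate(strips, axis=1)         # 2-D ear surface projection

projection = flatten_ear_video("ear_scan.mp4")    # hypothetical file name
cv2.imwrite("ear_projection.png", projection)
```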


2020 ◽  
Vol 7 (1) ◽  
pp. 2-3
Author(s):  
Shadi Saleh

Deep learning and machine learning innovations are at the core of the ongoing revolution in Artificial Intelligence for the interpretation and analysis of multimedia data. The convergence of large-scale datasets and more affordable Graphics Processing Unit (GPU) hardware has enabled the development of neural networks for data analysis problems that were previously handled by traditional handcrafted features. Several deep learning architectures such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM)/Gated Recurrent Unit (GRU) networks, Deep Belief Networks (DBNs), and Deep Stacking Networks (DSNs) have been used with new open-source software and library options to shape an entirely new scenario in computer vision processing.


2020 ◽  
Author(s):  
Corneliu Arsene

Effective and powerful methods for denoising real electrocardiogram (ECG) signals are important for wearable sensors and devices. Deep Learning (DL) models have been used extensively in image processing and other domains with great success, but only very recently have they been used for processing ECG signals. This paper presents several DL models, namely Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Restricted Boltzmann Machines (RBMs), together with more conventional filtering methods (low-pass filtering, high-pass filtering, notch filtering) and the standard wavelet-based technique, for denoising ECG signals. These methods are trained, tested, and evaluated on different synthetic and real ECG datasets taken from the MIT PhysioNet database and under different simulation conditions (i.e., various lengths of the ECG signals, single or multiple records). The results show that the CNN model performs well and can be used for off-line ECG denoising applications, where it is satisfactory to train on a clean part of an ECG signal from an ECG record and then to test on the same ECG signal with a high level of noise added to it. However, for real-time or near-real-time applications, this task becomes more cumbersome, as the clean part of an ECG signal is likely to be very limited in size. Therefore, the solution put forth in this work is to train a CNN model on 1-second noisy artificial multiple-heartbeat ECG data (i.e., ECG at effort), generated in the first instance from a few sequences of real heartbeat ECG data (i.e., ECG at rest). Afterwards, the trained CNN model can be used in real-life situations to denoise the ECG signal. This also corresponds to clinical practice, where the ECG is usually first recorded with the subject at rest, and then recorded again at effort while the same subject performs physical exercises. The quality of the results is assessed visually, but also by using the Root Mean Square (RMS) and Signal-to-Noise Ratio (SNR) measures. All CNN models were run on an NVIDIA TITAN V Graphics Processing Unit (GPU) with 12 GB RAM, which drastically reduces computation time. Finally, as an element of novelty, the paper also presents a Design of Experiments (DoE) study that aims to determine the optimal structure of a CNN model, a type of study not previously seen in this literature.
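As an illustration of the off-line denoising setup described here, a 1-D convolutional denoiser can be trained on 1-second noisy segments with the clean segment as the regression target. The sketch below uses assumed layer widths and a 360-sample segment length (e.g., 1 second at 360 Hz, a common PhysioNet sampling rate), not the paper's exact CNN:

```python
# Hedged sketch: 1-D convolutional ECG denoiser trained on noisy/clean segment pairs.
import torch
import torch.nn as nn

class ECGDenoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, 9, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, 9, padding=4), nn.ReLU(),
            nn.Conv1d(32, 16, 9, padding=4), nn.ReLU(),
            nn.Conv1d(16, 1, 9, padding=4),     # back to a single ECG channel
        )

    def forward(self, x):                       # x: (B, 1, T) noisy 1-second segment
        return self.net(x)

model = ECGDenoiser()
noisy = torch.randn(8, 1, 360)                  # placeholder noisy segments
clean = torch.randn(8, 1, 360)                  # placeholder clean targets
loss = nn.functional.mse_loss(model(noisy), clean)   # RMS-style training objective
loss.backward()
```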

