Drones are becoming increasingly popular not only for recreational purposes but also in a variety of applications in engineering, disaster management, logistics, airport security, and other fields. Alongside these useful applications, an alarming concern regarding physical infrastructure security, safety, and surveillance at airports has arisen due to the potential use of drones in malicious activities. In recent years, there have been many reports of the unauthorized use of various types of drones at airports and of the disruption of airline operations. To address this problem, this study proposes a novel deep learning-based method for the efficient detection and recognition of two types of drones as well as birds. Evaluation of the proposed approach on the prepared image dataset demonstrates better efficiency compared to existing detection systems in the literature. Furthermore, drones are often confused with birds because of their physical and behavioral similarity. The proposed method is not only able to detect the presence or absence of drones in an area but also to recognize and distinguish between two types of drones, as well as to distinguish them from birds. The dataset used in this work to train the network consists of 10,000 visible images containing two types of drones, namely multirotors and helicopters, as well as birds. The proposed deep learning method can directly detect and recognize the two types of drones and distinguish them from birds with an accuracy of 83%, a mAP of 84%, and an IoU of 81%. The average recall, average accuracy, and average F1-score were also reported as 84%, 83%, and 83%, respectively, across the three classes.
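The reported IoU of 81% measures the overlap between predicted and ground-truth bounding boxes. A minimal sketch of the metric, assuming the common corner-coordinate box format `(x1, y1, x2, y2)` (the paper's exact representation is not stated):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Corners of the intersection rectangle
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a fixed threshold (often 0.5), which is how per-class precision feeds into the mAP figure.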
This article shows how to create a robust thermal face recognition system based on the FaceNet architecture. We propose a method for generating thermal images to create a thermal face database with six different attributes (frown, glasses, rotation, normal, vocal, and smile) based on various deep learning models. First, we use StyleCLIP, which manipulates the latent space of the input visible image to add the desired attributes to the visible face. Second, we use the GANs N’ Roses (GNR) model, a multimodal image-to-image framework that uses style and content maps to generate thermal images from visible images using a generative adversarial approach. Using the proposed generator system, we create a database of synthetic thermal faces composed of more than 100k images corresponding to 3227 individuals. When trained and tested on the synthetic database, the Thermal-FaceNet model obtained 99.98% accuracy. Furthermore, when tested on a real database, the accuracy was more than 98%, validating the proposed thermal image generation system.
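FaceNet-style recognition decides whether two faces match by comparing the Euclidean distance between their learned embeddings against a threshold. A minimal sketch of that verification step, assuming precomputed embedding vectors; the threshold value here is illustrative, not the paper's:

```python
import numpy as np

def verify(emb_a, emb_b, threshold=1.1):
    """Decide whether two face embeddings belong to the same identity.

    Embeddings are L2-normalized (as in FaceNet) and compared by
    Euclidean distance; `threshold` is an illustrative value that
    would be tuned on a validation set.
    """
    emb_a = emb_a / np.linalg.norm(emb_a)
    emb_b = emb_b / np.linalg.norm(emb_b)
    dist = float(np.linalg.norm(emb_a - emb_b))
    return dist < threshold, dist
```

In the proposed system, the embedding network itself is the Thermal-FaceNet model trained on the synthetic thermal database; only the distance comparison is shown here.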
An efficient method for infrared and visible image fusion is presented using truncated Huber penalty function smoothing and visual-saliency-based threshold optimization. The method merges complementary information from multimodal source images into a more informative composite image in a two-scale domain, in which significant objects and regions are highlighted and rich feature information is preserved. First, the source images are decomposed into two-scale image representations, namely the approximate and residual layers, using truncated Huber penalty function smoothing. Benefiting from its edge- and structure-preserving characteristics, the significant objects and regions in the source images are effectively extracted without halo artifacts around the edges. Second, a visual-saliency-based threshold optimization fusion rule is designed to fuse the approximate layers, aiming to highlight the salient targets in infrared images and retain the high-intensity regions in visible images. A sparse-representation-based fusion rule is adopted to fuse the residual layers with the goal of acquiring rich detail and texture information. Finally, combining the fused approximate and residual layers reconstructs the fused image with a more natural visual effect. Extensive experimental results demonstrate that the proposed method achieves comparable or superior performance compared with several state-of-the-art fusion methods in both visual results and objective assessments.
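The two-scale decomposition described above can be sketched as follows. This is a minimal illustration that substitutes a plain box filter for the paper's truncated Huber penalty smoother (which additionally preserves edges and structure), so it shows only the layer-splitting logic:

```python
import numpy as np

def box_smooth(img, k=3):
    """Box-filter smoothing -- a simple stand-in for the truncated Huber
    penalty smoother used in the paper (edge preservation is omitted here)."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    h, w = img.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def decompose(img, k=3):
    """Split an image into an approximate (smoothed) layer and a residual layer.

    By construction the two layers sum back to the original image, which is
    what allows the final fused image to be reconstructed by simple addition.
    """
    approx = box_smooth(img, k)
    residual = img.astype(float) - approx
    return approx, residual
```

The fusion stage then merges the approximate layers and residual layers of the two modalities separately, and the fused image is `fused_approx + fused_residual`.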
As a powerful technique for merging complementary information from original images, infrared (IR) and visible image fusion approaches are widely used in surveillance, target detection, tracking, biological recognition, etc. In this paper, an efficient IR and visible image fusion method is proposed to simultaneously enhance the significant targets/regions in all source images and preserve the rich background details of visible images. A multi-scale representation based on the fast global smoother is first used to decompose the source images into base and detail layers, aiming to extract salient structure information and suppress halos around the edges. Then, a target-enhanced parallel Gaussian fuzzy logic-based fusion rule is proposed to merge the base layers, which avoids brightness loss and highlights significant targets/regions. In addition, a visual saliency map-based fusion rule is designed to merge the detail layers with the purpose of obtaining rich details. Finally, the fused image is reconstructed. Extensive experiments are conducted on 21 image pairs and a Nato-camp sequence (32 image pairs) to verify the effectiveness and superiority of the proposed method. Compared with several state-of-the-art methods, experimental results demonstrate that the proposed method achieves competitive or superior performance in both visual results and objective evaluation.
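A saliency-map-based rule for merging detail layers can be sketched as a per-pixel weighted blend. This is an illustrative version only: the saliency measure below (absolute deviation from the mean intensity) and the weighting scheme are assumptions, not the paper's exact formulation:

```python
import numpy as np

def saliency(layer):
    """Illustrative visual saliency: absolute deviation from the mean intensity."""
    return np.abs(layer - layer.mean())

def fuse_details(det_ir, det_vis):
    """Merge IR and visible detail layers, weighting each pixel by how
    salient it is in each modality (a sketch of a visual saliency
    map-based fusion rule; the paper's formulation may differ)."""
    s_ir, s_vis = saliency(det_ir), saliency(det_vis)
    # Per-pixel weight given to the IR details; epsilon avoids division by zero
    w = s_ir / (s_ir + s_vis + 1e-12)
    return w * det_ir + (1.0 - w) * det_vis
```

Pixels that stand out strongly in one modality dominate the fused detail layer, which is how fine visible-band texture survives alongside IR hot spots.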
Passive millimeter wave imaging has been employed in security inspection owing to its good penetration of clothing and its harmlessness. However, passive millimeter wave images (PMMWIs) suffer from low resolution and inherent noise. Published methods have rarely improved PMMWI quality, and they perform detection only on PMMWIs with bounding boxes, which causes a high false-alarm rate. Moreover, it is difficult to identify low-reflectivity non-metallic threats from grayscale differences alone. In this paper, a method for detecting threats concealed on the human body is proposed. We introduce a GAN architecture to reconstruct high-quality images from multi-source PMMWIs. Meanwhile, we develop a novel detection pipeline involving semantic segmentation, image registration, and a comprehensive analyzer. The segmentation network exploits multi-scale features to merge local and global information in both PMMWIs and visible images to obtain precise shape and location information, and the registration network is proposed to address privacy concerns and eliminate false alarms. With grayscale and contour features, metallic and non-metallic threats can be detected, respectively. After that, a synthetic strategy is applied to integrate the detection results of each single frame. In the numerical experiments, we evaluate the effectiveness of each module and the performance of the proposed method. Experimental results demonstrate that the proposed method outperforms existing methods with 92.35% precision and 90.3% recall on our dataset, and it also achieves a fast detection speed.
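The final synthesis step, which integrates per-frame detection results, can be sketched as a voting rule: a threat is kept only if it is detected in enough frames, suppressing spurious single-frame false alarms. The abstract does not specify the actual rule, so the minimum-vote scheme below is an assumption:

```python
from collections import Counter

def synthesize(frame_detections, min_votes=2):
    """Integrate detections across frames of one inspection.

    `frame_detections` is a list of per-frame lists of threat labels.
    A label is kept only if at least `min_votes` distinct frames report
    it -- an illustrative synthetic strategy, not the paper's exact rule.
    """
    votes = Counter(label for frame in frame_detections for label in set(frame))
    return sorted(label for label, n in votes.items() if n >= min_votes)
```

Requiring agreement across frames trades a small amount of recall for a large reduction in false alarms, consistent with the pipeline's emphasis on precision.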
This study investigates the semantic segmentation of common concrete defects using different imaging modalities. A pre-trained Convolutional Neural Network (CNN) model was trained via transfer learning and tested to detect concrete defect indications such as cracks, spalling, and internal voids. The model’s performance was compared across datasets of visible, thermal, and fused images. The data were collected from four different concrete structures and captured using four infrared cameras with different sensitivities and resolutions, with imaging campaigns conducted during the autumn, summer, and winter periods. Although specific defects can be detected in monomodal images, the results demonstrate that a larger number of defect classes can be accurately detected using multimodal fused images with the same viewpoint and resolution as the single-sensor images.
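The fused images above combine co-registered visible and thermal channels. A minimal sketch of one way to produce such a fused input is a pixel-wise weighted blend; the actual fusion method used in the study is not detailed in the abstract, so the blend below is an assumption:

```python
import numpy as np

def fuse_modalities(visible, thermal, alpha=0.5):
    """Pixel-wise weighted blend of co-registered visible and thermal images.

    An illustrative fusion only; `alpha` controls the contribution of the
    visible channel. Both inputs must already share the same viewpoint and
    resolution, as required for the fused dataset described above.
    """
    if visible.shape != thermal.shape:
        raise ValueError("images must be co-registered to the same resolution")
    return alpha * visible.astype(float) + (1.0 - alpha) * thermal.astype(float)
```

The fused array can then be fed to the segmentation CNN in place of a single-sensor image, letting the network see both surface texture (visible) and subsurface thermal contrast in one input.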