Fusing Feature Distribution Entropy with R-MAC Features in Image Retrieval

Entropy ◽  
2019 ◽  
Vol 21 (11) ◽  
pp. 1037 ◽  
Author(s):  
Pingping Liu ◽  
Guixia Gou ◽  
Huili Guo ◽  
Danyang Zhang ◽  
Hongwei Zhao ◽  
...  

Image retrieval based on convolutional neural networks (CNNs) has attracted great attention among researchers because of its high performance. Pooling methods have become a research hotspot in image retrieval in recent years. In this paper, we propose the feature distribution entropy (FDE) to measure the difference of regional distribution information in the feature maps from CNNs. We propose a novel pooling method, which fuses our proposed FDE with region maximum activations of convolutions (R-MAC) features to improve the performance of image retrieval, as it takes advantage of regional distribution information in the feature maps. Compared with the descriptors computed by R-MAC pooling, our proposed method considers not only the most significant feature values of each region in the feature map, but also the distribution difference between regions. We use the histogram of feature values to calculate the regional distribution entropy and concatenate the regional distribution entropies into the FDE, which is further normalized and fused with R-MAC feature vectors by weighted summation to generate the final feature descriptors. We have conducted experiments on public datasets, and the results demonstrate that our proposed method can produce better retrieval performance than existing state-of-the-art algorithms. Further, higher performance can be achieved by applying post-processing to the improved feature descriptors.
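The histogram-entropy and weighted-fusion steps described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the bin count, the value range, the fusion weight `alpha`, the form `rmac + alpha * fde`, and the assumption that both vectors already have the same length are all hypothetical choices made for the example.

```python
import math

def region_entropy(values, bins=8, lo=0.0, hi=1.0):
    """Shannon entropy (in nats) of a histogram of one region's feature values.

    bins, lo and hi are assumed parameters, not values from the paper.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp values on or outside the range
        counts[idx] += 1
    total = len(values)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse_descriptors(rmac, fde, alpha=0.5):
    """Weighted sum of two normalized, equal-length vectors, then renormalized.

    The weighting scheme here is an assumption made for illustration.
    """
    a, b = l2_normalize(rmac), l2_normalize(fde)
    return l2_normalize([x + alpha * y for x, y in zip(a, b)])

# Toy usage: entropies of two regions concatenated into an FDE-like vector.
regions = [[0.1, 0.2, 0.9, 0.8], [0.5, 0.5, 0.5, 0.5]]
fde_vector = [region_entropy(r) for r in regions]
descriptor = fuse_descriptors([3.0, 4.0], fde_vector)
```

A region whose values all fall into one bin has entropy 0, while a region spread evenly over the bins has the maximum entropy, which is the kind of distribution difference the abstract refers to.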

Author(s):  
Yang Yi ◽  
Feng Ni ◽  
Yuexin Ma ◽  
Xinge Zhu ◽  
Yuankai Qi ◽  
...  

State-of-the-art hand gesture recognition methods have investigated spatiotemporal features based on 3D convolutional neural networks (3DCNNs) or convolutional long short-term memory (ConvLSTM). However, they often suffer from inefficiency due to the high computational complexity of their network structures. In this paper, we focus instead on 1D convolutional neural networks and propose a simple and efficient architectural unit, the Multi-Kernel Temporal Block (MKTB), that models multi-scale temporal responses by explicitly applying different temporal kernels. Then, we present a Global Refinement Block (GRB), an attention module that shapes the global temporal features based on cross-channel similarity. By incorporating the MKTB and GRB, our architecture can effectively explore spatiotemporal features at a tolerable computational cost. Extensive experiments conducted on public datasets demonstrate that our proposed model achieves state-of-the-art performance with higher efficiency. Moreover, the proposed MKTB and GRB are plug-and-play modules, and experiments on other tasks, such as video understanding and video-based person re-identification, also demonstrate their efficiency and generalization capability.


Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4869
Author(s):  
Shenggui Ling ◽  
Ye Lin ◽  
Keren Fu ◽  
Di You ◽  
Peng Cheng

In recent years, illumination processing of facial images based on generative adversarial networks (GANs) has made favorable achievements. However, some GAN-based illumination-processing methods pay attention only to image quality and neglect recognition accuracy, whereas others crop only a partial face area and ignore the challenge of synthesizing a photographic face, background, and hair when the original face image is under extreme illumination (extreme illumination means that texture and structure information cannot be seen clearly and most pixel values tend toward 0 or 255). Moreover, recognition accuracy is low when faces are under extreme illumination conditions. For these reasons, we present an elaborately designed architecture based on a convolutional neural network and GANs for processing the illumination of facial images. We use ResBlock at the down-sampling stage in our encoder and adopt skip connections in our generator. This special design, together with our loss, can enhance the ability to preserve identity and generate high-quality images. Moreover, we use different convolutional layers of a pre-trained feature network to extract feature maps of various sizes, and then use these feature maps to compute a loss, which we name the multi-stage feature maps (MSFM) loss. To evaluate our method fairly against state-of-the-art models, we use four metrics to estimate the performance of illumination-processing algorithms. A variety of experimental data indicate that our method is superior to previous models under various illumination challenges. We conduct qualitative and quantitative experiments on two datasets, and the experimental data indicate that our scheme clearly surpasses the state-of-the-art algorithms in image quality and identification accuracy.


2020 ◽  
Vol 17 (3) ◽  
pp. 172988142092566
Author(s):  
Dahan Wang ◽  
Sheng Luo ◽  
Li Zhao ◽  
Xiaoming Pan ◽  
Muchou Wang ◽  
...  

Fire is a fierce disaster, and smoke is an early signal of fire. Because features of smoke such as chrominance, texture, and shape are distinctive, many methods based on these features have been developed. However, these static characteristics vary widely, so there are exceptions that lead to low detection accuracy. On the other hand, the motion of smoke is much more discriminating than the aforementioned features, so a time-domain neural network is proposed to extract its dynamic characteristics. This smoke recognition network has the following advantages: (1) it extracts spatiotemporal features with 3D filters that work on dynamic and static characteristics simultaneously; (2) high accuracy, with 87.31% of samples classified correctly, which is the state of the art even in chaotic environments, and objects that confuse other methods, such as haze, fog, and climbing cars, are clearly distinguished; (3) high sensitivity, with smoke detected on average at the 23rd frame, which is also the state of the art and is meaningful for raising an early fire alarm as soon as possible; and (4) it is not based on any hypothesis, which keeps the method broadly compatible. Finally, a new metric, the difference between the first frame in which smoke is detected and the first frame in which smoke occurs, is proposed to compare the sensitivity of algorithms on videos. The experiments confirm that the dynamic characteristics are more discriminating than the aforementioned static characteristics, and that the smoke recognition network is a good tool for extracting compound features.
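The sensitivity metric described at the end of this abstract can be illustrated with a small sketch. The function names and the per-frame boolean input format are hypothetical; the abstract does not say how detections before the true onset are handled, so this sketch simply returns a negative value in that case.

```python
def first_true(flags):
    """Index of the first True entry, or None if there is none."""
    for index, flag in enumerate(flags):
        if flag:
            return index
    return None

def detection_delay(smoke_present, smoke_detected):
    """Frames between the first frame with smoke and the first detection.

    Both arguments are per-frame booleans (an assumed input format).
    Returns None if smoke never occurs or is never detected; a negative
    value means a detection fired before the smoke actually appeared.
    """
    onset = first_true(smoke_present)
    detected = first_true(smoke_detected)
    if onset is None or detected is None:
        return None
    return detected - onset

# Toy usage: smoke starts at frame 2 and is first detected at frame 4.
ground_truth = [False, False, True, True, True]
predictions = [False, False, False, False, True]
delay = detection_delay(ground_truth, predictions)
```

A smaller delay means the algorithm raises the alarm sooner after smoke first appears in the video.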


2008 ◽  
Vol 600-603 ◽  
pp. 895-900 ◽  
Author(s):  
Anant K. Agarwal ◽  
Albert A. Burk ◽  
Robert Callanan ◽  
Craig Capell ◽  
Mrinal K. Das ◽  
...  

In this paper, we review the state of the art of SiC switches and the technical issues which remain. Specifically, we will review the progress and remaining challenges associated with SiC power MOSFETs and BJTs. The most difficult issue when fabricating MOSFETs has been an excessive variation in threshold voltage from batch to batch. This difficulty arises due to the fact that the threshold voltage is determined by the difference between two large numbers, namely, a large fixed oxide charge and a large negative charge in the interface traps. There may also be some significant charge captured in the bulk traps in SiC and SiO2. The effect of recombination-induced stacking faults (SFs) on majority carrier mobility has been confirmed with 10 kV Merged PN Schottky (MPS) diodes and MOSFETs. The same SFs have been found to be responsible for degradation of BJTs.


Author(s):  
Wei Huang ◽  
Xiaoshu Zhou ◽  
Mingchao Dong ◽  
Huaiyu Xu

Robust and high-performance visual multi-object tracking is a big challenge in computer vision, especially in a drone scenario. In this paper, an online Multi-Object Tracking (MOT) approach for UAV systems is proposed to handle small-target detection and class imbalance challenges; it integrates the merits of a deep high-resolution representation network and a data association method in a unified framework. Specifically, while applying the tracking-by-detection architecture to our tracking framework, a Hierarchical Deep High-resolution network (HDHNet) is proposed, which encourages the model to handle different types and scales of targets and to extract more effective and comprehensive features during online learning. The extracted features are then fed into different prediction networks to recognize targets of interest. In addition, an adjustable fusion loss function is proposed by combining focal loss and GIoU loss to address the problems of class imbalance and hard samples. During the tracking process, the detection results are passed to an improved DeepSORT MOT algorithm in each frame, which makes full use of target appearance features for one-to-one matching. The experimental results on the VisDrone2019 MOT benchmark show that the proposed UAV MOT system achieves the highest accuracy and the best robustness compared with state-of-the-art methods.
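As a rough illustration of combining focal loss and GIoU loss, the sketch below uses the standard textbook definitions of both terms for a single prediction. The way the two terms are combined (a plain weighted sum with a hypothetical `weight` parameter), the default `alpha` and `gamma` values, and the `(x1, y1, x2, y2)` box format are assumptions; the abstract does not specify the authors' adjustable fusion.

```python
import math

def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (enclose - union) / enclose

def focal_loss(prob, label, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction; prob must lie strictly in (0, 1)."""
    p_t = prob if label == 1 else 1.0 - prob
    alpha_t = alpha if label == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def fusion_loss(prob, label, pred_box, true_box, weight=1.0):
    """Focal loss plus weighted GIoU loss (the weighting is an assumption)."""
    return focal_loss(prob, label) + weight * (1.0 - giou(pred_box, true_box))

# Toy usage: a confident positive prediction with a slightly shifted box.
loss = fusion_loss(0.9, 1, (0.0, 0.0, 2.0, 2.0), (0.5, 0.0, 2.5, 2.0))
```

The focal term down-weights easy examples through the `(1 - p_t) ** gamma` factor, while the GIoU term still provides a signal when boxes do not overlap, since GIoU becomes negative for disjoint boxes.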


Materials ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 2455
Author(s):  
Jiayuan He ◽  
Weizhen Chen ◽  
Boshan Zhang ◽  
Jiangjiang Yu ◽  
Hang Liu

Due to the sharp and corrosion-prone features of steel fibers, there is a demand for ultra-high-performance concrete (UHPC) reinforced with nonmetallic fibers. In this paper, glass fiber (GF) and high-performance polypropylene (HPP) fiber were selected to prepare UHPC, and the effects of the different fibers on the compressive, tensile, and bending properties of UHPC were investigated experimentally and numerically. Then, the damage evolution of UHPC was further studied numerically, adopting the concrete damaged plasticity (CDP) model. The difference between the simulation values and experimental values was within 5.0%, verifying the reliability of the numerical model. The results indicate that 2.0% fiber content in UHPC provides better mechanical properties. In addition, the glass fiber had the more significant strengthening effect. Compared with HPP-UHPC, the compressive, tensile, and flexural strength of GF-UHPC increased by about 20%, 30%, and 40%, respectively. However, the flexural toughness indexes I5, I10, and I20 of HPP-UHPC were about 1.2, 2.0, and 3.8 times those of GF-UHPC, respectively, showing that the toughening effect of the HPP fiber is better.


2021 ◽  
Vol 13 (11) ◽  
pp. 2171
Author(s):  
Yuhao Qing ◽  
Wenyi Liu ◽  
Liuyan Feng ◽  
Wanjia Gao

Despite significant progress in object detection tasks, remote sensing image target detection is still challenging owing to complex backgrounds, large differences in target sizes, and uneven distribution of rotating objects. In this study, we consider model accuracy, inference speed, and detection of objects at any angle. We propose a RepVGG-YOLO network that uses an improved RepVGG model as the backbone feature extraction network, which performs the initial feature extraction from the input image and balances network training accuracy and inference speed. We use an improved feature pyramid network (FPN) and path aggregation network (PANet) to reprocess the features output by the backbone network. The FPN and PANet modules integrate feature maps of different layers, combine context information on multiple scales, accumulate multiple features, and strengthen feature information extraction. Finally, to maximize the detection accuracy of objects of all sizes, we use four target detection scales at the network output to enhance feature extraction from small remote sensing target pixels. To solve the angle problem for arbitrarily oriented objects, we improve the loss function for classification using the circular smooth label technique, turning the angle regression problem into a classification problem and increasing the detection accuracy of objects at any angle. We conducted experiments on two public datasets, DOTA and HRSC2016. Our results show that the proposed method performs better than previous methods.
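The idea of turning angle regression into classification with a circular smooth label can be sketched as below. This is a generic illustration, not the configuration used in the paper: the number of angle bins, the Gaussian window, and the `radius` and `sigma` values are all assumed for the example.

```python
import math

def circular_smooth_label(angle_bin, num_bins=180, radius=6, sigma=2.0):
    """Soft classification target for one angle bin with circular wraparound.

    Bins within `radius` of the true bin (measured around the circle) get a
    Gaussian-shaped weight; all other bins get 0. All parameter values here
    are assumptions chosen for illustration.
    """
    label = []
    for i in range(num_bins):
        d = abs(i - angle_bin)
        d = min(d, num_bins - d)  # circular distance between bins
        if d <= radius:
            label.append(math.exp(-(d * d) / (2.0 * sigma ** 2)))
        else:
            label.append(0.0)
    return label

# Toy usage: the target for bin 0 also gives weight to bins 179, 178, ...
target = circular_smooth_label(0)
```

Because the distance is measured around the circle, the first and last bins are treated as neighbours, so a prediction that is off by one bin across the wraparound boundary is penalized no more than any other one-bin error.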

