fusion strategy
Recently Published Documents


TOTAL DOCUMENTS: 332 (FIVE YEARS: 145)
H-INDEX: 21 (FIVE YEARS: 7)

2022, Vol 40 (2), pp. 1-36
Author(s): Lei Zhu, Chaoqun Zheng, Xu Lu, Zhiyong Cheng, Liqiang Nie, ...

Multi-modal hashing supports efficient multimedia retrieval. However, existing methods still suffer from two problems: (1) Fixed multi-modal fusion: they combine the multi-modal features with fixed weights for hash learning, which cannot adaptively capture the variation of online streaming multimedia content. (2) The binary optimization challenge: to generate binary hash codes, existing methods adopt either two-step relaxed optimization, which causes significant quantization errors, or direct discrete optimization, which incurs considerable computation and storage costs. To address these problems, we first propose a Supervised Multi-modal Hashing with Online Query-adaption method. A self-weighted fusion strategy is designed to adaptively preserve the multi-modal features in the hash codes by exploiting their complementarity. In addition, the hash codes are efficiently learned under the supervision of pair-wise semantic labels to enhance their discriminative capability while avoiding the challenging symmetric similarity matrix factorization. Further, we propose an efficient Unsupervised Multi-modal Hashing with Online Query-adaption method with an adaptive multi-modal quantization strategy, in which the hash codes are learned directly without relying on specific objective formulations. Finally, in both methods, we design a parameter-free online hashing module to adaptively capture query variations at the online retrieval stage. Experiments validate the superiority of the proposed methods.
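
To make the self-weighted fusion idea concrete, the following is a minimal sketch, not the paper's formulation: modality weights are re-estimated from how well each modality's features reconstruct the current relaxed codes, so better-fitting modalities contribute more to the fused representation before binarization. The function names and the weighting rule are illustrative assumptions.

```python
import numpy as np

def self_weighted_fusion(modal_feats, code_len=64, n_iter=10):
    """Toy self-weighted multi-modal fusion for hash learning.

    modal_feats: list of (n_samples, d_m) feature matrices, one per modality.
    The weighting rule (inverse reconstruction residual) is an illustrative
    stand-in for a self-weighted strategy, not the paper's exact objective.
    """
    n = modal_feats[0].shape[0]
    rng = np.random.default_rng(0)
    codes = rng.standard_normal((n, code_len))             # relaxed real-valued codes
    weights = np.full(len(modal_feats), 1.0 / len(modal_feats))

    for _ in range(n_iter):
        projections, residuals = [], []
        for X in modal_feats:
            W, *_ = np.linalg.lstsq(X, codes, rcond=None)   # least-squares projection
            P = X @ W
            projections.append(P)
            residuals.append(np.linalg.norm(P - codes))
        # Modalities that reconstruct the codes better receive larger weights.
        inv = 1.0 / (np.asarray(residuals) + 1e-12)
        weights = inv / inv.sum()
        # Re-fuse the per-modality projections with the updated weights.
        codes = sum(w * P for w, P in zip(weights, projections))

    return np.sign(codes), weights   # sign() plays the role of binarization here
```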


2022, Vol 9
Author(s): Ting-Wen Chen, Da-Wei Pang, Jian-Xin Kang, Dong-Feng Zhang, Lin Guo

In this paper, we report the construction of network-like platinum (Pt) nanosheets based on Pt/reduced graphite oxide (Pt/rGO) hybrids by delicately utilizing a calorific-effect-induced fusion strategy. The tiny Pt species first catalyzed the H2-O2 combination reaction. The released heat triggered the combustion of the rGO substrate, assisted by the catalysis of the Pt species, which induced the fusion of the tiny Pt species into a network-like nanosheet structure. The loading amount and dispersity of Pt on rGO are found to be crucial for the successful construction of the network-like Pt nanosheets. The as-prepared products exhibit excellent catalytic hydrogenation activity and superior stability towards unsaturated bonds such as olefins and nitrobenzene; styrene can be completely converted into phenylethane within 60 min. The turnover frequency (TOF) of the network-like Pt nanosheets is as high as 158.14 h−1, which is three times higher than that of home-made Pt nanoparticles and among the highest values reported for support-free bimetallic catalysts under similar conditions. Furthermore, the good dispersibility and excellent aggregation resistance of the network-like structure endow the catalyst with excellent recyclability: hardly any decline in conversion could be identified after five recycling experiments.
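
For reference, the turnover frequency quoted above is conventionally defined as the moles of substrate converted per mole of accessible Pt sites per unit time; the paper's exact site-counting procedure is not reproduced here, so this is only the standard form:

```latex
\mathrm{TOF} = \frac{n_{\text{substrate converted}}}{n_{\text{accessible Pt sites}} \cdot t}
```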


2022, pp. 1-1
Author(s): Jiwen Zhou, Wendi Zhang, Yun Li, Xiaojian Wang, Li Zhang, ...

2021, Vol 13 (23), pp. 4928
Author(s): Yanming Chen, Xiaoqiang Liu, Yijia Xiao, Qiqi Zhao, Sida Wan

The heterogeneity of the urban landscape in the vertical direction should not be neglected in urban ecology research, which requires transforming urban land cover products from two dimensions to three dimensions using light detection and ranging (LiDAR) point clouds. Previous studies have demonstrated that the performance of two-dimensional land cover classification can be improved by fusing optical imagery and LiDAR data using several strategies. However, few studies have focused on fusing LiDAR point clouds and optical imagery for three-dimensional land cover classification, especially within a deep learning framework. In this study, we propose a novel prior-level fusion strategy and compare it with a no-fusion baseline and three other commonly used fusion strategies (point-level, feature-level, and decision-level). The proposed prior-level strategy uses two-dimensional land cover derived from optical imagery as prior knowledge for three-dimensional classification. Each LiDAR point is then linked to this prior information with the nearest-neighbor method and classified by a deep neural network. The prior-level strategy achieves higher overall accuracy (82.47%) on data from the International Society for Photogrammetry and Remote Sensing than the baseline (74.62%) and the point-level (79.86%), feature-level (76.22%), and decision-level (81.12%) strategies. The improved accuracy reflects two points: (1) fusing optical imagery with LiDAR point clouds improves the performance of three-dimensional urban land cover classification, and (2) the prior-level strategy directly uses the semantic information provided by the two-dimensional land cover classification rather than the original spectral information of the optical imagery. Furthermore, the proposed prior-level fusion strategy provides a bridge that fills the gap between two- and three-dimensional land cover classification.
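
As an illustration of the prior-level linking step described above, the sketch below attaches the nearest 2-D land-cover label to each LiDAR point as a one-hot prior feature. The function and parameter names are assumptions, and the deep network that consumes the augmented points is omitted.

```python
import numpy as np
from scipy.spatial import cKDTree

def attach_prior_labels(points_xyz, pixel_xy, pixel_labels, n_classes):
    """Link each LiDAR point to the 2-D land-cover label of its horizontally
    nearest pixel and return the points augmented with one-hot priors, ready
    to be concatenated with other point features for 3-D classification.

    points_xyz:   (n, 3) point coordinates in map coordinates
    pixel_xy:     (p, 2) pixel-centre coordinates of the 2-D land cover map
    pixel_labels: (p,)   integer class label per pixel
    """
    tree = cKDTree(pixel_xy)                        # spatial index over pixel centres
    _, idx = tree.query(points_xyz[:, :2], k=1)     # nearest pixel for each point
    prior = np.zeros((points_xyz.shape[0], n_classes), dtype=np.float32)
    prior[np.arange(points_xyz.shape[0]), pixel_labels[idx]] = 1.0
    return np.concatenate([points_xyz, prior], axis=1)
```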


2021, Vol 2132 (1), pp. 012004
Author(s): Hangyu Zhu, Maoting Gao

Abstract: Based on self-attention and outer-product-based neural collaborative filtering (ONCF), this paper proposes the SLAR model. The model uses the recent interaction records of each user in the group together with a self-attention mechanism to obtain the group's short-term interest vector. An attention mechanism and a self-attention mechanism are used to calculate the influence of each user, and the influence between members, during the interaction between the target group and an item, and these are aggregated into the group's long-term preference vector. The sum of the short-term interest and the long-term preference is then fed into the ONCF model as the group's embedding vector to mine the interaction between the group and the item from the data and finally complete the group recommendation. Compared with traditional group fusion strategies on the CAMR2011 dataset, the experimental results show that the proposed model achieves better results.
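
A rough numpy sketch of the aggregation described above (short-term interest from self-attention over recent interactions, long-term preference from attention over members, and an outer-product interaction map in the style of ONCF) follows; the exact architecture, dimensions, and scoring functions are assumptions, not the paper's design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def group_embedding(member_emb, recent_item_emb, target_item_emb):
    """Illustrative group aggregation.

    member_emb:      (n_members, d) long-term member embeddings
    recent_item_emb: (n_recent, d)  embeddings of the group's recent interactions
    target_item_emb: (d,)           embedding of the candidate item
    """
    # Short-term interest: scaled self-attention over the recent interactions.
    d = recent_item_emb.shape[1]
    attn = softmax(recent_item_emb @ recent_item_emb.T / np.sqrt(d))
    short_term = (attn @ recent_item_emb).mean(axis=0)

    # Long-term preference: attend over members conditioned on the target item.
    scores = softmax(member_emb @ target_item_emb)
    long_term = scores @ member_emb

    g = short_term + long_term                      # group embedding fed downstream
    interaction_map = np.outer(g, target_item_emb)  # ONCF-style outer-product map
    return g, interaction_map
```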


2021, Vol 16 (4)
Author(s): Bo Wang, Li Hu, Bowen Wei, Zitong Kang, Chongyi Li

2021
Author(s): Yiu-ming Cheung, Zhikai Hu

Unsupervised cross-modal retrieval has received increasing attention recently because of the extreme difficulty of labeling explosively growing multimedia data. Its core challenge is how to measure the similarities between multi-modal data without label information. In previous works, various distance metrics have been selected to measure the similarities and predict whether samples belong to the same class. However, these predictions are not always right, and even a few wrong predictions can undermine the final retrieval performance. To address this problem, in this paper we categorize predictions as solid or soft based on their confidence, and further categorize samples as solid or soft based on the predictions. We propose that these two kinds of predictions and samples should be treated differently. In addition, we find that the absolute values of the similarities represent not only the similarity but also the confidence of the predictions. We therefore first design an elegant dot-product fusion strategy to obtain effective inter-modal similarities. Using these similarities, we then propose a generalized and flexible weighted loss function in which larger weights are assigned to solid samples to increase retrieval performance, and smaller weights are assigned to soft samples to reduce the disturbance of wrong predictions. Although less information is used, empirical studies show that the proposed approach achieves state-of-the-art retrieval performance.
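
The confidence-weighting idea can be sketched as follows, where the absolute value of a fused similarity serves as the pair's weight; the fusion rule and the loss form here are illustrative assumptions, not the paper's exact equations.

```python
import numpy as np

def dot_product_fusion(sim_a, sim_b):
    """Illustrative fusion of two intra-modal similarity matrices by a
    (normalized) matrix dot product, yielding inter-modal similarity estimates."""
    return (sim_a @ sim_b) / sim_a.shape[0]

def confidence_weighted_loss(pred_sim, fused_sim):
    """Pairs with large |fused_sim| are treated as solid (high confidence,
    large weight); pairs near zero are soft and are down-weighted."""
    weights = np.abs(fused_sim)      # confidence doubles as the pair weight
    targets = np.sign(fused_sim)     # assumed same-class (+1) / different-class (-1)
    return float(np.mean(weights * (pred_sim - targets) ** 2))
```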


2021, Vol 2021, pp. 1-13
Author(s): Zhou Fang, Qilin Wu, Darong Huang, Dashuai Guan

The dark channel prior (DCP) has been widely used in single-image defogging because of its simple implementation and satisfactory performance. This paper addresses the shortcomings of DCP-based defogging algorithms and proposes an optimized method based on an adaptive fusion mechanism. The proposed method makes full use of the smoothing and "squeezing" characteristics of the logistic function to obtain more reasonable dark channels, avoiding the need to further refine the transmission map. In addition, maximum filtering is applied to the dark channels to improve their accuracy around object boundaries and the overall brightness of the defogged images. Meanwhile, the location and brightness information of the foggy image are weighted to obtain a more accurate estimate of the atmospheric light. Quantitative and qualitative comparisons show that the proposed method outperforms state-of-the-art image defogging algorithms.
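
A minimal sketch of the described pipeline is given below: per-pixel channel minimum, local minimum filter, logistic "squeezing", a maximum filter to correct boundaries, and the standard DCP transmission estimate. The filter sizes and logistic parameters are assumptions, not the paper's values.

```python
import numpy as np
from scipy.ndimage import minimum_filter, maximum_filter

def logistic(x, k=10.0, x0=0.5):
    """Smoothly 'squeeze' mid-range values toward 0 or 1."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

def refined_dark_channel(img, patch=15):
    """img: HxWx3 float image in [0, 1]. Classic dark channel, re-mapped with
    a logistic function and corrected near object boundaries by max filtering."""
    dark = minimum_filter(img.min(axis=2), size=patch)    # channel min + local min
    dark = logistic(dark)                                 # adaptive squeezing
    return maximum_filter(dark, size=max(patch // 2, 1))  # boundary correction

def transmission(dark, atmospheric_light, omega=0.95):
    """Standard DCP transmission estimate: t = 1 - omega * dark / A."""
    return 1.0 - omega * dark / max(float(atmospheric_light), 1e-6)
```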

