Cross-Modal Semantic-Associative Labelling, Indexing and Retrieval of Multimodal Data

Author(s):  
Meng Zhu ◽  
Atta Badii

Digitalised multimedia information today is typically represented in different modalities and distributed through various channels. Making use of such a huge amount of data depends heavily on effective and efficient cross-modal labelling, indexing and retrieval of multimodal information. In this chapter, we focus on combining the primary and collateral modalities of the information resource in an intelligent and effective way in order to provide better multimodal information understanding, classification, labelling and retrieval. Image and text are the two modalities considered here. A novel framework for semantic-based, collaterally cued image labelling has been proposed and implemented, aiming to automatically assign linguistic keywords to regions of interest in an image. A visual vocabulary was constructed from manually labelled image segments. Euclidean distance and a Gaussian distribution were used to map the low-level region-based image features to the high-level visual concepts defined in the visual vocabulary. Both collateral content and context knowledge were extracted from the collateral textual modality to bias the mapping process. A semantic-based high-level image feature vector model was constructed from the labelling results, and image retrieval using this feature vector model appears to outperform both content-based and text-based approaches in its capability for combining the perceptual and conceptual similarity of the image content.
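The mapping from region features to visual concepts described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single prototype vector per concept, the shared `sigma`, and the multiplicative `prior` standing in for the collateral-text bias are all assumptions made for illustration.

```python
import math

def concept_scores(region, vocabulary, sigma=1.0):
    """Score each concept with a Gaussian of the Euclidean distance
    between the region features and the concept prototype (assumed form)."""
    scores = {}
    for concept, prototype in vocabulary.items():
        d = math.dist(region, prototype)  # Euclidean distance
        scores[concept] = math.exp(-(d * d) / (2.0 * sigma * sigma))
    total = sum(scores.values())
    if total == 0.0:
        # All scores underflowed; fall back to a uniform distribution.
        return {c: 1.0 / len(scores) for c in scores}
    return {c: s / total for c, s in scores.items()}

def label_region(region, vocabulary, prior=None, sigma=1.0):
    """Pick the best concept; `prior` is a hypothetical per-concept weight
    derived from collateral text that biases the mapping."""
    scores = concept_scores(region, vocabulary, sigma)
    if prior:
        scores = {c: s * prior.get(c, 1.0) for c, s in scores.items()}
    return max(scores, key=scores.get)

# Toy vocabulary with made-up 3-dimensional prototypes.
vocabulary = {"sky": [0.1, 0.2, 0.9], "grass": [0.2, 0.8, 0.1]}
region = [0.15, 0.25, 0.8]
print(label_region(region, vocabulary))
print(label_region(region, vocabulary, prior={"grass": 5.0}))
```

Without a prior the region is assigned to the nearest prototype; a sufficiently strong textual cue can override the purely visual decision, which is the effect the collateral bias is meant to have.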

Spam features represent the unique characteristics associated with spam, which are used to differentiate it from genuine messages. Each message m is processed by a feature extraction module that represents m as an n-dimensional feature vector x = (x1, x2, …, xn) containing n features. In the case of text-based spam filters, a feature can be a word, and a feature vector may be composed of various words extracted from spam. Each spam message is associated with one feature vector. Based on the characteristics discussed in the previous chapter, we will try to extract different features that capture those unique characteristics from image spam, in order to build robust spam detection algorithms. These features are broadly classified into high-level metadata features, low-level image features such as color features, grayscale features, texture-related features, and embedded-text-related features.
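For the text-based case mentioned above, a feature vector x = (x1, x2, …, xn) can be sketched as word counts over a fixed vocabulary. The tokenisation rule, the function name and the example vocabulary below are assumptions chosen for illustration, not part of the source.

```python
import re

def extract_features(message, vocabulary):
    """Return x = (x1, ..., xn), where xi is the number of times the
    i-th vocabulary word occurs in the message (assumed representation)."""
    tokens = re.findall(r"[a-z0-9']+", message.lower())
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return [counts.get(word, 0) for word in vocabulary]

# Hypothetical vocabulary of n = 3 feature words.
vocabulary = ["free", "winner", "meeting"]
x = extract_features("FREE prize! You are a winner, claim your free gift", vocabulary)
print(x)
```

Each message maps to exactly one vector of length n, so the vectors can be passed directly to any classifier that expects fixed-length input.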


Author(s):  
W. Krakow ◽  
D. A. Smith

The successful determination of the atomic structure of [110] tilt boundaries in Au stems from the investigation of microscope performance at intermediate accelerating voltages (200 and 400 kV) as well as a detailed understanding of how grain boundary image features depend on the variation of dynamical diffraction processes with specimen and beam orientations. This success is also facilitated by improving image quality through digital image processing techniques to the point where a structure image is obtained and each atom position is represented by a resolved image feature. Figure 1 shows an example of a low angle (∼10°) Σ = 129/[110] tilt boundary in a ∼250 Å Au film, taken under tilted beam brightfield imaging conditions, to illustrate the steps necessary to obtain the atomic structure configuration from the image. The original image of Fig. 1a shows the regular arrangement of strain-field images associated with the cores of ½ [10] primary dislocations, which are separated by ∼15 Å.


2016 ◽  
Vol 20 (2) ◽  
pp. 191-201 ◽  
Author(s):  
Wei Lu ◽  
Yan Cui ◽  
Jun Teng

To reduce the instrumentation cost of sensor-based strain and displacement monitoring, and to address the challenges of sensor installation in structural health monitoring, it is necessary to develop a machine vision-based monitoring method. For this method, the most important step is the accurate extraction of image features. In this article, an edge detection operator based on multi-scale structure elements and a compound mathematical morphological operator is proposed to provide improved image feature extraction. The proposed method not only achieves an improved filtering effect and anti-noise ability but also detects edges more accurately. Furthermore, the required image features (the vertex of a square calibration board and the centroid of a circular target) can be accurately extracted using the extracted image edge information. For validation, monitoring tests for the structural local mean strain and in-plane displacement were designed accordingly. Through analysis of the error between the measured and calculated values of the structural strain and displacement, the feasibility and effectiveness of the proposed edge detection operator are verified.
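The idea of edge detection with multi-scale structure elements can be sketched with a plain morphological gradient (dilation minus erosion) averaged over several square structuring elements. This is a simplified stand-in, not the compound operator proposed in the article: the square element shape, the radii and the equal-weight averaging are assumptions.

```python
def _morph(img, radius, pick):
    """Apply `pick` (max for dilation, min for erosion) over a square
    window of the given radius, clipped at the image border."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = pick(window)
    return out

def morphological_gradient(img, radius):
    """Dilation minus erosion: large where intensity changes locally."""
    dilated = _morph(img, radius, max)
    eroded = _morph(img, radius, min)
    return [[d - e for d, e in zip(drow, erow)]
            for drow, erow in zip(dilated, eroded)]

def multiscale_edges(img, radii=(1, 2)):
    """Average the gradients obtained at several scales (assumed weighting)."""
    grads = [morphological_gradient(img, r) for r in radii]
    h, w = len(img), len(img[0])
    return [[sum(g[y][x] for g in grads) / len(radii) for x in range(w)]
            for y in range(h)]

# Toy image: dark left half, bright right half (vertical step edge).
image = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
edges = multiscale_edges(image)
print(edges[0])
```

The response peaks at the two columns adjacent to the step and falls off with distance, which is the usual motivation for combining small elements (good localisation) with larger ones (better noise tolerance).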


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5312
Author(s):  
Yanni Zhang ◽  
Yiming Liu ◽  
Qiang Li ◽  
Jianzhong Wang ◽  
Miao Qi ◽  
...  

Recently, deep learning-based image deblurring and deraining have been well developed. However, most of these methods fail to distill the useful features. What is more, exploiting the detailed image features in a deep learning framework always requires a mass of parameters, which inevitably makes the network suffer from a high computational burden. We propose a lightweight fusion distillation network (LFDN) for image deblurring and deraining to solve the above problems. The proposed LFDN is designed as an encoder–decoder architecture. In the encoding stage, the image feature is reduced to various small-scale spaces for multi-scale information extraction and fusion without much information loss. Then, a feature distillation normalization block is designed at the beginning of the decoding stage, which enables the network to distill and screen valuable channel information of feature maps continuously. Besides, an information fusion strategy between distillation modules and feature channels is also carried out by the attention mechanism. By fusing different information in the proposed approach, our network can achieve state-of-the-art image deblurring and deraining results with a smaller number of parameters and outperform the existing methods in model complexity.


Open Physics ◽  
2018 ◽  
Vol 16 (1) ◽  
pp. 1033-1045
Author(s):  
Guodong Zhou ◽  
Huailiang Zhang ◽  
Raquel Martínez Lucas

The SURF operator describes local image features well but is weak at describing global features. To address this shortcoming, a compressed sensing image restoration algorithm based on an improved SURF operator is proposed. The SURF feature vector set of the image is extracted, the vector set is reduced to a single high-dimensional feature vector by a histogram algorithm, and then the HSV color histogram of the image is extracted. The MSA image decomposition algorithm is used to obtain a sparse representation of the image feature vectors. The total variation curvature diffusion method and the Bayesian weighting method perform image restoration for the data smoothing feature and the local similarity feature of the texture part, respectively. A compressed sensing image restoration model is obtained by using the Schatten-p norm, and image color supplement is performed on the model. The compressed sensing image is solved iteratively by an alternating optimization method, and the image is restored. The experimental results show that the proposed algorithm has good restoration performance, and that the restored image has finer edge and texture structure and a better visual effect.
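The HSV color histogram step mentioned above can be sketched as follows. The bin counts, the flat bin indexing and the normalisation are assumptions; the abstract does not specify how the histogram is quantised.

```python
import colorsys

def hsv_histogram(pixels, bins=(8, 4, 4)):
    """Build a normalised HSV histogram from 8-bit RGB pixels.
    `bins` gives the number of hue, saturation and value bins (assumed)."""
    hb, sb, vb = bins
    hist = [0.0] * (hb * sb * vb)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a component equal to 1.0 falls in the last bin.
        hi = min(int(h * hb), hb - 1)
        si = min(int(s * sb), sb - 1)
        vi = min(int(v * vb), vb - 1)
        hist[(hi * sb + si) * vb + vi] += 1.0
    n = len(pixels)
    return [count / n for count in hist] if n else hist

# Toy input: two red pixels, one blue pixel, one black pixel.
pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 0)]
hist = hsv_histogram(pixels)
print(len(hist), sum(hist))
```

Because the histogram discards pixel positions, it gives a global description of the image that complements the local SURF descriptors.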


Sensors ◽  
2019 ◽  
Vol 19 (2) ◽  
pp. 291 ◽  
Author(s):  
Hamdi Sahloul ◽  
Shouhei Shirafuji ◽  
Jun Ota

Local image features are invariant to in-plane rotations and robust to minor viewpoint changes. However, the current detectors and descriptors for local image features fail to accommodate out-of-plane rotations larger than 25°–30°. Invariance to such viewpoint changes is essential for numerous applications, including wide baseline matching, 6D pose estimation, and object reconstruction. In this study, we present a general embedding that wraps a detector/descriptor pair in order to increase viewpoint invariance by exploiting input depth maps. The proposed embedding locates smooth surfaces within the input RGB-D images and projects them into a viewpoint invariant representation, enabling the detection and description of more viewpoint invariant features. Our embedding can be utilized with different combinations of descriptor/detector pairs, according to the desired application. Using synthetic and real-world objects, we evaluated the viewpoint invariance of various detectors and descriptors, for both standalone and embedded approaches. While standalone local image features fail to accommodate average viewpoint changes beyond 33.3°, our proposed embedding boosted the viewpoint invariance to different levels, depending on the scene geometry. Objects with distinct surface discontinuities were on average invariant up to 52.8°, and the overall average for all evaluated datasets was 45.4°. Similarly, out of a total of 140 combinations involving 20 local image features and various objects with distinct surface discontinuities, only a single standalone local image feature exceeded the goal of 60° viewpoint difference in just two combinations, as compared with 19 different local image features succeeding in 73 combinations when wrapped in the proposed embedding. Furthermore, the proposed approach operates robustly in the presence of input depth noise, even that of low-cost commodity depth sensors, and well beyond.


Atmosphere ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 828
Author(s):  
Wai Lun Lo ◽  
Henry Shu Hung Chung ◽  
Hong Fu

Estimating meteorological visibility from image characteristics is a challenging problem in meteorological parameter estimation. Meteorological visibility can be used to indicate weather transparency, and this indicator is important for transport safety. This paper summarizes the outcomes of an experimental evaluation of a Particle Swarm Optimization (PSO) based transfer learning method for meteorological visibility estimation. It proposes a modified transfer learning approach for visibility estimation that uses PSO feature selection. Image data are collected at a fixed location with a fixed viewing angle. The database images go through a gray-averaging pre-processing step that provides information on static landmark objects for the automatic extraction of effective regions from the images. Effective regions are then extracted from the image database, and image features are extracted by the neural network. A subset of the image features is selected by PSO to obtain the image feature vector for each effective sub-region. The image feature vectors are then used to estimate the visibility of the images using multiple Support Vector Regression (SVR) models. Experimental results show that the proposed method achieves an accuracy of more than 90% for visibility estimation and is effective and robust.


Author(s):  
Bo Wang ◽  
Xiaoting Yu ◽  
Chengeng Huang ◽  
Qinghong Sheng ◽  
Yuanyuan Wang ◽  
...  

The excellent feature extraction ability of deep convolutional neural networks (DCNNs) has been demonstrated in many image processing tasks, in which image classification can achieve high accuracy with only raw input images. However, the specific image features that influence the classification results are not readily determinable, and what lies behind the predictions is unclear. This study proposes a method combining the Sobel and Canny operators and an Inception module for ship classification. The Sobel and Canny operators obtain enhanced edge features from the input images. A convolutional layer is replaced with the Inception module, which can automatically select the proper convolution kernel for ship objects in different image regions. The principle is that the high-level features abstracted by the DCNN, and the features obtained by multi-convolution concatenation in the Inception module, must ultimately derive from the edge information of the preprocessed input images. This indicates that the classification results are based on the input edge features, which indirectly interprets the classification results to some extent. Experimental results show that the combination of the edge features and the Inception module improves DCNN ship classification performance. The original model with the raw dataset has an average accuracy of 88.72%, while with enhanced edge features as input it achieves the best performance among all models, 90.54%. The model that replaces the fifth convolutional layer with the Inception module has the best performance of 89.50%. It performs close to VGG-16 on the raw dataset and is significantly better than other deep neural networks. The results validate the functionality and feasibility of the idea posited.
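The edge enhancement step can be illustrated with the Sobel operator alone. The sketch below computes the gradient magnitude with the standard 3×3 Sobel kernels; leaving border pixels at zero is an assumption, and the Canny stage and the way the study combines both operators are not reproduced here.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Return the Sobel gradient magnitude of a 2D grayscale image.
    Border pixels are left at 0.0 (assumed border handling)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out

# Toy image: vertical step edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
magnitude = sobel_magnitude(image)
print(magnitude[2])
```

The magnitude is non-zero only next to the step, so feeding such a map to a classifier makes explicit which edge structures are available to it.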


2011 ◽  
Vol 2011 ◽  
pp. 1-14 ◽  
Author(s):  
Jinjun Li ◽  
Hong Zhao ◽  
Chengying Shi ◽  
Xiang Zhou

A stereo similarity function based on local multi-model monogenic image feature descriptors (LMFD) is proposed to match interest points and estimate the disparity map for stereo images. Local multi-model monogenic image features include the local orientation and instantaneous phase of the gray monogenic signal, the local color phase of the color monogenic signal, and local mean colors in the multiscale color monogenic signal framework. The gray monogenic signal, which is the extension of the analytic signal to gray-level images using the Dirac operator and the Laplace equation, consists of the local amplitude, local orientation, and instantaneous phase of the 2D image signal. The color monogenic signal is the extension of the monogenic signal to color images based on Clifford algebras. The local color phase can be estimated by computing the geometric product between the color monogenic signal and a unit reference vector in RGB color space. Experimental results on synthetic and natural stereo images show the performance of the proposed approach.

