Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison

2021 ◽  
Vol 11 (13) ◽  
pp. 5863
Author(s):  
Paweł Piwowarski ◽  
Włodzimierz Kasprzak

We consider the problem of image set comparison, i.e., determining whether two image sets show the same unique object (approximately) from the same viewpoints. We propose to solve it by a multi-stream fusion of several image recognition paths. Immediate applications of this method can be found in fraud detection, deduplication procedures, and visual search. The contribution of this paper is a novel distance measure for the similarity of image sets and the experimental evaluation of several streams on the considered problem of same-car image set recognition. To determine a similarity score of image sets (this score expresses the certainty level that both sets represent the same object visible from the same set of views), we adapted a measure commonly applied in blind signal separation (BSS) evaluation. This measure is independent of the number of images in a set and of the order of views in it. Separate streams were designed for object classification (where a class represents either a car type or a car model-and-view) and for object-to-object similarity evaluation (based on object features obtained either from a convolutional neural network (CNN) or from image keypoint descriptors). A late fusion by a fully-connected neural network (NN) completes the solution. The implementation has a modular structure: for semantic segmentation we use a Mask-RCNN (Mask regions with CNN features) with ResNet 101 as a backbone network; image feature extraction is based either on the DeepRanking neural network or on classic keypoint descriptors (e.g., the scale-invariant feature transform (SIFT)); and object classification is performed by two Inception V3 deep networks trained for car type-and-view and car model-and-view classification (4 views, 9 car types, and 197 car models are considered).
Experiments conducted on the Stanford Cars dataset led to the selection of the best system configuration, which outperforms the baseline approach and achieves a 67.7% genuine acceptance rate (GAR) at a 3% false acceptance rate (FAR).
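
The BSS-derived set-similarity measure itself is not spelled out in the abstract, but its two stated properties (independence of the number of images and of the order of views) can be illustrated with a simple symmetric best-match score over per-image feature vectors. The function name and the cosine-similarity choice below are assumptions for illustration, not the paper's exact measure.

```python
import numpy as np

def set_similarity(feats_a, feats_b):
    """Similarity score between two image sets, independent of the
    number of images and their order (a simplified stand-in for the
    paper's BSS-derived measure). Rows are per-image feature vectors."""
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T  # pairwise cosine similarities
    # symmetric best-match average: each view is matched to its closest
    # counterpart in the other set, in both directions
    return 0.5 * (sim.max(axis=1).mean() + sim.max(axis=0).mean())
```

Because only row/column maxima are averaged, permuting the images in either set or changing the set sizes does not break the score.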

2011 ◽  
Vol 181-182 ◽  
pp. 37-42
Author(s):  
Xin Yu Li ◽  
Dong Yi Chen

Tracking and registration of the camera and object is one of the most important issues in Augmented Reality (AR) systems. Markerless visual tracking based on image features is used in many AR applications. Feature-point-based neural network image matching has attracted considerable attention in recent years. This paper proposes an approach to feature point correspondence in image sequences based on transient chaotic neural networks. Rotation- and scale-invariant features are first extracted from the images; a transient chaotic neural network then performs global feature matching and carries out the initialization phase of the tracking. Experimental results demonstrate the efficiency and effectiveness of the proposed method.
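
Global feature matching of the kind described above amounts to a one-to-one assignment between two descriptor sets. The paper optimizes this with a transient chaotic neural network; the sketch below solves the same assignment objective with the classical Hungarian algorithm as an illustrative stand-in (the function name is ours):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_features(desc_a, desc_b):
    """Global one-to-one correspondence between two descriptor sets,
    minimizing the total Euclidean matching cost. The paper solves this
    objective with a transient chaotic neural network; here the
    Hungarian algorithm stands in for illustration."""
    # cost[i, j] = distance between descriptor i of set A and j of set B
    cost = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows.tolist(), cols.tolist()))
```

A global solver avoids the greedy pitfall of locking in a nearest neighbour for one point that a later point needed more.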


2020 ◽  
Vol 64 (1) ◽  
pp. 10505-1-10505-16
Author(s):  
Yin Zhang ◽  
Xuehan Bai ◽  
Junhua Yan ◽  
Yongqi Xiao ◽  
C. R. Chatwin ◽  
...  

Abstract A new blind image quality assessment method, No-Reference Image Quality Assessment Based on Multi-Order Gradients Statistics, is proposed. It addresses two shortcomings of existing no-reference methods: they cannot determine the type of image distortion, and their quality evaluation is not robust across different distortion types. In this article, an 18-dimensional image feature vector is constructed from gradient magnitude features, relative gradient orientation features, and relative gradient magnitude features over two scales and three orders, on the basis of the relationship between multi-order gradient statistics and the type and degree of image distortion. The feature matrix and distortion types of known distorted images are used to train an AdaBoost_BP neural network that determines the image distortion type; the feature matrix and subjective scores of known distorted images are used to train an AdaBoost_BP neural network that determines the image distortion degree. A series of comparative experiments was carried out using the Laboratory of Image and Video Engineering (LIVE), LIVE Multiply Distorted Image Quality, Tampere Image, and Optics Remote Sensing Image databases. Experimental results show that the proposed method judges the distortion type with high accuracy and that its quality score shows good subjective consistency and robustness for all types of distortion. The performance of the proposed method is not restricted to a particular database, and the method has high operational efficiency.
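
A single-scale, first-order slice of the feature construction can be sketched as follows. The neighbourhood size, the particular statistics kept, and the function names are our assumptions for illustration, since the abstract does not give the exact formulas behind the 18-dimensional vector:

```python
import numpy as np

def gradient_features(img):
    """First-order gradient statistics of a grayscale image: gradient
    magnitude plus 'relative' magnitude/orientation, taken here as each
    pixel's value minus its local 3x3 mean (a simplified single-scale,
    first-order slice of the paper's 18-dimensional descriptor)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)            # gradient magnitude
    ori = np.arctan2(gy, gx)          # gradient orientation

    def box_mean(a):
        # local mean over a 3x3 neighbourhood with edge padding
        p = np.pad(a, 1, mode='edge')
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0

    rel_mag = mag - box_mean(mag)
    rel_ori = ori - box_mean(ori)
    # summary statistics of the kind fed to the AdaBoost_BP classifiers
    return np.array([mag.mean(), mag.std(),
                     rel_mag.mean(), rel_mag.std(),
                     rel_ori.mean(), rel_ori.std()])
```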


Author(s):  
Liang Kim Meng ◽  
Azira Khalil ◽  
Muhamad Hanif Ahmad Nizar ◽  
Maryam Kamarun Nisham ◽  
Belinda Pingguan-Murphy ◽  
...  

Background: Bone Age Assessment (BAA) is a clinical procedure that identifies a discrepancy between the biological and chronological age of an individual by assessing bone growth. Currently, there are two main methods of performing BAA, known as the Greulich-Pyle and Tanner-Whitehouse techniques. Both involve a manual, qualitative assessment of hand and wrist radiographs, resulting in intra- and inter-operator variability and a time-consuming process. Automatic segmentation can be applied to the radiographs, providing the physician with a more accurate delineation of the carpal bones and accurate quantitative analysis. Methods: In this study, we propose an image feature extraction technique based on image segmentation with a fully convolutional neural network with eight-pixel stride (FCN-8). A total of 290 radiographic images, including both female and male subjects aged 0 to 18, were manually segmented and used to train the FCN-8. Results and Conclusion: The results exhibit a high training accuracy of 99.68% and a loss of 0.008619 after 50 epochs of training. The experiments compared 58 images against gold-standard ground-truth images. The accuracy of our fully automated segmentation technique is 0.78 ± 0.06, 1.56 ± 0.30 mm, and 98.02% in terms of Dice coefficient, Hausdorff distance, and overall qualitative carpal recognition accuracy, respectively.
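
The Dice coefficient reported above measures the overlap between the automatic and manual masks; a minimal sketch of its computation for binary masks (the function name is ours) is:

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice similarity between a predicted and a ground-truth binary
    mask, as used to score segmentations: 2|A∩B| / (|A|+|B|)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * inter / denom if denom else 1.0  # two empty masks agree
```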


2021 ◽  
Vol 11 (15) ◽  
pp. 7148
Author(s):  
Bedada Endale ◽  
Abera Tullu ◽  
Hayoung Shi ◽  
Beom-Soo Kang

Unmanned aerial vehicles (UAVs) are widely utilized for various missions in both the civilian and military sectors. Many of these missions require UAVs to perceive and understand the environments they navigate, which can be realized by training a computing machine to classify objects in the environment. One well-known approach is supervised deep learning, which enables a machine to classify objects but comes at a substantial cost in time and computational resources. Collecting large input datasets, pre-training steps such as labeling training data, and the need for a high-performance training computer are some of the challenges that supervised deep learning poses. To address these setbacks, this study proposes mission-specific input data augmentation techniques and a light-weight deep neural network architecture capable of real-time object classification. Semi-direct visual odometry (SVO) data of augmented images are used to train the network for object classification. Ten classes with 10,000 distinct images per class were used as input data, of which 80% were used for training the network and the remaining 20% for validation. For the optimization of the designed deep neural network, a sequential gradient descent algorithm was implemented; it has the advantage of handling redundancy in the data more efficiently than other algorithms.
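
Sequential gradient descent updates the weights after every training example rather than after a full pass over the data, which is what lets it exploit redundancy among similar examples. A minimal least-squares sketch (the model, learning rate, and epoch count are illustrative assumptions, not the paper's network) is:

```python
import numpy as np

def sequential_gd(X, y, lr=0.1, epochs=50):
    """Sequential (per-sample) gradient descent on a least-squares
    objective: the weight vector is updated after every example, so
    redundant examples are absorbed without waiting for a full epoch."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = xi @ w - yi
            w -= lr * err * xi  # gradient step for this single sample
    return w
```

With a batch method, ten copies of the same example contribute one averaged gradient per epoch; sequentially, each copy moves the weights immediately.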


2011 ◽  
Vol 65 ◽  
pp. 497-502
Author(s):  
Yan Wei Wang ◽  
Hui Li Yu

A feature matching algorithm based on the wavelet transform and SIFT is proposed in this paper. First, a biorthogonal wavelet transform is used to decompose the medical image into layers and reconstruct the processed image. The SIFT (Scale-Invariant Feature Transform) is then applied to extract keypoints. Experimental results show that our algorithm compares favorably in compression ratio, matching speed, and image storage cost, especially under tilt and rotation conditions.
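
The decomposition step can be illustrated with a one-level 2-D wavelet transform. The sketch below uses Haar filters for brevity rather than the biorthogonal filters the paper employs, and assumes an even-sized grayscale image:

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D wavelet decomposition of an even-sized grayscale
    image (Haar filters shown for brevity; the paper uses biorthogonal
    filters). Returns the approximation band and three detail bands."""
    a = img.astype(float)
    # filter rows: average / difference of adjacent pixel pairs
    lo = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # filter columns of the row-filtered results
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0   # approximation
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0   # horizontal detail
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0   # vertical detail
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0   # diagonal detail
    return ll, (lh, hl, hh)
```

Keypoint extraction would then run on the approximation band `ll`, which holds the image content at half resolution.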


The rapid expansion and improvement of medical science and technology lead to the generation of ever more image data in routine clinical activity, such as computed tomography (CT), X-ray, and magnetic resonance imaging (MRI). To manage medical images properly for clinical decision making, content-based medical image retrieval (CBMIR) systems have emerged. In this paper, a Pulse-Coupled Neural Network (PCNN) based feature descriptor is proposed for the retrieval of biomedical images. The PCNN's firing time series is used as the image feature; it compactly encodes the image content and is the basis on which similar biomedical images are retrieved in our work. The physician can thus point out the disorder present in a patient report by retrieving the most similar report from related reference reports. The Open Access Series of Imaging Studies (OASIS) magnetic resonance imaging dataset is used for the evaluation of the proposed approach. Experimental results show that the retrieval performance of the proposed system is better than that of other existing systems.
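
A minimal PCNN sketch is shown below: per iteration, each neuron's internal activity is its pixel intensity boosted by its neighbours' previous pulses, and it fires when that activity exceeds a decaying dynamic threshold. The simplified unit-linking dynamics, the constants, and the choice of "number of firing neurons per iteration" as the time series are our assumptions about a typical PCNN signature, not the paper's exact descriptor:

```python
import numpy as np

def pcnn_signature(img, steps=20, beta=0.2, alpha=0.3, V=20.0):
    """Simplified pulse-coupled neural network: the per-iteration count
    of firing neurons forms a time-series signature of the image."""
    F = img.astype(float)                  # feeding input = intensity
    theta = np.full(img.shape, 255.0)      # dynamic threshold
    Y = np.zeros(img.shape)                # pulse output
    series = []
    for _ in range(steps):
        # linking input: sum of the four neighbours' previous pulses
        L = (np.roll(Y, 1, 0) + np.roll(Y, -1, 0) +
             np.roll(Y, 1, 1) + np.roll(Y, -1, 1))
        U = F * (1.0 + beta * L)           # internal activity
        Y = (U > theta).astype(float)      # neurons fire above threshold
        theta = theta * np.exp(-alpha) + V * Y  # decay, recharge on fire
        series.append(int(Y.sum()))
    return series
```

Images with similar structure pulse in similar waves, so the signatures can be compared directly, independent of where in the frame the structures lie.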


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Li Ma ◽  
Xueliang Guo ◽  
Shuke Zhao ◽  
Doudou Yin ◽  
Yiyi Fu ◽  
...  

The growth of strawberry is stressed by biotic and abiotic factors, which pose a great threat to the yield and quality of strawberry and give rise to various strawberry diseases. Traditional identification methods have a high misjudgment rate and poor real-time performance: they rely mainly on personal experience and naked-eye observation and, in an era of increasing demand for strawberry yield and quality, cannot meet the needs of strawberry disease identification and control. It is therefore necessary to find a more effective method to identify strawberry diseases efficiently and provide corresponding disease descriptions and control measures. In this paper, the recognition of common strawberry diseases is studied on the basis of deep convolutional neural network (DCNN) technology, and a new DCNN-based strawberry disease recognition algorithm is proposed: the network is first trained normally on strawberry image feature representations from different scenes; transfer learning is then applied to add strawberry disease image features to the training set; finally, the features are classified and recognized to achieve disease recognition. Moreover, an attention mechanism and a center loss function are introduced into the classical convolutional neural network to address the loss of information from key feature regions that degrades the classification performance of existing CNN classification methods, further improving the accuracy of the CNN in image classification.
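
The auxiliary loss introduced alongside the attention mechanism appears to be the standard center loss, which penalizes the distance between each deep feature and the centre of its class, pulling same-class features together. A minimal sketch (the function name and array shapes are ours) is:

```python
import numpy as np

def center_loss(features, labels, centers):
    """Center loss: half the mean squared distance between each deep
    feature vector and the learned centre of its class. Added to the
    classification loss, it makes same-class features cluster tightly.
    features: (n, d) array; labels: (n,) int array; centers: (k, d)."""
    diffs = features - centers[labels]  # feature minus its class centre
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))
```

During training the centres themselves are updated toward the running mean of their class's features; the sketch shows only the forward loss term.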


2017 ◽  
Author(s):  
Dat Duong ◽  
Wasi Uddin Ahmad ◽  
Eleazar Eskin ◽  
Kai-Wei Chang ◽  
Jingyi Jessica Li

Abstract The Gene Ontology (GO) database contains GO terms that describe biological functions of genes. Previous methods for comparing GO terms have relied on the fact that GO terms are organized into a tree structure. In this paradigm, the locations of two GO terms in the tree dictate their similarity score. In this paper, we introduce two new solutions for this problem, by focusing instead on the definitions of the GO terms. We apply neural network based techniques from the natural language processing (NLP) domain. The first method does not rely on the GO tree, whereas the second indirectly depends on the GO tree. In our first approach, we compare two GO definitions by treating them as two unordered sets of words. The word similarity is estimated by a word embedding model that maps words into an N-dimensional space. In our second approach, we account for the word-ordering within a sentence. We use a sentence encoder to embed GO definitions into vectors and estimate how likely one definition entails another. We validate our methods in two ways. In the first experiment, we test the model’s ability to differentiate a true protein-protein network from a randomly generated network. In the second experiment, we test the model in identifying orthologs from randomly-matched genes in human, mouse, and fly. In both experiments, a hybrid of NLP and GO-tree based method achieves the best classification accuracy.
Availability: github.com/datduong/NLPMethods2CompareGOterms
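
The first approach, comparing two definitions as unordered word sets under a word embedding, can be sketched with a symmetric best-match average of cosine similarities. The toy two-dimensional embeddings below stand in for a trained N-dimensional model, and the scoring rule is a typical choice assumed for illustration rather than the paper's exact formula:

```python
import numpy as np

def definition_similarity(def_a, def_b, emb):
    """Similarity of two GO-term definitions treated as unordered word
    sets: each word is matched to its most similar word in the other
    definition, and the best-match scores are averaged symmetrically.
    emb maps a word to its embedding vector; unknown words are skipped."""
    def vecs(text):
        v = np.array([emb[w] for w in text.split() if w in emb])
        return v / np.linalg.norm(v, axis=1, keepdims=True)
    a, b = vecs(def_a), vecs(def_b)
    sim = a @ b.T  # pairwise cosine similarities between word vectors
    return 0.5 * (sim.max(axis=1).mean() + sim.max(axis=0).mean())
```

Because only per-word best matches are averaged, the score ignores word order, which is exactly the property the first method assumes.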

