Environment Classification and Deinterleaving using Siamese Networks and Few-Shot Learning

Author(s): Cesar Martinez Melgoza, Tyler Groom, Henry Lin, Ameya Govalkar, Kayla Lee, ...

Author(s): Bjørn Magnus Mathisen, Kerstin Bach, Agnar Aamodt

Aquaculture as an industry is expanding quickly. As a result, new aquaculture sites are being established at more exposed locations that were previously deemed unfit because they are more difficult and resource-demanding to operate safely than traditional sites. To help the industry meet these challenges, we have developed a decision support system that helps decision makers establish better plans and make decisions that facilitate operating these sites optimally. We propose a case-based reasoning system called aquaculture case-based reasoning (AQCBR), which predicts the success of an aquaculture operation at a specific site based on previously applied and recorded cases. In particular, AQCBR is trained to learn a similarity function between recorded operational situations (cases) and uses the most similar case to provide explanation-by-example information for its predictions. The novelty of AQCBR is that it uses extended Siamese neural networks to learn the similarity between cases. Our extensive experimental evaluation shows that extended Siamese neural networks outperform state-of-the-art methods for similarity learning in this task, demonstrating the effectiveness and feasibility of our approach.
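
As a rough illustration of the core idea, the sketch below pairs a shared encoder with a contrastive loss to learn a similarity function over case feature vectors. The paper's actual "extended" Siamese architecture, case representation, and training details are not reproduced here; the network sizes, feature dimension, and loss choice are all assumptions.

```python
# Minimal sketch of a Siamese similarity learner for case retrieval.
# Encoder sizes, the 10-dimensional case features, and the contrastive
# loss are assumptions, not the AQCBR architecture itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseCaseNet(nn.Module):
    def __init__(self, n_features: int, embed_dim: int = 32):
        super().__init__()
        # One encoder shared by both cases in a pair (the Siamese part).
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, case_a, case_b):
        # Distance in embedding space serves as the learned dissimilarity.
        return F.pairwise_distance(self.encoder(case_a), self.encoder(case_b))

def contrastive_loss(dist, same_outcome, margin=1.0):
    # Pull same-outcome pairs together; push different-outcome pairs
    # at least `margin` apart.
    pos = same_outcome * dist.pow(2)
    neg = (1.0 - same_outcome) * F.relu(margin - dist).pow(2)
    return (pos + neg).mean()

# Toy usage with random case vectors.
net = SiameseCaseNet(n_features=10)
a, b = torch.randn(8, 10), torch.randn(8, 10)
same = torch.randint(0, 2, (8,)).float()
contrastive_loss(net(a, b), same).backward()
```

At retrieval time, the learned distance would rank stored cases against a new situation, and the nearest case would supply the explanation-by-example described in the abstract.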


Sensors, 2021, Vol. 21 (5), pp. 1573
Author(s): Loris Nanni, Giovanni Minchio, Sheryl Brahnam, Gianluca Maguolo, Alessandra Lumini

Traditionally, classifiers are trained to predict patterns within a feature space. The image classification system presented here trains classifiers to predict patterns within a vector space by combining the dissimilarity spaces generated by a large set of Siamese Neural Networks (SNNs). A set of centroids is calculated from the patterns in the training data sets with supervised k-means clustering. The centroids are used to generate the dissimilarity space via the Siamese networks. The vector space descriptors are extracted by projecting patterns onto the dissimilarity spaces, and SVMs classify an image by its dissimilarity vector. The versatility of the proposed approach in image classification is demonstrated by evaluating the system on different types of images across two domains: two medical data sets and two animal audio data sets with vocalizations represented as images (spectrograms). Results show that the proposed system competes with the best-performing methods in the literature, obtaining state-of-the-art performance on one of the medical data sets, and does so without ad hoc optimization of the clustering methods on the tested data sets.
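
The pipeline described above can be sketched as follows. Plain Euclidean distances stand in for the learned Siamese dissimilarities, and per-class k-means approximates the supervised clustering step; both substitutions, and all sizes, are assumptions for illustration rather than the authors' implementation.

```python
# Sketch of the dissimilarity-space pipeline. Euclidean distance stands in
# for the learned Siamese dissimilarity, and per-class k-means approximates
# the supervised clustering; both are assumptions for illustration.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances
from sklearn.svm import SVC

def class_centroids(X, y, k=3):
    # "Supervised" k-means: cluster each class separately, pool the centroids.
    centroids = [KMeans(n_clusters=k, n_init=10).fit(X[y == c]).cluster_centers_
                 for c in np.unique(y)]
    return np.vstack(centroids)

def dissimilarity_descriptors(X, centroids):
    # Project each pattern onto the dissimilarity space: one distance per
    # centroid (an SNN output in the paper, Euclidean here).
    return pairwise_distances(X, centroids)

# Toy data: 200 random 16-dimensional patterns with binary labels.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 16)), rng.integers(0, 2, size=200)
centroids = class_centroids(X, y)
clf = SVC().fit(dissimilarity_descriptors(X, centroids), y)
```

In the full system, each SNN would replace the Euclidean distance, yielding one dissimilarity space per network, and the resulting descriptor vectors would be combined before the SVM stage.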


Sensors, 2018, Vol. 18 (11), pp. 3669
Author(s): Rui Sun, Qiheng Huang, Miaomiao Xia, Jun Zhang

Video-based person re-identification is an important task that faces the challenges of lighting variation, low-resolution images, background clutter, occlusion, and human appearance similarity in multi-camera visual sensor networks. In this paper, we propose a video-based person re-identification method: an end-to-end learning architecture with hybrid deep appearance-temporal features. It can learn the appearance features of pivotal frames, the temporal features, and an independent distance metric for the different features. This architecture consists of a two-stream deep feature structure and two Siamese networks. For the first stream, we propose the Two-branch Appearance Feature (TAF) sub-structure to obtain the appearance information of persons, and we use one of the two Siamese networks to learn the similarity of the appearance features of a pairwise person. To utilize the temporal information, we design the second stream, consisting of the Optical flow Temporal Feature (OTF) sub-structure and another Siamese network, to learn a person's temporal features and the distances of pairwise features. In addition, we select the pivotal frames of the video as inputs to the Inception-V3 network in the Two-branch Appearance Feature sub-structure and employ a salience-learning fusion layer to fuse the learned global and local appearance features. Extensive experimental results on the PRID2011, iLIDS-VID, and Motion Analysis and Re-identification Set (MARS) datasets show that the proposed architecture reaches 79%, 59%, and 72% at Rank-1, respectively, and has advantages over state-of-the-art algorithms. It also improves the feature representation ability for persons.
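
A minimal sketch of the two-stream idea follows: one branch embeds appearance from a pivotal frame, the other embeds a precomputed optical-flow pair, and each branch is applied in Siamese fashion with its own distance. The tiny CNNs below stand in for the paper's Inception-V3/TAF and OTF sub-structures, and the salience-learning fusion layer is omitted; all names, shapes, and sizes here are assumptions.

```python
# Two-stream sketch: an appearance branch for pivotal RGB frames and a
# temporal branch for (dx, dy) optical-flow pairs, each applied in Siamese
# fashion with its own distance. The tiny CNNs replace Inception-V3/TAF and
# OTF for brevity; all shapes and sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def small_cnn(in_channels):
    return nn.Sequential(
        nn.Conv2d(in_channels, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 64),
    )

class TwoStreamReID(nn.Module):
    def __init__(self):
        super().__init__()
        self.appearance = small_cnn(3)  # RGB pivotal frame
        self.temporal = small_cnn(2)    # horizontal/vertical optical flow

    def forward(self, frame_a, flow_a, frame_b, flow_b):
        # Shared weights within each stream; independent distance per stream.
        d_app = F.pairwise_distance(self.appearance(frame_a),
                                    self.appearance(frame_b))
        d_tmp = F.pairwise_distance(self.temporal(flow_a),
                                    self.temporal(flow_b))
        return d_app, d_tmp

# Toy usage with random 64x64 inputs for a pair of tracklets.
net = TwoStreamReID()
fa, fb = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)
oa, ob = torch.randn(4, 2, 64, 64), torch.randn(4, 2, 64, 64)
d_app, d_tmp = net(fa, oa, fb, ob)
```

Training would attach a metric-learning loss to each distance separately, so that the appearance and temporal metrics are learned independently, as the abstract describes.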


Author(s): Kareem Amin, George Lancaster, Stelios Kapetanakis, Klaus-Dieter Althoff, Andreas Dengel, ...
