Auxiliary Template-Enhanced Generative Compatibility Modeling

Author(s):  
Jinhuan Liu ◽  
Xuemeng Song ◽  
Zhaochun Ren ◽  
Liqiang Nie ◽  
Zhaopeng Tu ◽  
...  

In recent years, there has been growing interest in fashion analysis (e.g., clothing matching) due to the huge economic value of the fashion industry. The essential problem is to model the compatibility between complementary fashion items, such as the top and bottom in clothing matching. The majority of existing work on fashion analysis has focused on measuring item-item compatibility in a latent space with deep learning methods. In this work, we aim to improve compatibility modeling by sketching a compatible template for a given item as an auxiliary link between fashion items. Specifically, we propose an end-to-end Auxiliary Template-enhanced Generative Compatibility Modeling (AT-GCM) scheme, which introduces an auxiliary complementary template generation network equipped with pixel-wise consistency and compatible-template regularization. Extensive experiments on two real-world datasets demonstrate the superiority of the proposed approach.
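To make the idea concrete, below is a minimal sketch of the AT-GCM setup, assuming a simplified architecture: a generator sketches a compatible template for a given top, a pixel-wise consistency loss ties the template to a real bottom image, and a compatible-template regularizer ties the template's embedding to the bottom's embedding. All module names, dimensions, and loss forms here are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ATGCMSketch(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        # Generator: top embedding -> "compatible template" bottom image.
        self.generator = nn.Sequential(
            nn.Linear(dim, 512), nn.ReLU(),
            nn.Linear(512, 3 * 64 * 64), nn.Tanh())
        # Bilinear score for item-item compatibility in the latent space.
        self.score = nn.Bilinear(dim, dim, 1)

    def forward(self, z_top, z_bottom, bottom_img, z_template):
        template = self.generator(z_top).view(-1, 3, 64, 64)
        # Pixel-wise consistency: the generated template should stay close
        # to the ground-truth compatible bottom image.
        l_pixel = F.l1_loss(template, bottom_img)
        # Compatible-template regularization: the (re-encoded) template
        # embedding should agree with the compatible bottom's embedding.
        l_reg = F.mse_loss(z_template, z_bottom)
        return self.score(z_top, z_bottom), l_pixel + l_reg
```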

Author(s):  
Peng Hu ◽  
Rong Du ◽  
Yao Hu ◽  
Nan Li

Nowadays, item-item recommendation plays an important role in modern recommender systems. Traditionally, it is solved either by behavior-based collaborative filtering or by content-based methods. However, both kinds of methods often suffer from cold-start problems or poor performance due to sparse behavioral supervision, so hybrid methods that can leverage the strengths of both are needed. In this paper, we propose a semi-parametric embedding framework for this problem. Specifically, the embedding of an item is composed of two parts: a parametric part derived from content information and a non-parametric part designed to encode behavior information; meanwhile, a deep learning algorithm is proposed to learn the two parts simultaneously. Extensive experiments on real-world datasets demonstrate the effectiveness and robustness of the proposed method.
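A minimal sketch of such a semi-parametric embedding is given below, assuming the split described above: a shared parametric network computed from content features plus a free per-item (non-parametric) table fit to behavior data. The dimensions and the additive fusion are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SemiParametricEmbedding(nn.Module):
    def __init__(self, num_items, content_dim, emb_dim=64):
        super().__init__()
        # Parametric part: shared network mapping content -> embedding.
        self.content_net = nn.Sequential(
            nn.Linear(content_dim, 128), nn.ReLU(), nn.Linear(128, emb_dim))
        # Non-parametric part: one free vector per item, fit to behavior data.
        self.behavior_table = nn.Embedding(num_items, emb_dim)

    def forward(self, item_ids, content_feats):
        # Cold-start items can fall back on the content part alone.
        return self.content_net(content_feats) + self.behavior_table(item_ids)
```

Both parts would be trained jointly, e.g., against an item-item similarity objective derived from co-occurrence behavior.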


Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6661
Author(s):  
Lars Schmarje ◽  
Johannes Brünger ◽  
Monty Santarossa ◽  
Simon-Martin Schröder ◽  
Rainer Kiko ◽  
...  

Deep learning has been successfully applied to many classification problems, including underwater challenges. However, a long-standing issue with deep learning is the need for large, consistently labeled datasets. Although current approaches in semi-supervised learning can decrease the required amount of annotated data by a factor of 10 or more, this line of research still assumes distinct classes. For underwater classification, and for uncurated real-world datasets in general, clean class boundaries often cannot be drawn due to the limited information content of the images and the transitional stages of the depicted objects. Different experts therefore hold different opinions and produce fuzzy labels, which could also be considered ambiguous or divergent. We propose a novel framework for handling semi-supervised classification of such fuzzy labels. It is based on the idea of overclustering to detect substructures in these fuzzy labels. We propose a novel loss to improve the overclustering capability of our framework and show the benefit of overclustering for fuzzy labels. We show that our framework is superior to previous state-of-the-art semi-supervised methods when applied to real-world plankton data with fuzzy labels. Moreover, we obtain 5 to 10% more consistent predictions of substructures.
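As a rough illustration of the overclustering idea, the sketch below uses k output clusters for c ground-truth classes with k > c, and folds cluster probabilities back into class probabilities for the supervised part of the loss. The fixed cluster-to-class assignment is an illustrative assumption, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def overclustering_ce(logits, targets, cluster_to_class):
    """logits: (B, k) over k clusters; targets: (B,) class ids in [0, c);
    cluster_to_class: (k,) long tensor mapping each cluster to a class."""
    probs = F.softmax(logits, dim=1)
    c = int(cluster_to_class.max()) + 1
    # Sum the probabilities of all clusters assigned to the same class,
    # then apply cross-entropy on the folded class distribution.
    class_probs = torch.zeros(logits.size(0), c).index_add_(
        1, cluster_to_class, probs)
    return F.nll_loss(torch.log(class_probs + 1e-8), targets)
```

The surplus clusters are what allow the network to keep fuzzy or transitional examples in their own substructures instead of forcing them into a single class.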


Author(s):  
Antonios Alexos ◽  
Sotirios Chatzis

In this paper we address the problem of understanding why a deep learning model decides that an individual is or is not eligible for a loan. We propose a novel approach for inferring which attributes matter most to the decision in each specific individual case. Specifically, we leverage concepts from neural attention to devise a novel feature-wise attention mechanism. As we show on real-world datasets, our approach offers unique insights into the importance of the various features by producing a decision explanation for each specific loan case. At the same time, we observe that our mechanism generates decisions much closer to those of human experts than existing competitors.
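A minimal sketch of a feature-wise attention mechanism for tabular loan data is shown below, assuming one attention weight per input feature that both re-weights the input and serves as the per-case explanation; the gating form is an illustrative assumption, not the authors' exact formulation.

```python
import torch
import torch.nn as nn

class FeatureWiseAttention(nn.Module):
    def __init__(self, num_features, hidden=64):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(num_features, hidden), nn.Tanh(),
            nn.Linear(hidden, num_features))
        self.classifier = nn.Sequential(
            nn.Linear(num_features, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):
        # One normalized importance weight per feature, per applicant.
        weights = torch.softmax(self.attn(x), dim=1)
        logit = self.classifier(weights * x)
        # `weights` doubles as the decision explanation for this case.
        return torch.sigmoid(logit), weights
```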


2020 ◽  
Author(s):  
Mikel Joaristi

Unsupervised graph representation learning methods learn a numerical representation of the nodes in a graph. The generated representations encode meaningful information about the nodes' properties, making them a powerful tool for tasks in many areas of study, such as the social sciences, biology, or communication networks. These methods are particularly interesting because they facilitate the direct use of standard machine learning models on graphs. Graph representation learning methods can be divided into two main categories depending on the information they encode: methods preserving nodes' connectivity information and methods preserving nodes' structural information. Connectivity-based methods focus on encoding relationships between nodes, with neighboring nodes being closer together in the resulting latent space. Structure-based methods, on the other hand, generate a latent space where nodes serving a similar structural function in the network are encoded close to each other, regardless of whether they are connected or even near each other in the graph. While many works focus on preserving nodes' connectivity information, only a few study the problem of encoding nodes' structure, especially in an unsupervised way. In this dissertation, we demonstrate that properly encoding nodes' structural information is fundamental for many real-world applications, as it can be leveraged to successfully solve many tasks where connectivity-based methods fail. A concrete example is presented first: detecting malicious entities in a real-world financial network. We show that connectivity information is not enough to solve this problem and that leveraging structural information provides considerable performance improvements. This example pinpoints the need for further research on structural graph representation learning, together with the limitations of the previous state-of-the-art. We use the acquired knowledge as a starting point and inspiration for the research and development of three independent unsupervised structural graph representation learning methods: Structural Iterative Representation learning approach for Graph Nodes (SIR-GN), Structural Iterative Lexicographic Autoencoded Node Representation (SILA), and Sparse Structural Node Representation (SparseStruct). We show how each of our methods tackles specific limitations of the previous state-of-the-art in structural graph representation learning, such as scalability, representation meaning, and the lack of a formal proof guaranteeing the preservation of structural properties. We provide an extensive experimental section comparing our three proposed methods to the current state-of-the-art in both connectivity-based and structure-based representation learning. Finally, we look at extensions of the basic structural graph representation learning problem: we study temporal structural graph representation and provide a method for representation explainability.
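As a rough illustration of iterative structural representation in the spirit of SIR-GN, the sketch below refreshes each node's description by clustering all current descriptions into structural roles and counting each node's neighbors per role; the use of k-means, the degree-based initialization, and the iteration count are illustrative assumptions, not the dissertation's exact algorithms.

```python
import numpy as np
from sklearn.cluster import KMeans

def structural_embeddings(adj, k=8, iters=3, seed=0):
    """adj: list of neighbor-id lists; returns an (n, k) description matrix."""
    n = len(adj)
    reps = np.array([[len(nb)] for nb in adj], dtype=float)  # start from degree
    for _ in range(iters):
        # Cluster current descriptions into k structural roles.
        roles = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(reps)
        new = np.zeros((n, k))
        for v, neighbors in enumerate(adj):
            for u in neighbors:          # count neighbors per structural role
                new[v, roles[u]] += 1
        reps = new
    # Similar rows = similar structural function, regardless of connectivity.
    return reps
```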


Author(s):  
Hai-Feng Guo ◽  
Lixin Han ◽  
Shoubao Su ◽  
Zhou-Bao Sun

Multi-Instance Multi-Label learning (MIML) is a popular framework for supervised classification in which an example is described by multiple instances and associated with multiple labels. Previous MIML approaches have focused on predicting labels for instances, typically by identifying an equivalent problem in the traditional supervised learning framework. Motivated by recent advances in deep learning, in this paper we also consider the problem of predicting labels, and we attempt to bring deep learning into the MIML framework. The proposed approach enables us to train a deep convolutional neural network with images from social networks, where images are labeled, sometimes with several or even uncorrelated labels. Experiments on real-world datasets demonstrate the effectiveness of our proposed approach.
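A minimal sketch of one common way to realize deep MIML is given below, assuming instance-level scoring followed by max pooling into bag-level multi-label predictions; the backbone and pooling choice are illustrative assumptions rather than the paper's exact network.

```python
import torch
import torch.nn as nn

class DeepMIMLSketch(nn.Module):
    def __init__(self, inst_dim, num_labels):
        super().__init__()
        self.inst_scorer = nn.Sequential(
            nn.Linear(inst_dim, 128), nn.ReLU(), nn.Linear(128, num_labels))

    def forward(self, bag):                    # bag: (num_instances, inst_dim)
        inst_logits = self.inst_scorer(bag)    # per-instance label scores
        bag_logits, _ = inst_logits.max(dim=0) # a label holds if any instance fires
        return bag_logits

# Trained with nn.BCEWithLogitsLoss against the bag's multi-hot label vector.
```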


Author(s):  
Liang Hu ◽  
Songlei Jian ◽  
Longbing Cao ◽  
Zhiping Gu ◽  
Qingkui Chen ◽  
...  

Classic recommender systems face challenges in addressing the data sparsity and cold-start problems when modeling only the user-item relation. An essential direction is to incorporate and understand additional heterogeneous relations, e.g., user-user and item-item relations, since each user-item interaction is often influenced by other users and items, which form the user's/item's influential contexts. This induces important yet challenging issues, including modeling the heterogeneous relations, their interactions, and the strength of the influence exerted by the users/items in the influential contexts. To this end, we design Influential-Context Aggregation Units (ICAUs) to aggregate the user-user/item-item relations within a given context into influential context embeddings. Accordingly, we propose a Heterogeneous relations-Embedded Recommender System (HERS) based on ICAUs to model and interpret the underlying motivation of user-item interactions by considering user-user and item-item influences. Experiments on two real-world datasets show the greatly improved recommendation quality achieved by HERS and its superiority in handling the cold-start problem. In addition, we demonstrate the interpretability of modeling influential contexts in explaining the recommendation results.
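A minimal sketch of an influential-context aggregation unit is shown below, assuming attention-weighted pooling of a target's user-user (or item-item) neighbors into a single context embedding; the concatenation-based scoring function is an illustrative assumption.

```python
import torch
import torch.nn as nn

class ICAU(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.attn = nn.Linear(2 * dim, 1)  # scores the strength of each influence

    def forward(self, target, neighbors):
        # target: (dim,) user/item embedding; neighbors: (m, dim) related ones.
        pairs = torch.cat([target.expand_as(neighbors), neighbors], dim=1)
        alpha = torch.softmax(self.attn(pairs).squeeze(1), dim=0)
        # Influential-context embedding: influence-weighted neighbor average.
        return (alpha.unsqueeze(1) * neighbors).sum(dim=0)
```

In a full HERS-style model, the user's and item's influential-context embeddings would then be combined with their own embeddings when scoring a user-item interaction.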


Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 532
Author(s):  
Vedhus Hoskere ◽  
Yasutaka Narazaki ◽  
Billie F. Spencer

Manual visual inspection of civil infrastructure is high-risk, subjective, and time-consuming. The success of deep learning and the proliferation of low-cost consumer robots have spurred rapid growth in research on, and application of, autonomous inspections. The major components of autonomous inspection include data acquisition, data processing, and decision making, which are usually studied independently. However, for robust real-world applicability, these three aspects of the overall process need to be addressed concurrently with end-to-end testing, incorporating scenarios such as variations in structure type, color, damage level, camera distance, view angle, and lighting. Developing real-world datasets that span all these scenarios is nearly impossible. In this paper, we propose a framework for creating a virtual visual inspection testbed using 3D synthetic environments that enables end-to-end testing of autonomous inspection strategies. To populate the 3D synthetic environment with virtual damaged buildings, we propose using a non-linear finite element model to inform the realistic and automated visual rendering of different damage types, the damage state, and the material textures of what are termed herein physics-based graphics models (PBGMs). To demonstrate the benefits of the autonomous inspection testbed, three experiments are conducted with models of earthquake-damaged reinforced concrete buildings. First, we implement the proposed framework to generate a new large-scale annotated benchmark dataset for post-earthquake inspections of buildings, termed QuakeCity. Second, we demonstrate the improved performance of deep learning models trained using the QuakeCity dataset for inference on real data. Finally, a comparison of deep learning-based damage state estimation for different data acquisition strategies is carried out. The results demonstrate that PBGMs are an effective testbed for the development and validation of strategies for autonomous vision-based inspections of civil infrastructure.


2020 ◽  
Author(s):  
Aswani K ◽  
Menaka D

Abstract

Introduction: A brain tumor is a growth of abnormal cells inside the brain; these cells can develop into malignant or benign tumors. Segmentation of tumors from MRI images using image processing techniques began decades ago. Image-processing-based brain tumor segmentation can be divided into three categories: conventional image processing methods, machine learning methods, and deep learning methods. Conventional methods lack segmentation accuracy due to the complex spatial variation of tumors. Machine learning methods stand as a good alternative: methods like SVM, KNN, fuzzy approaches, or a combination of these provide good accuracy with reasonable processing speed, but the difficulty of handling the various feature extraction methods while maintaining accuracy to medical standards remains a limitation. In deep learning, features are extracted automatically in the various stages of the network and accuracy can be maintained to medical standards, but the huge database requirement and high computational time still pose a problem.

Method: To overcome the limitations above, we propose an unsupervised dual autoencoder with latent space optimization. The model requires only normal MRI images for its training, thus reducing the huge tumor database requirement. With a set of normal-class data, an autoencoder can reproduce the feature vector at its output layer; the trained autoencoder works well on normal data but fails to reproduce an anomaly. A classical autoencoder, however, suffers from poor latent space optimization. We reduce the latent space loss of the classical autoencoder using an auxiliary encoder together with feature optimization based on Singular Value Decomposition (SVD). The training patches are not traditional square patches: we take both horizontal and vertical patches to keep both local and global appearance features in the training set, and a separate autoencoder is trained for each of the horizontal and vertical patches. During training, a logistic sigmoid transfer function is used for both the encoder and decoder parts. The SGD optimizer is used with an initial learning rate of 0.001 for a maximum of 4000 epochs. The network is trained in MATLAB 2018a on a 3.7 GHz processor with an NVIDIA GPU and 16 GB of RAM.

Results: The results are obtained using patch sizes of 16x64 and 64x16 for horizontal and vertical patches, respectively. In glioma images the tumor does not grow from a single point but spreads randomly; region filling and connectivity operations are performed to get the final tumor segmentation. Overall, the method segments meningiomas better than gliomas. Three evaluation metrics are used to measure the performance of the proposed system: Dice Similarity Coefficient (DSC), Positive Predictive Value (PPV), and Sensitivity.

Conclusion: An unsupervised method for the segmentation of brain tumors from MRI images is proposed. The proposed dual autoencoder with SVD-based feature optimization reduces the latent space loss of the classical autoencoder. The method has advantages in computational efficiency, removes the need for a huge database, and achieves better accuracy than machine learning methods. It is compared with machine learning methods such as SVM and KNN and with supervised deep learning methods such as CNNs, and commendable results are obtained.
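To make the training setup concrete, below is a minimal sketch of the dual patch autoencoders described above, trained on normal-brain patches only; the layer sizes are illustrative assumptions, while the sigmoid activations, SGD settings, and 16x64/64x16 patch shapes follow the text. The auxiliary encoder and SVD-based latent optimization are omitted for brevity.

```python
import torch
import torch.nn as nn

def patch_autoencoder(in_dim, latent=64):
    return nn.Sequential(
        nn.Linear(in_dim, 256), nn.Sigmoid(),   # encoder
        nn.Linear(256, latent), nn.Sigmoid(),
        nn.Linear(latent, 256), nn.Sigmoid(),   # decoder
        nn.Linear(256, in_dim), nn.Sigmoid())

# One autoencoder per orientation: 16x64 horizontal and 64x16 vertical patches.
ae_h = patch_autoencoder(16 * 64)
ae_v = patch_autoencoder(64 * 16)
optimizer = torch.optim.SGD(
    list(ae_h.parameters()) + list(ae_v.parameters()), lr=0.001)
loss_fn = nn.MSELoss()  # reconstruction loss on flattened normal patches
# At test time, patches with high reconstruction error are flagged as
# anomalous (tumor), then refined with region filling and connectivity.
```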


Author(s):  
Zheng Liu ◽  
Yu Xing ◽  
Fangzhao Wu ◽  
Mingxiao An ◽  
Xing Xie

Deep learning techniques have been widely applied to modern recommendation systems, bringing flexible and effective ways of representing users. Conventionally, user representations are generated purely in the offline stage. Without reference to the specific candidate item for recommendation, it is difficult to fully capture user preference from the perspective of interest. More recent algorithms tend to generate the user representation at runtime, where the user's historical behaviors are attentively summarized w.r.t. the presented candidate item. In spite of the improved efficacy, this is too expensive for many real-world scenarios because of the repetitive access to the user's entire history. In this work, a novel user representation framework, Hi-Fi Ark, is proposed. With Hi-Fi Ark, user history is summarized into highly compact and complementary vectors in the offline stage, known as archives. Meanwhile, user preference towards a specific candidate item can be precisely captured via the attentive aggregation of such archives. As a result, Hi-Fi Ark achieves both deployment feasibility and superior recommendation efficacy. Its effectiveness is empirically validated on three real-world datasets, where remarkable and consistent improvements are made over a variety of well-recognized baseline methods.
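A minimal sketch of the archive idea is given below, assuming the offline archives are learned attention pools over the user's history and that online scoring attends over the archives only; the number of archives and the dot-product scoring are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HiFiArkSketch(nn.Module):
    def __init__(self, dim=64, num_archives=8):
        super().__init__()
        # Learnable pooling heads that compress history into archives.
        self.queries = nn.Parameter(torch.randn(num_archives, dim))

    def archive(self, history):                 # offline: history (n, dim)
        attn = torch.softmax(self.queries @ history.t(), dim=1)  # (k, n)
        return attn @ history                                    # (k, dim)

    def score(self, archives, candidate):       # online: never touches history
        w = torch.softmax(archives @ candidate, dim=0)           # (k,)
        user_vec = (w.unsqueeze(1) * archives).sum(dim=0)
        return user_vec @ candidate
```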

