Evaluation of unsupervised optical flow methods for deep learning in real world datasets

2019
Author(s):
Diego Marez
Josh Harguess


Sensors
2021
Vol 21 (19)
pp. 6661
Author(s):
Lars Schmarje
Johannes Brünger
Monty Santarossa
Simon-Martin Schröder
Rainer Kiko
...  

Deep learning has been successfully applied to many classification problems, including underwater challenges. However, a long-standing issue with deep learning is the need for large and consistently labeled datasets. Although current approaches in semi-supervised learning can decrease the required amount of annotated data by a factor of 10 or even more, this line of research still assumes distinct classes. For underwater classification, and for uncurated real-world datasets in general, clean class boundaries often cannot be drawn due to the limited information content of the images and the transitional stages of the depicted objects. Different experts therefore form different opinions and produce fuzzy labels, which could also be considered ambiguous or divergent. We propose a novel framework for handling semi-supervised classification of such fuzzy labels. It is based on the idea of overclustering to detect substructures in these fuzzy labels. We propose a novel loss to improve the overclustering capability of our framework and show the benefit of overclustering for fuzzy labels. We show that our framework is superior to previous state-of-the-art semi-supervised methods when applied to real-world plankton data with fuzzy labels. Moreover, we obtain 5 to 10% more consistent predictions of substructures.
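To make the overclustering idea concrete, here is a minimal PyTorch sketch: a shared backbone feeds both a standard classification head and an overclustering head with more outputs than there are classes, so that a fuzzy class can split into finer substructures. The backbone, head sizes, and the simple consistency term standing in for the paper's novel loss are all illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

C, k = 10, 50  # ground-truth classes vs. overclustering outputs (k > C)
backbone = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
class_head = nn.Linear(128, C)   # supervised head for labeled data
over_head = nn.Linear(128, k)    # overclustering head for substructures

def loss_fn(x, y, x_aug):
    # Cross-entropy on the labeled head plus a consistency term that pushes
    # an image and its augmentation toward the same fine-grained cluster.
    h, h_aug = backbone(x), backbone(x_aug)
    supervised = F.cross_entropy(class_head(h), y)
    p = F.softmax(over_head(h), dim=-1)
    p_aug = F.softmax(over_head(h_aug), dim=-1)
    consistency = F.mse_loss(p, p_aug)  # stand-in for the paper's novel loss
    return supervised + consistency

x = torch.randn(8, 1, 28, 28)
y = torch.randint(0, C, (8,))
print(loss_fn(x, y, x + 0.05 * torch.randn_like(x)))
```

At inference time, the argmax of the overclustering head gives a substructure assignment that can then be mapped back to the coarse fuzzy classes.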


Author(s):  
Antonios Alexos
Sotirios Chatzis

In this paper, we address the problem of understanding why a deep learning model decides that an individual is or is not eligible for a loan. We propose a novel approach for inferring which attributes matter most to the decision in each specific individual case. Specifically, we leverage concepts from neural attention to devise a novel feature-wise attention mechanism. As we show using real-world datasets, our approach offers unique insights into the importance of the various features by producing a decision explanation for each specific loan case. At the same time, we observe that our novel mechanism generates decisions that are much closer to the decisions of human experts than those of existing competitors.
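A minimal sketch of what a feature-wise attention layer for tabular loan data could look like, assuming d numeric input features; the attention weights double as a per-applicant explanation. The layer sizes and the softmax-over-features choice are assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    def __init__(self, d, hidden=32):
        super().__init__()
        self.att = nn.Sequential(nn.Linear(d, hidden), nn.Tanh(),
                                 nn.Linear(hidden, d))
        self.clf = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, x):
        w = torch.softmax(self.att(x), dim=-1)  # one weight per input feature
        logit = self.clf(w * x)                 # classify reweighted features
        return logit, w                         # w explains the decision

model = FeatureAttention(d=8)
logit, weights = model(torch.randn(4, 8))
print(weights[0])  # per-feature importance for the first applicant
```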


Author(s):  
Jinhuan Liu
Xuemeng Song
Zhaochun Ren
Liqiang Nie
Zhaopeng Tu
...  

In recent years, there has been growing interest in fashion analysis (e.g., clothing matching) due to the huge economic value of the fashion industry. The essential problem is to model the compatibility between complementary fashion items, such as the top and the bottom in clothing matching. The majority of existing work on fashion analysis has focused on measuring item-item compatibility in a latent space with deep learning methods. In this work, we aim to improve compatibility modeling by sketching a compatible template for a given item as an auxiliary link between fashion items. Specifically, we propose an end-to-end Auxiliary Template-enhanced Generative Compatibility Modeling (AT-GCM) scheme, which introduces an auxiliary complementary template generation network equipped with pixel-wise consistency and compatible template regularization. Extensive experiments on two real-world datasets demonstrate the superiority of the proposed approach.
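The auxiliary-template idea can be sketched roughly as follows: a generator sketches a compatible counterpart (template) for a given top, and a bottom is scored both directly against the top and against that template, while an L1 pixel-wise term keeps the template consistent with the ground-truth bottom. The encoder, generator, image sizes, and the way the two scores are combined are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

emb = 64
enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, emb))  # shared encoder
gen = nn.Linear(emb, 3 * 32 * 32)                               # template generator

def compatibility(top, bottom):
    t, b = enc(top), enc(bottom)
    template = gen(t)                                 # sketched compatible bottom
    tmpl_emb = enc(template.view_as(bottom))
    score = (t * b).sum(-1) + (tmpl_emb * b).sum(-1)  # direct + template link
    pix_loss = F.l1_loss(template.view_as(bottom), bottom)  # pixel-wise consistency
    return score, pix_loss

top, bottom = torch.randn(2, 3, 32, 32), torch.randn(2, 3, 32, 32)
print(compatibility(top, bottom))
```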


Author(s):  
Hai-Feng Guo
Lixin Han
Shoubao Su
Zhou-Bao Sun

Multi-Instance Multi-Label learning (MIML) is a popular framework for supervised classification in which an example is described by multiple instances and associated with multiple labels. Previous MIML approaches have focused on predicting labels for instances, typically by identifying an equivalent problem in the traditional supervised learning framework. Motivated by recent advances in deep learning, in this paper we consider the same label-prediction problem but model it with deep learning within the MIML framework. The proposed approach enables us to train deep convolutional neural networks on images from social networks, where images come already labeled, sometimes with several labels or even with uncorrelated labels. Experiments on real-world datasets demonstrate the effectiveness of our proposed approach.
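As a rough sketch of how a deep MIML model can be wired, the snippet below scores every instance in a bag for every label with a small CNN and max-pools the instance scores into a bag-level multi-label prediction. The architecture and the max-pooling rule are assumptions for illustration rather than the authors' exact design.

```python
import torch
import torch.nn as nn

n_labels = 5
instance_net = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, n_labels))

def bag_forward(bag):                      # bag: (n_instances, 3, H, W)
    inst_logits = instance_net(bag)        # one score per instance and label
    return inst_logits.max(dim=0).values   # max-pool instances -> bag logits

bag = torch.randn(4, 3, 32, 32)            # a bag of 4 image instances
target = torch.tensor([1., 0., 1., 0., 0.])
loss = nn.functional.binary_cross_entropy_with_logits(bag_forward(bag), target)
print(loss)
```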


Sensors
2022
Vol 22 (2)
pp. 532
Author(s):  
Vedhus Hoskere
Yasutaka Narazaki
Billie F. Spencer

Manual visual inspection of civil infrastructure is high-risk, subjective, and time-consuming. The success of deep learning and the proliferation of low-cost consumer robots have spurred rapid growth in research and application of autonomous inspections. The major components of autonomous inspection include data acquisition, data processing, and decision making, which are usually studied independently. However, for robust real-world applicability, these three aspects of the overall process need to be addressed concurrently with end-to-end testing, incorporating scenarios such as variations in structure type, color, damage level, camera distance, view angle, lighting, etc. Developing real-world datasets that span all these scenarios is nearly impossible. In this paper, we propose a framework to create a virtual visual inspection testbed using 3D synthetic environments that can enable end-to-end testing of autonomous inspection strategies. To populate the 3D synthetic environment with virtual damaged buildings, we propose the use of a non-linear finite element model to inform the realistic and automated visual rendering of different damage types, the damage state, and the material textures of what are termed herein physics-based graphics models (PBGMs). To demonstrate the benefits of the autonomous inspection testbed, three experiments are conducted with models of earthquake-damaged reinforced concrete buildings. First, we implement the proposed framework to generate a new large-scale annotated benchmark dataset for post-earthquake inspections of buildings, termed QuakeCity. Second, we demonstrate the improved performance of deep learning models trained using the QuakeCity dataset for inference on real data. Finally, a comparison of deep learning-based damage state estimation for different data acquisition strategies is carried out. The results demonstrate the use of PBGMs as an effective testbed for the development and validation of strategies for autonomous vision-based inspections of civil infrastructure.


Author(s):  
Zheng Liu
Yu Xing
Fangzhao Wu
Mingxiao An
Xing Xie

Deep learning techniques have been widely applied to modern recommendation systems, bringing in flexible and effective ways of user representation. Conventionally, user representations are generated purely in the offline stage. Without reference to the specific candidate item for recommendation, it is difficult to fully capture user preference from the perspective of interest. More recent algorithms tend to generate the user representation at runtime, where the user's historical behaviors are attentively summarized w.r.t. the presented candidate item. Despite the improved efficacy, this is too expensive for many real-world scenarios because of the repetitive access to the user's entire history. In this work, a novel user representation framework, Hi-Fi Ark, is proposed. With Hi-Fi Ark, user history is summarized into highly compact and complementary vectors in the offline stage, known as archives. Meanwhile, user preference towards a specific candidate item can be precisely captured via the attentive aggregation of such archives. As a result, Hi-Fi Ark achieves both deployment feasibility and superior recommendation efficacy. The effectiveness of Hi-Fi Ark is empirically validated on three real-world datasets, where remarkable and consistent improvements are made over a variety of well-recognized baseline methods.
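The serving-time saving can be illustrated with a small sketch: the user's long history is compressed offline into a handful of archive vectors, and at runtime the candidate item attends only over those archives. The chunk-mean summarization below is a crude stand-in for the learned, complementary archives in the paper; the sizes and dot-product attention are assumptions.

```python
import torch
import torch.nn.functional as F

d, n_archives, history_len = 32, 4, 500
history = torch.randn(history_len, d)    # embeddings of the user's behaviors

# Offline: summarize the history into a few compact archive vectors.
# (Here: simple chunk means; the paper learns complementary archives.)
archives = torch.stack([c.mean(0) for c in history.chunk(n_archives)])

# Online: the candidate item attends over archives, not the raw history.
def user_vector(candidate):                  # candidate: (d,)
    att = F.softmax(archives @ candidate, dim=0)
    return att @ archives                    # (d,) user representation

candidate = torch.randn(d)
score = user_vector(candidate) @ candidate   # ranking score for this item
print(score)
```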


Author(s):  
Christian Lagemann
Michael Klaas
Wolfgang Schröder

Convolutional neural networks have been successfully used in a variety of tasks and have recently been adapted to improve processing steps in Particle-Image Velocimetry (PIV). Recurrent All-Pairs Field Transforms (RAFT) as an optical flow estimation backbone achieve a new state-of-the-art accuracy on public synthetic PIV datasets, generalize well to unknown real-world experimental data, and allow a significantly higher spatial resolution compared to state-of-the-art PIV algorithms based on cross-correlation methods. However, the huge diversity in dynamic flows and varying particle image conditions require PIV processing schemes to have high generalization capabilities to unseen flow and lighting conditions. If these conditions vary strongly from the synthetic training data, the performance of fully supervised learning-based PIV tools might degrade. To tackle these issues, our training procedure is augmented by an unsupervised learning paradigm, which removes the need for a general synthetic dataset and boosts the inference capability of a deep learning model in a way that is more relevant to challenging real-world experimental data. We therefore propose URAFT-PIV, an unsupervised deep neural network architecture for optical flow estimation in PIV applications, and show that our combination of state-of-the-art deep learning pipelines and unsupervised learning achieves a new state-of-the-art accuracy for unsupervised PIV networks while performing similarly to LiteFlowNet-based competitors trained with supervision. Furthermore, we show that URAFT-PIV also performs well under more challenging flow field and image conditions, such as low particle density and changing light conditions, and demonstrate its generalization capability in an out-of-the-box application to real-world experimental data. Our tests also suggest that current state-of-the-art loss functions might be a limiting factor for the performance of unsupervised optical flow estimation.
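For readers unfamiliar with unsupervised optical flow training, the core signal can be sketched as a photometric warping loss: the second particle image is warped backward with the predicted flow and compared against the first, with a smoothness penalty on the flow field. This generic loss is an assumption for illustration and not necessarily URAFT-PIV's exact formulation.

```python
import torch
import torch.nn.functional as F

def warp(img2, flow):
    # Backward-warp img2 (B,1,H,W) with flow (B,2,H,W) using grid_sample.
    B, _, H, W = img2.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack((xs, ys)).float()   # (2,H,W), channel 0 = x, 1 = y
    coords = grid.unsqueeze(0) + flow      # target sampling positions
    cx = 2 * coords[:, 0] / (W - 1) - 1    # normalize to [-1, 1]
    cy = 2 * coords[:, 1] / (H - 1) - 1
    return F.grid_sample(img2, torch.stack((cx, cy), dim=-1), align_corners=True)

def unsupervised_loss(img1, img2, flow, alpha=0.1):
    photometric = (img1 - warp(img2, flow)).abs().mean()
    smooth = (flow[..., 1:] - flow[..., :-1]).abs().mean() + \
             (flow[:, :, 1:] - flow[:, :, :-1]).abs().mean()
    return photometric + alpha * smooth

img1, img2 = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
flow = torch.zeros(1, 2, 64, 64, requires_grad=True)
print(unsupervised_loss(img1, img2, flow))  # zero flow: pure photometric error
```

Because such losses compare raw intensities, they can struggle under strong lighting changes, which is consistent with the authors' observation that current loss functions may limit unsupervised performance.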


Author(s):  
Peng Hu
Rong Du
Yao Hu
Nan Li

Nowadays, item-item recommendation plays an important role in modern recommender systems. Traditionally, this is solved either by behavior-based collaborative filtering or by content-based methods. However, both kinds of methods often suffer from cold-start problems or poor performance caused by scarce behavioral supervision, and hybrid methods that can leverage the strengths of both are needed. In this paper, we propose a semi-parametric embedding framework for this problem. Specifically, the embedding of an item is composed of two parts, i.e., the parametric part derived from content information and the non-parametric part designed to encode behavior information; meanwhile, a deep learning algorithm is proposed to learn the two parts simultaneously. Extensive experiments on real-world datasets demonstrate the effectiveness and robustness of the proposed method.
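A minimal sketch of the semi-parametric decomposition, assuming the two parts combine additively: the parametric part maps content features through a network, while the non-parametric part is a free per-item embedding that absorbs behavioral signal; both are trained jointly. The dimensions, the additive combination, and the dot-product scoring are illustrative assumptions.

```python
import torch
import torch.nn as nn

n_items, content_dim, emb_dim = 1000, 50, 32

content_net = nn.Linear(content_dim, emb_dim)   # parametric: from content
behavior_emb = nn.Embedding(n_items, emb_dim)   # non-parametric: per item

def item_embedding(item_ids, content_feats):
    return content_net(content_feats) + behavior_emb(item_ids)

ids = torch.tensor([3, 17])
feats = torch.randn(2, content_dim)
e = item_embedding(ids, feats)
score = (e[0] * e[1]).sum()   # item-item relevance via dot product
print(score)
```

A nice property of this split is that a cold-start item with no behavioral data still gets a usable embedding from the content part alone.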


2021
Vol 21 (3)
pp. 1-17
Author(s):  
Wu Chen
Yong Yu
Keke Gai
Jiamou Liu
Kim-Kwang Raymond Choo

In existing ensemble learning algorithms (e.g., random forest), each base learner's model needs the entire dataset for sampling and training. However, this may not be practical in many real-world applications, and it incurs additional computational costs. To achieve better efficiency, we propose a decentralized framework: Multi-Agent Ensemble. The framework leverages edge computing to facilitate ensemble learning techniques by balancing access restrictions (small sub-datasets) against accuracy enhancement. Specifically, network edge nodes (learners) are utilized to model classifications and predictions in our framework. Data are then distributed to multiple base learners, which exchange data via an interaction mechanism to achieve improved prediction. The proposed approach relies on a decentralized training model rather than conventional centralized learning. Findings from the experimental evaluations using 20 real-world datasets suggest that Multi-Agent Ensemble outperforms other ensemble approaches in terms of accuracy even though the base learners require fewer samples (i.e., a significant reduction in computation costs).
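The division of labor can be sketched with off-the-shelf components: each edge learner trains on only its own small shard of the data, and the agents' predictions are combined by majority vote. The decision-tree learners, equal-sized shards, and voting rule below are assumptions for illustration, not the paper's interaction mechanism.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, random_state=0)
X_train, y_train, X_test, y_test = X[:500], y[:500], X[500:], y[500:]

# Each agent sees only its own small shard of the training data.
n_agents = 5
shards = zip(np.array_split(X_train, n_agents), np.array_split(y_train, n_agents))
agents = [DecisionTreeClassifier(random_state=i).fit(xs, ys)
          for i, (xs, ys) in enumerate(shards)]

# Aggregate the agents' predictions by majority vote.
votes = np.stack([a.predict(X_test) for a in agents])
ensemble_pred = (votes.mean(axis=0) > 0.5).astype(int)
print("ensemble accuracy:", (ensemble_pred == y_test).mean())
```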


Animals
2021
Vol 11 (6)
pp. 1549
Author(s):  
Robert D. Chambers
Nathanael C. Yoder
Aletha B. Carson
Christian Junge
David E. Allen
...  

Collar-mounted canine activity monitors can use accelerometer data to estimate dog activity levels, step counts, and distance traveled. With recent advances in machine learning and embedded computing, much more nuanced and accurate behavior classification has become possible, giving these affordable consumer devices the potential to improve the efficiency and effectiveness of pet healthcare. Here, we describe a novel deep learning algorithm that classifies dog behavior at sub-second resolution using commercial pet activity monitors. We built machine learning training databases from more than 5000 videos of more than 2500 dogs and ran the algorithms in production on more than 11 million days of device data. We then surveyed project participants representing 10,550 dogs, who provided 163,110 event responses to validate real-world detection of eating and drinking behavior. The resultant algorithm displayed high sensitivity and specificity for detecting drinking behavior (0.949 and 0.999, respectively) and eating behavior (0.988 and 0.983). We also demonstrated detection of licking (0.772, 0.990), petting (0.305, 0.991), rubbing (0.729, 0.996), scratching (0.870, 0.997), and sniffing (0.610, 0.968). We show that the devices' position on the collar had no measurable impact on performance. In production, users reported a true positive rate of 95.3% for eating (among 1514 users) and of 94.9% for drinking (among 1491 users). The study demonstrates the accurate detection of important health-related canine behaviors using a collar-mounted accelerometer. We trained and validated our algorithms on a large and realistic training dataset, and we assessed and confirmed accuracy in production via user validation.
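As a quick reference for how the reported sensitivity/specificity pairs are defined, the snippet below computes both from confusion-matrix counts; the counts here are made up for illustration and are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)  # recall on true events (e.g., drinking bouts)
    specificity = tn / (tn + fp)  # how rarely non-events are falsely flagged
    return sensitivity, specificity

# Hypothetical counts chosen to reproduce the drinking figures (~0.949, ~0.999).
print(sensitivity_specificity(tp=949, fn=51, tn=9990, fp=10))
```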

