scholarly journals Tracking of Deformable Objects Using Dynamically and Robustly Updating Pictorial Structures

2020 ◽  
Vol 6 (7) ◽  
pp. 61
Author(s):  
Connor Charles Ratcliffe ◽  
Ognjen Arandjelović

The problem posed by complex, articulated or deformable objects has been at the focus of much tracking research for a considerable length of time. However, it remains a major challenge, fraught with numerous difficulties. The increased ubiquity of technology in all realms of our society has made the need for effective solutions all the more urgent. In this article, we describe a novel method which systematically addresses the aforementioned difficulties and in practice outperforms the state of the art. Global spatial flexibility and robustness to deformations are achieved by adopting a pictorial structure based geometric model, and localized appearance changes by a subspace based model of part appearance underlain by a gradient based representation. In addition to one-off learning of both the geometric constraints and part appearances, we introduce a continuing learning framework which implements information discounting i.e., the discarding of historical appearances in favour of the more recent ones. Moreover, as a means of ensuring robustness to transient occlusions (including self-occlusions), we propose a solution for detecting unlikely appearance changes which allows for unreliable data to be rejected. A comprehensive evaluation of the proposed method, the analysis and discussing of findings, and a comparison with several state-of-the-art methods demonstrates the major superiority of our algorithm.

Author(s):  
Ruijing Yang ◽  
Ziyu Guan ◽  
Zitong Yu ◽  
Xiaoyi Feng ◽  
Jinye Peng ◽  
...  

Automatic pain recognition is paramount for medical diagnosis and treatment. The existing works fall into three categories: assessing facial appearance changes, exploiting physiological cues, or fusing them in a multi-modal manner. However, (1) appearance changes are easily affected by subjective factors which impedes objective pain recognition. Besides, the appearance-based approaches ignore long-range spatial-temporal dependencies that are important for modeling expressions over time; (2) the physiological cues are obtained by attaching sensors on human body, which is inconvenient and uncomfortable. In this paper, we present a novel multi-task learning framework which encodes both appearance changes and physiological cues in a non-contact manner for pain recognition. The framework is able to capture both local and long-range dependencies via the proposed attention mechanism for the learned appearance representations, which are further enriched by temporally attended physiological cues (remote photoplethysmography, rPPG) that are recovered from videos in the auxiliary task. This framework is dubbed rPPG-enriched Spatio-Temporal Attention Network (rSTAN) and allows us to establish the state-of-the-art performance of non-contact pain recognition on publicly available pain databases. It demonstrates that rPPG predictions can be used as an auxiliary task to facilitate non-contact automatic pain recognition.


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Rao ◽  
Y Li ◽  
R Ramakrishnan ◽  
A Hassaine ◽  
D Canoy ◽  
...  

Abstract Background/Introduction Predicting incident heart failure has been challenging. Deep learning models when applied to rich electronic health records (EHR) offer some theoretical advantages. However, empirical evidence for their superior performance is limited and they remain commonly uninterpretable, hampering their wider use in medical practice. Purpose We developed a deep learning framework for more accurate and yet interpretable prediction of incident heart failure. Methods We used longitudinally linked EHR from practices across England, involving 100,071 patients, 13% of whom had been diagnosed with incident heart failure during follow-up. We investigated the predictive performance of a novel transformer deep learning model, “Transformer for Heart Failure” (BEHRT-HF), and validated it using both an external held-out dataset and an internal five-fold cross-validation mechanism using area under receiver operating characteristic (AUROC) and area under the precision recall curve (AUPRC). Predictor groups included all outpatient and inpatient diagnoses within their temporal context, medications, age, and calendar year for each encounter. By treating diagnoses as anchors, we alternatively removed different modalities (ablation study) to understand the importance of individual modalities to the performance of incident heart failure prediction. Using perturbation-based techniques, we investigated the importance of associations between selected predictors and heart failure to improve model interpretability. Results BEHRT-HF achieved high accuracy with AUROC 0.932 and AUPRC 0.695 for external validation, and AUROC 0.933 (95% CI: 0.928, 0.938) and AUPRC 0.700 (95% CI: 0.682, 0.718) for internal validation. Compared to the state-of-the-art recurrent deep learning model, RETAIN-EX, BEHRT-HF outperformed it by 0.079 and 0.030 in terms of AUPRC and AUROC. Ablation study showed that medications were strong predictors, and calendar year was more important than age. Utilising perturbation, we identified and ranked the intensity of associations between diagnoses and heart failure. For instance, the method showed that established risk factors including myocardial infarction, atrial fibrillation and flutter, and hypertension all strongly associated with the heart failure prediction. Additionally, when population was stratified into different age groups, incident occurrence of a given disease had generally a higher contribution to heart failure prediction in younger ages than when diagnosed later in life. Conclusions Our state-of-the-art deep learning framework outperforms the predictive performance of existing models whilst enabling a data-driven way of exploring the relative contribution of a range of risk factors in the context of other temporal information. Funding Acknowledgement Type of funding source: Private grant(s) and/or Sponsorship. Main funding source(s): National Institute for Health Research, Oxford Martin School, Oxford Biomedical Research Centre


Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1136
Author(s):  
David Augusto Ribeiro ◽  
Juan Casavílca Silva ◽  
Renata Lopes Rosa ◽  
Muhammad Saadi ◽  
Shahid Mumtaz ◽  
...  

Light field (LF) imaging has multi-view properties that help to create many applications that include auto-refocusing, depth estimation and 3D reconstruction of images, which are required particularly for intelligent transportation systems (ITSs). However, cameras can present a limited angular resolution, becoming a bottleneck in vision applications. Thus, there is a challenge to incorporate angular data due to disparities in the LF images. In recent years, different machine learning algorithms have been applied to both image processing and ITS research areas for different purposes. In this work, a Lightweight Deformable Deep Learning Framework is implemented, in which the problem of disparity into LF images is treated. To this end, an angular alignment module and a soft activation function into the Convolutional Neural Network (CNN) are implemented. For performance assessment, the proposed solution is compared with recent state-of-the-art methods using different LF datasets, each one with specific characteristics. Experimental results demonstrated that the proposed solution achieved a better performance than the other methods. The image quality results obtained outperform state-of-the-art LF image reconstruction methods. Furthermore, our model presents a lower computational complexity, decreasing the execution time.


Author(s):  
Giridhar Reddy ◽  
Jonathan Cagan

Abstract A method for the design of truss structures which encourages lateral exploration, pushes away from violated spaces, models design intentions, and produces solutions with a wide variety of characteristics is introduced. An improved shape annealing algorithm for truss topology generation and optimization, based on the techniques of shape grammars and simulated annealing, implements the method. The algorithm features a shape grammar to model design intentions, an ability to incorporate geometric constraints to avoid obstacles, and a shape optimization method using only simulated annealing with more consistent convergence characteristics; no traditional gradient-based techniques are employed. The improved algorithm is illustrated on various structural examples generating a variety of solutions based on a simple grammar.


2021 ◽  
Vol 14 (5) ◽  
pp. 785-798
Author(s):  
Daokun Hu ◽  
Zhiwen Chen ◽  
Jianbing Wu ◽  
Jianhua Sun ◽  
Hao Chen

Persistent memory (PM) is increasingly being leveraged to build hash-based indexing structures featuring cheap persistence, high performance, and instant recovery, especially with the recent release of Intel Optane DC Persistent Memory Modules. However, most of them are evaluated on DRAM-based emulators with unreal assumptions, or focus on the evaluation of specific metrics with important properties sidestepped. Thus, it is essential to understand how well the proposed hash indexes perform on real PM and how they differentiate from each other if a wider range of performance metrics are considered. To this end, this paper provides a comprehensive evaluation of persistent hash tables. In particular, we focus on the evaluation of six state-of-the-art hash tables including Level hashing, CCEH, Dash, PCLHT, Clevel, and SOFT, with real PM hardware. Our evaluation was conducted using a unified benchmarking framework and representative workloads. Besides characterizing common performance properties, we also explore how hardware configurations (such as PM bandwidth, CPU instructions, and NUMA) affect the performance of PM-based hash tables. With our in-depth analysis, we identify design trade-offs and good paradigms in prior arts, and suggest desirable optimizations and directions for the future development of PM-based hash tables.


2020 ◽  
Vol 14 (4) ◽  
pp. 653-667
Author(s):  
Laxman Dhulipala ◽  
Changwan Hong ◽  
Julian Shun

Connected components is a fundamental kernel in graph applications. The fastest existing multicore algorithms for solving graph connectivity are based on some form of edge sampling and/or linking and compressing trees. However, many combinations of these design choices have been left unexplored. In this paper, we design the ConnectIt framework, which provides different sampling strategies as well as various tree linking and compression schemes. ConnectIt enables us to obtain several hundred new variants of connectivity algorithms, most of which extend to computing spanning forest. In addition to static graphs, we also extend ConnectIt to support mixes of insertions and connectivity queries in the concurrent setting. We present an experimental evaluation of ConnectIt on a 72-core machine, which we believe is the most comprehensive evaluation of parallel connectivity algorithms to date. Compared to a collection of state-of-the-art static multicore algorithms, we obtain an average speedup of 12.4x (2.36x average speedup over the fastest existing implementation for each graph). Using ConnectIt, we are able to compute connectivity on the largest publicly-available graph (with over 3.5 billion vertices and 128 billion edges) in under 10 seconds using a 72-core machine, providing a 3.1x speedup over the fastest existing connectivity result for this graph, in any computational setting. For our incremental algorithms, we show that our algorithms can ingest graph updates at up to several billion edges per second. To guide the user in selecting the best variants in ConnectIt for different situations, we provide a detailed analysis of the different strategies. Finally, we show how the techniques in ConnectIt can be used to speed up two important graph applications: approximate minimum spanning forest and SCAN clustering.


Author(s):  
Jie Yang ◽  
Zhiquan Qi ◽  
Yong Shi

This paper develops a multi-task learning framework that attempts to incorporate the image structure knowledge to assist image inpainting, which is not well explored in previous works. The primary idea is to train a shared generator to simultaneously complete the corrupted image and corresponding structures --- edge and gradient, thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. In the meantime, we also introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, thus to provide possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning, structure embedding besides with attention, our framework takes advantage of the structure knowledge and outperforms several state-of-the-art methods on benchmark datasets quantitatively and qualitatively.


Author(s):  
Cong Fei ◽  
Bin Wang ◽  
Yuzheng Zhuang ◽  
Zongzhang Zhang ◽  
Jianye Hao ◽  
...  

Generative adversarial imitation learning (GAIL) has shown promising results by taking advantage of generative adversarial nets, especially in the field of robot learning. However, the requirement of isolated single modal demonstrations limits the scalability of the approach to real world scenarios such as autonomous vehicles' demand for a proper understanding of human drivers' behavior. In this paper, we propose a novel multi-modal GAIL framework, named Triple-GAIL, that is able to learn skill selection and imitation jointly from both expert demonstrations and continuously generated experiences with data augmentation purpose by introducing an auxiliary selector. We provide theoretical guarantees on the convergence to optima for both of the generator and the selector respectively. Experiments on real driver trajectories and real-time strategy game datasets demonstrate that Triple-GAIL can better fit multi-modal behaviors close to the demonstrators and outperforms state-of-the-art methods.


Author(s):  
Lixin Fan ◽  
Kam Woh Ng ◽  
Ce Ju ◽  
Tianyu Zhang ◽  
Chee Seng Chan

This paper proposes a novel deep polarized network (DPN) for learning to hash, in which each channel in the network outputs is pushed far away from zero by employing a differentiable bit-wise hinge-like loss which is dubbed as polarization loss. Reformulated within a generic Hamming Distance Metric Learning framework [Norouzi et al., 2012], the proposed polarization loss bypasses the requirement to prepare pairwise labels for (dis-)similar items and, yet, the proposed loss strictly bounds from above the pairwise Hamming Distance based losses. The intrinsic connection between pairwise and pointwise label information, as disclosed in this paper, brings about the following methodological improvements: (a) we may directly employ the proposed differentiable polarization loss with no large deviations incurred from the target Hamming distance based loss; and (b) the subtask of assigning binary codes becomes extremely simple --- even random codes assigned to each class suffice to result in state-of-the-art performances, as demonstrated in CIFAR10, NUS-WIDE and ImageNet100 datasets.


Author(s):  
Hongzuo Xu ◽  
Yongjun Wang ◽  
Zhiyue Wu ◽  
Yijie Wang

Non-IID categorical data is ubiquitous and common in realworld applications. Learning various kinds of couplings has been proved to be a reliable measure when detecting outliers in such non-IID data. However, it is a critical yet challenging problem to model, represent, and utilise high-order complex value couplings. Existing outlier detection methods normally only focus on pairwise primary value couplings and fail to uncover real relations that hide in complex couplings, resulting in suboptimal and unstable performance. This paper introduces a novel unsupervised embedding-based complex value coupling learning framework EMAC and its instance SCAN to address these issues. SCAN first models primary value couplings. Then, coupling bias is defined to capture complex value couplings with different granularities and highlight the essence of outliers. An embedding method is performed on the value network constructed via biased value couplings, which further learns high-order complex value couplings and embeds these couplings into a value representation matrix. Bidirectional selective value coupling learning is proposed to show how to estimate value and object outlierness through value couplings. Substantial experiments show that SCAN (i) significantly outperforms five state-of-the-art outlier detection methods on thirteen real-world datasets; and (ii) has much better resilience to noise than its competitors.


Sign in / Sign up

Export Citation Format

Share Document