3D Convolutional Neural Networks for Remote Pulse Rate Measurement and Mapping from Facial Video

2019 ◽  
Vol 9 (20) ◽  
pp. 4364 ◽  
Author(s):  
Frédéric Bousefsaf ◽  
Alain Pruski ◽  
Choubeila Maaoui

Remote pulse rate measurement from facial video has gained particular attention over the last few years. Research exhibits significant advancements and demonstrates that common video cameras are reliable devices that can be employed to measure a large set of biomedical parameters without any contact with the subject. A new framework for measuring and mapping pulse rate from video is presented in this pilot study. The method, which relies on convolutional 3D networks, is fully automatic and does not require any special image preprocessing. In addition, the network ensures concurrent mapping by producing a prediction for each local group of pixels. A particular training procedure that employs only synthetic data is proposed. Preliminary results demonstrate that this convolutional 3D network can effectively extract pulse rate from video without the need for any processing of frames. The trained model was compared with other state-of-the-art methods on public data. Results exhibit significant agreement between estimated and ground-truth measurements: the root mean square error computed from pulse rate values assessed with the convolutional 3D network is equal to 8.64 bpm, compared with more than 10 bpm for the other state-of-the-art methods. Improving the robustness of the method to natural motion and increasing its performance are the two main avenues that will be considered in future work.
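The agreement metric reported above (root mean square error in bpm between estimated and ground-truth pulse rates) can be sketched in a few lines; the readings below are hypothetical, not data from the paper:

```python
import math

def rmse(estimates, ground_truth):
    """Root mean square error between pulse rate estimates and reference values (bpm)."""
    assert len(estimates) == len(ground_truth)
    return math.sqrt(sum((e - g) ** 2 for e, g in zip(estimates, ground_truth))
                     / len(estimates))

# Hypothetical per-window pulse rate readings, in bpm.
est = [72.0, 75.5, 80.0, 78.0]
ref = [70.0, 76.0, 79.0, 80.5]
print(round(rmse(est, ref), 2))
```

A lower RMSE means closer agreement with the ground truth; the paper reports 8.64 bpm for the 3D network against more than 10 bpm for competing methods.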

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
João Lobo ◽  
Rui Henriques ◽  
Sara C. Madeira

Abstract Background Three-way data started to gain popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions along time, urban dynamics, or complex geophysical phenomena. Triclustering, subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with state-of-the-art algorithms is paramount. These comparisons are usually performed using real data, without a known ground truth, thus limiting the assessments. In this context, we propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social data domains, with the additional advantage of further providing the ground truth (triclustering solution) as output. Results G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions Triclustering evaluation using G-Tric provides the possibility to combine both intrinsic and extrinsic metrics to compare solutions, producing more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric's potential to advance the triclustering state-of-the-art by easing the process of evaluating the quality of new triclustering approaches.
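As a rough illustration of what planting a tricluster means, the sketch below builds a small numeric three-way dataset with one constant tricluster over chosen observation, feature, and context index sets. All names and parameters here are illustrative, not G-Tric's actual API:

```python
import random

def plant_tricluster(shape, rows, cols, ctxs, value, noise=0.0, seed=0):
    """Generate a 3-way numeric dataset (observations x features x contexts)
    with Gaussian background and a constant tricluster planted at the
    given index sets, optionally perturbed by Gaussian noise."""
    rng = random.Random(seed)
    n, m, p = shape
    data = [[[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(m)]
            for _ in range(n)]
    for i in rows:
        for j in cols:
            for k in ctxs:
                data[i][j][k] = value + rng.gauss(0.0, noise) if noise else value
    return data

data = plant_tricluster((5, 4, 3), rows=[0, 1], cols=[1, 2], ctxs=[0, 2], value=7.0)
print(data[0][1][0], data[1][2][2])  # cells inside the planted tricluster
```

Returning both the data and the planted index sets is what gives an extrinsic ground truth for benchmarking triclustering algorithms.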


Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 227
Author(s):  
Eckart Michaelsen ◽  
Stéphane Vujasinovic

Representative input data are a necessary requirement for the assessment of machine-vision systems. For symmetry-seeing machines in particular, such imagery should provide symmetries as well as asymmetric clutter. Moreover, there must be reliable ground truth with the data. It should be possible to estimate the recognition performance and the computational efforts by providing different grades of difficulty and complexity. Recent competitions used real imagery labeled by human subjects with appropriate ground truth. The paper at hand proposes to use synthetic data instead. Such data contain symmetry, clutter, and nothing else. This is preferable because interference with other perceptive capabilities, such as object recognition, or prior knowledge, can be avoided. The data are given sparsely, i.e., as sets of primitive objects. However, images can be generated from them, so that the same data can also be fed into machines requiring dense input, such as multilayered perceptrons. Sparse representations are preferred, because the author’s own system requires such data, and in this way, any influence of the primitive extraction method is excluded. The presented format allows hierarchies of symmetries. This is important because hierarchy constitutes a natural and dominant part in symmetry-seeing. The paper reports some experiments using the author’s Gestalt algebra system as symmetry-seeing machine. Additionally included is a comparative test run with the state-of-the-art symmetry-seeing deep learning convolutional perceptron of the PSU. The computational efforts and recognition performance are assessed.
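A minimal sketch of such sparse synthetic data, assuming primitives are reduced to (x, y, orientation) tuples and symmetry is a single vertical mirror axis. Both are simplifications; the paper's format also supports hierarchies of symmetries:

```python
import random

def make_symmetry_scene(n_pairs, n_clutter, axis_x=0.5, seed=42):
    """Sparse synthetic scene: primitives as (x, y, orientation) tuples.
    Each symmetric pair mirrors across the vertical axis x = axis_x;
    clutter primitives are placed uniformly at random."""
    rng = random.Random(seed)
    primitives = []
    for _ in range(n_pairs):
        x, y, a = rng.random() * axis_x, rng.random(), rng.random()
        primitives.append((x, y, a))
        primitives.append((2 * axis_x - x, y, -a))  # mirrored partner
    for _ in range(n_clutter):
        primitives.append((rng.random(), rng.random(), rng.random()))
    return primitives

scene = make_symmetry_scene(n_pairs=3, n_clutter=5)
print(len(scene))  # 3 pairs + 5 clutter primitives = 11
```

Because the generator knows which primitives were mirrored, the ground truth comes for free, which is exactly the advantage the paper claims over human-labeled imagery.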


Author(s):  
Gaetano Rossiello ◽  
Alfio Gliozzo ◽  
Michael Glass

We propose a novel approach to learn representations of relations expressed by their textual mentions. In our assumption, if two pairs of entities belong to the same relation, then those two pairs are analogous. We collect a large set of analogous pairs by matching triples in knowledge bases with web-scale corpora through distant supervision. This dataset is adopted to train a hierarchical siamese network in order to learn entity-entity embeddings which encode relational information through the different linguistic paraphrases expressing the same relation. The model can be used to generate pre-trained embeddings which provide a valuable signal when integrated into an existing neural-based model, outperforming the state-of-the-art methods on a relation extraction task.


2020 ◽  
Vol 38 (2) ◽  
pp. 276-292
Author(s):  
Khawla Asmi ◽  
Dounia Lotfi ◽  
Mohamed El Marraki

Purpose The state-of-the-art methods designed for overlapping community detection are limited by their high execution time, as in CPM, or by the need to provide parameters such as the number of communities in Bigclam and Nise_sph, which is nontrivial information. Hence, there is a need to improve accuracy, which represents the primordial goal: the actual state-of-the-art methods do not succeed in achieving high correspondence with the ground truth for many instances of networks. The paper aims to discuss this issue. Design/methodology/approach The authors offer a new method that explores the union of all maximum spanning trees (UMST) and models the strength of links between nodes. Also, each node in the UMST is linked with its most similar neighbor. From this model, the authors extract a local community for each node, and then they combine the produced communities according to their number of shared nodes. Findings The experiments on eight real-world data sets and four sets of artificial networks show that the proposed method achieves obvious improvements over state-of-the-art methods (BigClam, OSLOM, Demon, SE, DMST and ST) in terms of the F-score and ONMI for the networks with ground truth (Amazon, Youtube, LiveJournal and Orkut). Also, for the other networks, it provides communities with a good overlapping modularity. Originality/value In this paper, the authors investigate the UMST for overlapping community detection.
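The building block of the authors' approach, a maximum spanning tree, can be sketched with Kruskal's algorithm run on descending edge weights. The UMST itself unites all such trees; this sketch computes just one:

```python
def max_spanning_tree(edges, n):
    """Kruskal's algorithm on descending weights: one maximum spanning tree.
    edges: list of (weight, u, v) tuples; nodes are 0..n-1."""
    parent = list(range(n))

    def find(x):
        # Union-find with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges, reverse=True):  # heaviest edges first
        ru, rv = find(u), find(v)
        if ru != rv:          # keep the edge only if it joins two components
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

edges = [(3, 0, 1), (1, 1, 2), (2, 0, 2), (5, 2, 3)]
tree = max_spanning_tree(edges, 4)
print(sorted(tree))
```

Enumerating every maximum spanning tree (as the UMST requires when weights tie) needs extra bookkeeping over tied edge weights, which is omitted here.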


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Ananyananda Dasari ◽  
Sakthi Kumar Arul Prakash ◽  
László A. Jeni ◽  
Conrad S. Tucker

Abstract This work investigates the estimation biases of remote photoplethysmography (rPPG) methods for pulse rate measurement across diverse demographics. Advances in photoplethysmography (PPG) and rPPG methods have enabled the development of contact and noncontact approaches for continuous monitoring and collection of patient health data. The contagious nature of viruses such as COVID-19 warrants noncontact methods for physiological signal estimation. However, these approaches are subject to estimation biases due to variations in environmental conditions and subject demographics. The performance of contact-based wearable sensors has been evaluated, using off-the-shelf devices across demographics. However, the measurement uncertainty of rPPG methods that estimate pulse rate has not been sufficiently tested across diverse demographic populations or environments. Quantifying the efficacy of rPPG methods in real-world conditions is critical in determining their potential viability as health monitoring solutions. Currently, publicly available face datasets accompanied by physiological measurements are typically captured in controlled laboratory settings, lacking diversity in subject skin tones, age, and cultural artifacts (e.g., a bindi worn by Indian women). In this study, we collect pulse rate and facial video data from human subjects in India and Sierra Leone, in order to quantify the uncertainty in noncontact pulse rate estimation methods. The video data are used to estimate pulse rate using state-of-the-art rPPG camera-based methods, and compared against ground truth measurements captured using an FDA-approved contact-based pulse rate measurement device. Our study reveals that rPPG methods exhibit similar biases when compared with a contact-based device across demographic groups and environmental conditions. The mean difference between pulse rates measured by rPPG methods and the ground truth is found to be ~2% (1 beat per minute (b.p.m.)), signifying agreement of rPPG methods with the ground truth. We also find that rPPG methods show pulse rate variability of ~15% (11 b.p.m.), as compared to the ground truth. We investigate factors impacting rPPG methods and discuss solutions aimed at mitigating variance.
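The agreement figures above boil down to the mean and spread of per-subject differences between rPPG estimates and the contact reference; a minimal sketch with hypothetical readings (not data from the study):

```python
def agreement_stats(rppg, reference):
    """Mean difference (bias) and standard deviation (variability) of
    rPPG pulse rate estimates against a contact reference, in bpm."""
    diffs = [r - g for r, g in zip(rppg, reference)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var = sum((d - mean_diff) ** 2 for d in diffs) / n
    return mean_diff, var ** 0.5

# Hypothetical paired readings (bpm); not data from the study.
rppg = [68.0, 75.0, 82.0, 71.0, 90.0]
ref  = [70.0, 74.0, 80.0, 72.0, 88.0]
mean_diff, sd = agreement_stats(rppg, ref)
print(round(mean_diff, 2), round(sd, 2))
```

A small mean difference with a large standard deviation is exactly the pattern the study reports: low overall bias (~1 b.p.m.) but substantial variability (~11 b.p.m.).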


Author(s):  
Kuo-Liang Chung ◽  
Yu-Ling Tseng ◽  
Tzu-Hsien Chan ◽  
Ching-Sheng Wang

In this paper, we first propose a fast and effective region-based depth map upsampling method, and then propose a joint upsampling and location map-free reversible data hiding method, simply called the JUR method. In the proposed upsampling method, all the missing depth pixels are partitioned into three disjoint regions: the homogeneous, semi-homogeneous, and non-homogeneous regions. Then, we propose the depth copying, mean value, and bicubic interpolation approaches to quickly reconstruct the three kinds of missing depth pixels, respectively. In the proposed JUR method, data are embedded without any location map overhead by using the neighboring ground truth depth pixels of each missing depth pixel, achieving substantial quality and embedding capacity merits. Comprehensive experiments have been carried out to justify not only the execution-time and quality merits of the depth maps upsampled by our method relative to the state-of-the-art methods, but also the embedding capacity and quality merits of our JUR method when compared with the state-of-the-art methods.
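A toy version of the region partition, assuming a missing pixel is classified from the spread of its known neighbors' depths. The threshold and the median stand-in for bicubic interpolation are illustrative, not the paper's exact rules:

```python
def classify_and_fill(neighbors, tol=1.0):
    """Classify a missing depth pixel from its known neighbors and fill it.
    homogeneous -> depth copying; semi-homogeneous -> mean value;
    non-homogeneous -> bicubic in the paper, approximated here by the median."""
    spread = max(neighbors) - min(neighbors)
    if spread == 0:
        return "homogeneous", neighbors[0]                  # depth copying
    if spread <= tol:
        return "semi-homogeneous", sum(neighbors) / len(neighbors)  # mean value
    srt = sorted(neighbors)
    return "non-homogeneous", srt[len(srt) // 2]            # stand-in for bicubic

print(classify_and_fill([4.0, 4.0, 4.0]))
print(classify_and_fill([4.0, 4.5, 4.2]))
print(classify_and_fill([1.0, 9.0, 4.0]))
```

The appeal of this three-way split is speed: the cheap fills (copy, mean) handle the common flat regions, reserving expensive interpolation for depth discontinuities.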


Author(s):  
Xueliang Zhao ◽  
Chongyang Tao ◽  
Wei Wu ◽  
Can Xu ◽  
Dongyan Zhao ◽  
...  

We present a document-grounded matching network (DGMN) for response selection that can power a knowledge-aware retrieval-based chatbot system. The challenges of building such a model lie in how to ground conversation contexts with background documents and how to recognize important information in the documents for matching. To overcome these challenges, DGMN fuses information in a document and a context into representations of each other, and dynamically determines whether grounding is necessary and the importance of different parts of the document and the context through hierarchical interaction with a response at the matching step. Empirical studies on two public data sets indicate that DGMN can significantly improve upon state-of-the-art methods and at the same time enjoys good interpretability.


Author(s):  
Leonardo Lamanna ◽  
Alessandro Saetti ◽  
Luciano Serafini ◽  
Alfonso Gerevini ◽  
Paolo Traverso

The automated learning of action models is widely recognised as a key and compelling challenge to address the difficulties of the manual specification of planning domains. Most state-of-the-art methods perform this learning offline from an input set of plan traces generated by the execution of (successful) plans. However, how to generate informative plan traces for learning action models is still an open issue. Moreover, plan traces might not be available for a new environment. In this paper, we propose an algorithm for learning action models online, incrementally during the execution of plans. Such plans are generated to achieve goals that the algorithm decides online in order to obtain informative plan traces and reach states from which useful information can be learned. We show some fundamental theoretical properties of the algorithm, and we experimentally evaluate the online learning of the action models over a large set of IPC domains.
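A STRIPS-style caricature of incremental action model learning from observed transitions: preconditions shrink to the literals that were always true before the action succeeded, while add and delete effects accumulate. This simplification ignores the paper's online goal selection and its theoretical guarantees:

```python
def update_model(model, action, state_before, state_after):
    """Incrementally refine an action model from one observed transition.
    Preconditions: intersection of pre-states where the action succeeded.
    Effects: literals added / deleted by the transition (STRIPS-style sets)."""
    pre, add, dele = model.setdefault(action, (set(state_before), set(), set()))
    pre &= state_before                 # keep only always-true literals
    add |= state_after - state_before   # observed add effects
    dele |= state_before - state_after  # observed delete effects
    model[action] = (pre, add, dele)
    return model

model = {}
update_model(model, "pickup", {"clear", "ontable", "handempty"}, {"holding"})
update_model(model, "pickup", {"clear", "handempty"}, {"holding"})
pre, add, dele = model["pickup"]
print(sorted(pre), sorted(add), sorted(dele))
```

Each new trace can only tighten the learned preconditions, which is why choosing informative states to visit (the paper's contribution) matters so much.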


2019 ◽  
Vol 33 (13) ◽  
pp. 1950133 ◽  
Author(s):  
Mei Chen ◽  
Mei Zhang ◽  
Ming Li ◽  
Mingwei Leng ◽  
Zhichong Yang ◽  
...  

Detecting the natural communities in a real-world network can uncover its underlying structure and potential function. In this paper, a novel community detection algorithm, SUM, is introduced. The fundamental idea of SUM is that a node with relatively low degree stays faithful to its community, because it only has links with nodes in one community, while a node with relatively high degree has links with nodes both within and outside its community, and this may cause confusion when detecting communities. Based on this idea, SUM detects communities by suspecting the links of the maximum degree nodes to their neighbors within a community, while relying mainly on the nodes with relatively low degree. SUM elegantly defines a similarity which takes into account both the commonality and the rejective degree of two adjacent nodes. After putting similar nodes into one community, SUM generates initial communities by reassigning the maximum degree nodes. Next, SUM assigns nodes without labels to the initial communities, and adjusts each border node to its most linked community. To evaluate the effectiveness of SUM, it is compared with seven baselines, including four classical and three state-of-the-art methods, on a wide range of complex networks. On the small networks with ground-truth community structures, results are visually demonstrated, as well as quantitatively measured with ARI, NMI and Modularity. On the relatively large networks without ground-truth community structures, the performances of these algorithms are evaluated according to Modularity. Experimental results indicate that SUM can effectively determine community structures on small or relatively large networks with high quality, and also outperforms the compared state-of-the-art methods.
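A Jaccard-style stand-in for SUM's node similarity, capturing the shared neighborhood (commonality) of two adjacent nodes against their non-shared links. The paper's exact formula, which also incorporates the rejective degree, differs:

```python
def similarity(adj, u, v):
    """Neighborhood similarity of two adjacent nodes: overlap of their
    closed neighborhoods normalised by the union (a Jaccard index)."""
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return len(nu & nv) / len(nu | nv)

# Small undirected graph: a triangle (0,1,2) attached to a tail (2-3-4).
adj = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4},
    4: {3},
}
print(similarity(adj, 0, 1))  # triangle nodes share their whole neighborhood
print(round(similarity(adj, 3, 4), 2))
```

Under such a measure, low-degree nodes inside a dense community score high with their neighbors, matching SUM's premise that low-degree nodes are the reliable witnesses of community membership.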


Author(s):  
Jonas Hein ◽  
Matthias Seibold ◽  
Federica Bogo ◽  
Mazda Farshad ◽  
Marc Pollefeys ◽  
...  

Abstract Purpose:  Tracking of tools and surgical activity is becoming more and more important in the context of computer assisted surgery. In this work, we present a data generation framework, dataset and baseline methods to facilitate further research in the direction of markerless hand and instrument pose estimation in realistic surgical scenarios. Methods:  We developed a rendering pipeline to create inexpensive and realistic synthetic data for model pretraining. Subsequently, we propose a pipeline to capture and label real data with hand and object pose ground truth in an experimental setup to gather high-quality real data. We furthermore present three state-of-the-art RGB-based pose estimation baselines. Results:  We evaluate three baseline models on the proposed datasets. The best performing baseline achieves an average tool 3D vertex error of 16.7 mm on synthetic data as well as 13.8 mm on real data, which is comparable to the state-of-the-art in RGB-based hand/object pose estimation. Conclusion:  To the best of our knowledge, we propose the first synthetic and real data generation pipelines to generate hand and object pose labels for open surgery. We present three baseline models for RGB-based object and object/hand pose estimation. Our realistic synthetic data generation pipeline may contribute to overcoming the data bottleneck in the surgical domain and can easily be transferred to other medical applications.
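The headline metric, average tool 3D vertex error, is the mean Euclidean distance between corresponding predicted and ground-truth mesh vertices; a minimal sketch with made-up coordinates:

```python
def mean_vertex_error(pred, gt):
    """Average Euclidean distance between predicted and ground-truth
    3D vertices, in the same unit as the inputs (here, mm)."""
    assert len(pred) == len(gt)
    total = 0.0
    for (px, py, pz), (gx, gy, gz) in zip(pred, gt):
        total += ((px - gx) ** 2 + (py - gy) ** 2 + (pz - gz) ** 2) ** 0.5
    return total / len(pred)

# Hypothetical vertex positions in mm (two vertices only, for illustration).
pred = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
gt   = [(3.0, 4.0, 0.0), (10.0, 0.0, 12.0)]
print(mean_vertex_error(pred, gt))
```

Real tool meshes have thousands of vertices, but the computation is the same per-vertex average, which is why the paper can report a single millimetre figure per dataset.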

