Global Position Prediction for Interactive Motion Capture

Author(s):  
Paul Schreiner ◽  
Maksym Perepichka ◽  
Hayden Lewis ◽  
Sune Darkner ◽  
Paul G. Kry ◽  
...  

We present a method for reconstructing the global position of motion capture data where position sensing is poor or unavailable. Capture systems such as IMU suits can provide excellent pose and orientation data for a capture subject, but otherwise require post-processing to estimate global position. We propose a solution that trains a neural network to predict, in real time, the height and body displacement given a short window of pose and orientation data. Our training dataset contains pre-recorded data with global positions from many different capture subjects performing a wide variety of activities, in order to train a network broadly enough to generalize to both similar and unseen activities. We compare training on two network architectures, a u-net and a traditional convolutional neural network (CNN), observing better error properties for the u-net in our results. We also evaluate our method on different classes of motion. We observe high-quality results for motion examples that are well represented in specialized datasets, while general performance is better with a more broadly sampled dataset when input motions are far from the training examples.
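As a rough illustration of the setup described above, the sketch below (not the authors' code; the window length, feature count, and layer sizes are invented for the example) maps a short window of per-frame pose and orientation features to a height value and a horizontal displacement, in the style of the plain-CNN baseline:

```python
# A minimal sketch of the stated idea: a network mapping a short window of
# pose/orientation features to root height and horizontal displacement.
import tensorflow as tf
from tensorflow.keras import layers

WINDOW = 60      # assumed: frames per input window
FEATURES = 72    # assumed: per-frame pose/orientation features from the suit

inputs = tf.keras.Input(shape=(WINDOW, FEATURES))
x = layers.Conv1D(64, 5, padding="same", activation="relu")(inputs)
x = layers.Conv1D(64, 5, padding="same", activation="relu")(x)
x = layers.GlobalAveragePooling1D()(x)
# Outputs: root height (1) plus horizontal displacement over the window (2).
outputs = layers.Dense(3)(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
```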

2020 ◽  
Vol 2020 ◽  
pp. 1-13 ◽  
Author(s):  
Jordan Ott ◽  
Mike Pritchard ◽  
Natalie Best ◽  
Erik Linstead ◽  
Milan Curcic ◽  
...  

Implementing artificial neural networks is commonly achieved via high-level programming languages such as Python and easy-to-use deep learning libraries such as Keras. These software libraries come preloaded with a variety of network architectures, provide autodifferentiation, and support GPUs for fast and efficient computation. As a result, a deep learning practitioner will favor training a neural network model in Python, where these tools are readily available. However, many large-scale scientific computation projects are written in Fortran, making it difficult to integrate them with modern deep learning methods. To alleviate this problem, we introduce a software library, the Fortran-Keras Bridge (FKB). This two-way bridge connects environments where deep learning resources are plentiful with those where they are scarce. The paper describes several unique features offered by FKB, such as customizable layers, loss functions, and network ensembles. The paper concludes with a case study that applies FKB to address open questions about the robustness of an experimental approach to global climate simulation, in which subgrid physics are outsourced to deep neural network emulators. In this context, FKB enables a hyperparameter search over more than one hundred candidate models of subgrid cloud and radiation physics, initially implemented in Keras, to be transferred to and used in Fortran. Such a process allows the models' emergent behavior to be assessed, i.e., when fit imperfections are coupled to explicit planetary-scale fluid dynamics. The results reveal a previously unrecognized strong relationship between offline validation error and online performance, in which the choice of optimizer proves unexpectedly critical. This in turn reveals many new neural network architectures that produce considerable improvements in climate model stability, including some with reduced error, for an especially challenging training dataset.
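To make the workflow concrete, here is a hedged sketch of the Keras side only: a plain stack of Dense layers, the kind of architecture whose weights a Fortran-side reader can reconstruct, saved to HDF5 for a bridge such as FKB to consume. The input and output sizes and the file name are assumptions, and the FKB conversion step itself is not reproduced here:

```python
# Keras-side sketch: build and save a dense emulator as HDF5.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(94,)),  # assumed input size
    layers.Dense(128, activation="relu"),
    layers.Dense(65),  # assumed output size for emulated subgrid tendencies
])
model.compile(optimizer="adam", loss="mse")
# model.fit(...) on the subgrid physics training data, then:
model.save("subgrid_emulator.h5")  # HDF5 file handed to the Fortran bridge
```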


2021 ◽  
Vol 2 (1) ◽  
pp. 96
Author(s):  
Umberto Michelucci ◽  
Francesca Venturini

The determination of multiple parameters via luminescence sensing is of great interest for many applications in different fields, such as biosensing, biological imaging, medicine, and diagnostics. The typical approach consists in measuring multiple quantities and applying complex, often merely approximate, mathematical models to characterize the sensor response. The use of machine learning to extract information from sensor measurements has been tried in several forms before. One problem with the approaches to date, however, is the difficulty of obtaining a training dataset that is representative of the measurements made by the sensor. Additionally, extracting multiple parameters from a single measurement has so far been an intractable problem in luminescence sensing. In this work, a new approach is described for building an autonomous intelligent sensor, which is able to produce its own training dataset self-sufficiently, use it to train a neural network, and then use the trained model for inference on measurements made with the same hardware. For the first time, the use of machine learning additionally allows two parameters to be extracted from a single measurement using multitask learning neural network architectures. This is demonstrated here with a dual oxygen-concentration and temperature sensor.
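A minimal sketch of the multitask idea (the measurement length and layer widths are illustrative, not the paper's): one shared trunk feeding two regression heads, so a single measurement yields both oxygen concentration and temperature:

```python
# Multitask regression sketch: shared trunk, two task-specific heads.
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(256,), name="measurement")  # assumed vector length
shared = layers.Dense(64, activation="relu")(inputs)
shared = layers.Dense(64, activation="relu")(shared)
# Two heads trained jointly from the shared representation.
oxygen = layers.Dense(1, name="oxygen")(shared)
temperature = layers.Dense(1, name="temperature")(shared)

model = tf.keras.Model(inputs, [oxygen, temperature])
model.compile(optimizer="adam",
              loss={"oxygen": "mse", "temperature": "mse"})
```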


2019 ◽  
Vol 2019 (1) ◽  
pp. 153-158
Author(s):  
Lindsay MacDonald

We investigated how well a multilayer neural network could implement the mapping between two trichromatic color spaces, specifically from camera R,G,B to tristimulus X,Y,Z. For training the network, a set of 800,000 synthetic reflectance spectra was generated. For testing the network, a set of 8,714 real reflectance spectra was collated from instrumental measurements on textiles, paints, and natural materials. Various network architectures were tested, with both linear and sigmoidal activations. Results show that over 85% of all test samples had color errors of less than 1.0 ΔE2000 units, a much better accuracy than could be achieved by regression.
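For concreteness, here is a small sketch of one such tested configuration, an MLP with sigmoidal hidden units mapping camera R,G,B to tristimulus X,Y,Z (the hidden width is an assumption; the study compared several architectures):

```python
# MLP sketch for the R,G,B -> X,Y,Z color space mapping.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Dense(32, activation="sigmoid", input_shape=(3,)),  # R,G,B in
    layers.Dense(32, activation="sigmoid"),
    layers.Dense(3),  # X,Y,Z out (linear output for tristimulus values)
])
model.compile(optimizer="adam", loss="mse")
# model.fit(rgb_train, xyz_train, ...)  # trained on synthetic reflectance spectra
```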


2020 ◽  
Vol 2020 (10) ◽  
pp. 181-1-181-7
Author(s):  
Takahiro Kudo ◽  
Takanori Fujisawa ◽  
Takuro Yamaguchi ◽  
Masaaki Ikehara

Image deconvolution has recently become an important issue. There are two kinds of approaches: non-blind and blind. Non-blind deconvolution is the classic image deblurring problem, which assumes that the point spread function (PSF) is known and spatially invariant. Recently, convolutional neural networks (CNNs) have been used for non-blind deconvolution. Although CNNs can handle complex variation in unknown images, some conventional CNN-based methods can only handle small PSFs and do not consider the large PSFs encountered in the real world. In this paper we propose a non-blind deconvolution framework based on a CNN that can remove large-scale ringing from a deblurred image. Our method has three key points. The first is that our network architecture is able to preserve both large and small features in the image. The second is that the training dataset is created to preserve the details. The third is that we extend the images to minimize the effects of large ringing at the image borders. In our experiments, we used three kinds of large PSFs and observed high-precision results from our method both quantitatively and qualitatively.
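The third point can be illustrated in isolation. The hedged sketch below extends the image by reflection before frequency-domain deconvolution, so that boundary ringing lands in the padding rather than the image; a simple Wiener filter stands in for the paper's CNN, and tying the pad width to the PSF extent is an assumption:

```python
# Border-extension sketch: pad, deconvolve in the Fourier domain, crop.
import numpy as np

def deconvolve_padded(blurred, psf, noise_level=1e-2):
    pad = max(psf.shape)  # extend by roughly one PSF extent on each side
    ext = np.pad(blurred, pad, mode="reflect")
    # Zero-pad the PSF to the extended size and center it at the origin.
    kernel = np.zeros_like(ext)
    kh, kw = psf.shape
    kernel[:kh, :kw] = psf
    kernel = np.roll(kernel, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    H = np.fft.fft2(kernel)
    # Wiener deconvolution in the frequency domain.
    G = np.conj(H) / (np.abs(H) ** 2 + noise_level)
    restored = np.real(np.fft.ifft2(np.fft.fft2(ext) * G))
    return restored[pad:-pad, pad:-pad]  # crop back to the original support
```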


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Young-Gon Kim ◽  
Sungchul Kim ◽  
Cristina Eunbee Cho ◽  
In Hye Song ◽  
Hee Jin Lee ◽  
...  

Fast and accurate confirmation of metastasis on frozen tissue sections from intraoperative sentinel lymph node biopsy is an essential tool for critical surgical decisions. However, accurate diagnosis by pathologists is difficult within the time limitations. Training a robust and accurate deep learning model is also difficult owing to the limited number of frozen datasets with high-quality labels. To overcome these issues, we validated the effectiveness of transfer learning from CAMELYON16 to improve performance of a convolutional neural network (CNN)-based classification model on our frozen dataset (N = 297) from Asan Medical Center (AMC). Among the 297 whole slide images (WSIs), 157 and 40 WSIs were used to train deep learning models with different dataset ratios of 2, 4, 8, 20, 40, and 100%. The remaining 100 WSIs were used to validate model performance in terms of patch- and slide-level classification. An additional 228 WSIs from Seoul National University Bundang Hospital (SNUBH) were used for external validation. Three initial weights, i.e., scratch-based (random initialization), ImageNet-based, and CAMELYON16-based models, were used to validate their effectiveness on external validation. In the patch-level classification results on the AMC dataset, CAMELYON16-based models trained with a small dataset (up to 40%, i.e., 62 WSIs) showed a significantly higher area under the curve (AUC) of 0.929 than the scratch- and ImageNet-based models at 0.897 and 0.919, respectively, while CAMELYON16-based and ImageNet-based models trained with 100% of the training dataset showed comparable AUCs of 0.944 and 0.943, respectively. For the external validation, CAMELYON16-based models showed higher AUCs than the scratch- and ImageNet-based models. These results validate the feasibility of transfer learning to enhance model performance on frozen-section datasets with limited numbers of samples.
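A generic sketch of the weight-initialization comparison (not the authors' pipeline): the same patch classifier started from random weights, ImageNet weights, or assumed CAMELYON16-pretrained weights, then fine-tuned on frozen-section patches. The backbone, input size, and weight-file name are all illustrative:

```python
# Three initializations of one patch classifier for transfer-learning comparison.
import tensorflow as tf

def build_classifier(init="imagenet"):
    base = tf.keras.applications.ResNet50(
        include_top=False, pooling="avg",
        weights="imagenet" if init == "imagenet" else None,
        input_shape=(256, 256, 3))
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(1, activation="sigmoid"),  # tumor vs. normal patch
    ])
    if init == "camelyon16":
        # assumed: weights saved from pretraining this same model on CAMELYON16
        model.load_weights("camelyon16_pretrained.h5")
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC()])
    return model
```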


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3813
Author(s):  
Athanasios Anagnostis ◽  
Aristotelis C. Tagarakis ◽  
Dimitrios Kateris ◽  
Vasileios Moysiadis ◽  
Claus Grøn Sørensen ◽  
...  

This study proposes an approach for orchard tree segmentation from aerial images based on a deep learning convolutional neural network variant, namely the U-net. The purpose was the automated detection and localization of the canopies of orchard trees under various conditions (i.e., different seasons, tree ages, and levels of weed coverage). The dataset was composed of images from three different walnut orchards. The resulting variability of the dataset yielded images falling under seven different use cases. The best-trained model achieved 91%, 90%, and 87% accuracy for training, validation, and testing, respectively. The trained model was also tested on never-before-seen orthomosaic images of orchards using two methods (oversampling and undersampling) to tackle issues with transparent pixels beyond the field boundary. Even though the training dataset did not contain orthomosaic images, the model reached performance levels of up to 99%, demonstrating the robustness of the proposed approach.
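A compact U-net sketch for binary canopy masks, with depth and widths reduced for brevity (the study's exact configuration is not reproduced here):

```python
# Tiny U-net: encoder, bottleneck, decoder with skip connections.
import tensorflow as tf
from tensorflow.keras import layers

def tiny_unet(size=256):
    inp = tf.keras.Input(shape=(size, size, 3))
    c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inp)
    p1 = layers.MaxPooling2D()(c1)
    c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)
    p2 = layers.MaxPooling2D()(c2)
    b = layers.Conv2D(64, 3, padding="same", activation="relu")(p2)
    u2 = layers.UpSampling2D()(b)
    c3 = layers.Conv2D(32, 3, padding="same", activation="relu")(
        layers.Concatenate()([u2, c2]))  # skip connection
    u1 = layers.UpSampling2D()(c3)
    c4 = layers.Conv2D(16, 3, padding="same", activation="relu")(
        layers.Concatenate()([u1, c1]))  # skip connection
    out = layers.Conv2D(1, 1, activation="sigmoid")(c4)  # canopy probability
    return tf.keras.Model(inp, out)

model = tiny_unet()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```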


2021 ◽  
pp. 159101992110009
Author(s):  
Xinke Liu ◽  
Junqiang Feng ◽  
Zhenzhou Wu ◽  
Zhonghao Neo ◽  
Chengcheng Zhu ◽  
...  

Objective: Accurate diagnosis and measurement of intracranial aneurysms are challenging. This study aimed to develop a 3D convolutional neural network (CNN) model to detect and segment intracranial aneurysms (IA) on 3D rotational angiography (3D-RA) images.
Methods: 3D-RA images were collected and annotated by 5 neuroradiologists. The annotated images were then divided into three datasets: training, validation, and test. A 3D Dense-UNet-like CNN (3D-Dense-UNet) segmentation algorithm was constructed and trained using the training dataset. Diagnostic performance for aneurysm detection and segmentation accuracy were assessed for the final model on the test dataset using free-response receiver operating characteristic (FROC) analysis. Finally, the CNN-inferred maximum diameter was compared against expert measurements using Pearson's correlation and Bland-Altman limits of agreement (LOA).
Results: A total of 451 patients with 3D-RA images were split into n = 347/41/63 training/validation/test datasets, respectively. For aneurysm detection, FROC analysis showed that the model attained a sensitivity of 0.710 at 0.159 false positives (FP)/case and 0.986 at 1.49 FP/case. The proposed method had good agreement with reference manual measurements of aneurysm maximum diameter (8.3 ± 4.3 mm vs. 7.8 ± 4.8 mm), with a correlation coefficient r = 0.77, a small bias of 0.24 mm, and LOA of -6.2 to 5.71 mm. 37.0% and 77% of diameter measurements were within ±1 mm and ±2.5 mm of expert measurements, respectively.
Conclusions: A 3D-Dense-UNet model can detect and segment aneurysms with relatively high accuracy from 3D-RA images. The automatically measured maximum diameter has potential clinical application value.
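The agreement statistics in the Results can be computed as in the sketch below (illustrative, not the authors' code): Pearson's r, Bland-Altman bias with 95% limits of agreement, and the fraction of measurements within a tolerance:

```python
# Agreement statistics between model-inferred and expert diameters (in mm).
import numpy as np

def agreement(model_mm, expert_mm):
    model_mm, expert_mm = np.asarray(model_mm), np.asarray(expert_mm)
    r = np.corrcoef(model_mm, expert_mm)[0, 1]       # Pearson correlation
    diff = model_mm - expert_mm
    bias = diff.mean()                               # Bland-Altman mean difference
    sd = diff.std(ddof=1)
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)       # 95% limits of agreement
    within_1mm = np.mean(np.abs(diff) <= 1.0)        # fraction within ±1 mm
    return r, bias, loa, within_1mm
```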

