image transformations
Recently Published Documents


TOTAL DOCUMENTS: 164 (FIVE YEARS: 40)
H-INDEX: 17 (FIVE YEARS: 3)

Author(s):  
J. Venton ◽  
P. M. Harris ◽  
A. Sundar ◽  
N. A. S. Smith ◽  
P. J. Aston

The electrocardiogram (ECG) is a widespread diagnostic tool in healthcare and supports the diagnosis of cardiovascular disorders. Deep learning methods are a successful and popular approach to detecting indications of disorders from an ECG signal. However, there are open questions around the robustness of these methods to various factors, including physiological ECG noise. In this study, we generate clean and noisy versions of an ECG dataset before applying symmetric projection attractor reconstruction (SPAR) and scalogram image transformations. A convolutional neural network is used to classify these image transforms. For the clean ECG dataset, F1 scores for the SPAR attractor and scalogram transforms were 0.70 and 0.79, respectively. Scores decreased by less than 0.05 for the noisy ECG datasets. Notably, when the network trained on clean data was used to classify the noisy datasets, F1 scores decreased by up to 0.18. However, when the network trained on the noisy data was used to classify the clean dataset, the decrease was less than 0.05. We conclude that physiological ECG noise impacts classification using deep learning methods and that careful consideration should be given to the inclusion of noisy ECG signals in the training data when developing supervised networks for ECG classification. This article is part of the theme issue ‘Advanced computation in cardiovascular physiology: new challenges and opportunities’.
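
A minimal sketch of the scalogram side of such a pipeline is given below, using PyWavelets to turn a one-dimensional ECG segment into a time-frequency image; the wavelet, scale range, sampling rate, and synthetic signal are illustrative assumptions rather than the settings used in the study.

```python
# Sketch: 1-D ECG segment -> scalogram image via continuous wavelet transform.
# Wavelet choice, scales, and sampling rate are assumptions, not the paper's.
import numpy as np
import pywt

fs = 360                                # assumed sampling rate (Hz)
t = np.arange(0, 10, 1 / fs)            # 10 s synthetic segment
ecg = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)

scales = np.arange(1, 128)              # assumed scale range
coeffs, freqs = pywt.cwt(ecg, scales, "morl", sampling_period=1 / fs)
scalogram = np.abs(coeffs)              # (127, N) magnitude image

# The 2-D array `scalogram` can then be resized, normalised, and fed to a CNN.
```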


2021 ◽  
Author(s):  
Guanyue Li ◽  
Yi Liu ◽  
Xiwen Wei ◽  
Yang Zhang ◽  
Si Wu ◽  
...  

2021 ◽  
Vol 11 (16) ◽  
pp. 7397
Author(s):  
Mauricio Maldonado-Chan ◽  
Andres Mendez-Vazquez ◽  
Ramon Osvaldo Guardado-Medina

Gated networks are networks that contain gating connections, in which the outputs of at least two neurons are multiplied. The basic idea of a gated restricted Boltzmann machine (RBM) model is to use binary hidden units to learn the conditional distribution of one image (the output) given another image (the input). This allows the hidden units of a gated RBM to model the transformations between two successive images. Inference in the model consists of extracting the transformations given a pair of images. However, a fully connected multiplicative network has cubically many parameters, forming a three-dimensional interaction tensor that requires substantial memory and computation for inference and training. In this paper, we parameterize the bilinear interactions in the gated RBM through a multimodal tensor-based Tucker decomposition, which factors a tensor into a set of matrices and one (usually smaller) core tensor. This parameterization reduces the number of model parameters, lowers the computational cost of learning, and strengthens structured feature learning. We show that, when trained on affine transformations of still images, a completely unsupervised network learns explicit encodings of image transformations.
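
The parameter saving from such a Tucker factorisation can be illustrated with a short sketch using TensorLy; the tensor sizes and ranks below are assumptions chosen for illustration, not the dimensions used in the paper.

```python
# Sketch: replace a dense 3-way interaction tensor W with a Tucker
# factorisation (core tensor plus three factor matrices).
# Sizes and ranks are illustrative assumptions.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

nx, ny, nh = 64, 64, 100            # input, output, hidden dimensions (assumed)
rx, ry, rh = 16, 16, 32             # Tucker ranks (assumed)

W = tl.tensor(np.random.randn(nx, ny, nh))       # dense interaction tensor
core, factors = tucker(W, rank=[rx, ry, rh])     # core: (rx, ry, rh)

# Parameter count drops from nx*ny*nh to rx*ry*rh + nx*rx + ny*ry + nh*rh.
dense_params = nx * ny * nh                                   # 409600
tucker_params = rx * ry * rh + nx * rx + ny * ry + nh * rh    # 13440

W_hat = tl.tucker_to_tensor((core, factors))     # reconstruction for inference
```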


2021 ◽  
Author(s):  
Peter Washington ◽  
Emilie Leblanc ◽  
Kaitlyn Dunlap ◽  
Aaron Kline ◽  
Cezmi Mutlu ◽  
...  

Artificial intelligence (A.I.) solutions are increasingly considered for telemedicine. For these methods to be adopted in behavioral pediatrics, serving children and their families in home settings, it will be crucial to ensure the privacy of the child and parent subjects in the videos. To address this challenge in A.I. for healthcare, we explore the potential for global image transformations to provide privacy while preserving behavioral annotation quality. Crowd workers have previously been shown to reliably annotate behavioral features in unstructured home videos, allowing machine learning classifiers to detect autism using the annotations as input. We evaluate this method on videos altered via pixelation, dense optical flow, and Gaussian blurring. On a balanced test set of 30 videos of children with autism and 30 neurotypical controls, we find that the visual privacy alterations do not drastically alter any individual behavioral annotation at the item level. The AUROC on the evaluation set was 90.0% +/- 7.5% for the unaltered condition, 85.0% +/- 9.0% for pixelation, 85.0% +/- 9.0% for optical flow, and 83.3% +/- 9.3% for blurring, demonstrating that small changes across multiple behavioral questions can aggregate into increased misdiagnosis rates. We also compare crowd answers against clinicians who provided the same annotations on the same videos and find that clinicians are more sensitive to autism-related symptoms. Finally, we find a linear correlation (r=0.75, p<0.0001) between the mean Clinical Global Impression (CGI) score provided by professional clinicians and the corresponding score emitted by the logistic regression classifier with crowd inputs, indicating that the classifier's output probability is a reliable estimate of the clinical impression of autism from home videos. A significant correlation is maintained under the privacy alterations, indicating that crowd annotations can approximate clinician-provided autism impressions from home videos in a privacy-preserving manner.
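
A minimal sketch of the three visual-privacy transformations, written with OpenCV, is given below; the pixelation factor, blur kernel size, and optical-flow parameters are assumptions rather than the settings used in the study.

```python
# Sketch of pixelation, Gaussian blurring, and dense optical flow with OpenCV.
# All parameters are illustrative assumptions.
import cv2

def pixelate(frame, factor=16):
    """Downsample then upsample so fine detail becomes coarse blocks."""
    h, w = frame.shape[:2]
    small = cv2.resize(frame, (w // factor, h // factor),
                       interpolation=cv2.INTER_LINEAR)
    return cv2.resize(small, (w, h), interpolation=cv2.INTER_NEAREST)

def blur(frame, ksize=31):
    """Gaussian blur with an assumed kernel size."""
    return cv2.GaussianBlur(frame, (ksize, ksize), 0)

def dense_optical_flow(prev_gray, curr_gray):
    """Farneback dense optical flow between consecutive grayscale frames."""
    return cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
```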


2021 ◽  
Author(s):  
Aran Nayebi ◽  
Nathan C. L. Kong ◽  
Chengxu Zhuang ◽  
Justin L. Gardner ◽  
Anthony M. Norcia ◽  
...  

Task-optimized deep convolutional neural networks are the most quantitatively accurate models of the primate ventral visual stream. However, such networks are implausible as models of the mouse visual system: mouse visual cortex is known to have a shallower hierarchy, and the supervised objectives these networks are typically trained with are likely ethologically relevant in neither content nor quantity. Here we develop shallow network architectures that are more consistent with anatomical and physiological studies of mouse visual cortex than current models. We demonstrate that hierarchically shallow architectures trained with contrastive objective functions applied to visual-acuity-adapted images achieve neural prediction performance that exceeds that of the same architectures trained in a supervised manner, yielding the most quantitatively accurate models of the mouse visual system. Moreover, these models' neural predictivity significantly surpasses that of supervised, deep architectures that are known to correspond well to the primate ventral visual stream. Finally, we derive a novel measure of inter-animal consistency and show that the best models closely match this quantity across visual areas. Taken together, our results suggest that contrastive objectives operating on shallow architectures with ethologically motivated image transformations may be a biologically plausible computational theory of visual coding in mice.
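
As an illustration of the kind of contrastive objective referred to above, the sketch below implements a generic SimCLR-style NT-Xent loss in PyTorch; the paper's actual objective, architecture, augmentations, and temperature are not given in the abstract, so every detail here is an assumption.

```python
# Sketch: SimCLR-style NT-Xent contrastive loss (illustrative only).
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """z1, z2: (N, D) embeddings of two augmented views of the same images."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)        # (2N, D)
    sim = z @ z.t() / temperature                             # scaled cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))                # exclude self-pairs
    # The positive for sample i is its other augmented view at index (i + n) mod 2N.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```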


2021 ◽  
Author(s):  
Rosyl Somai ◽  
Peter Hancock

One challenge in exploring the internal representation of faces is the lack of controlled stimulus transformations. Researchers are often limited to verbalizable transformations when creating a dataset. An alternative to verbalization for interpretability is to find image-based measures that quantify image transformations. In this study, we explore whether principal component analysis (PCA) can be used to create controlled transformations of a face, by testing the effect of these transformations on human perceptual similarity and on computational differences in Gabor, pixel, and DNN spaces. We found that perceptual similarity and the three image-based spaces are linearly related, almost perfectly in the case of the DNN, with a correlation of 0.94. This provides a controlled way to alter the appearance of a face. Experiment 2 explored the effect of familiarity on the perception of multidimensional transformations. Our findings show a positive relationship between the number of components transformed and both the perceptual similarity and the same three image-based spaces used in Experiment 1. Furthermore, we found that familiar faces are rated as more similar overall than unfamiliar faces: a change to a familiar face is perceived as making less difference than the exact same change to an unfamiliar face. The ability to quantify, and thus control, these transformations is a powerful tool for exploring the factors that mediate a change in perceived identity.
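
A minimal sketch of creating such PCA-based face transformations with scikit-learn is given below; the face matrix, number of components, and step size are illustrative assumptions, not the parameters used in the study.

```python
# Sketch: shift a face image along its leading principal components.
# Data and settings are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA

# faces: (n_images, n_pixels) matrix of flattened, aligned face images (assumed)
faces = np.random.rand(200, 64 * 64)

pca = PCA(n_components=50).fit(faces)
coords = pca.transform(faces)                  # each face as 50 PCA coordinates

k, step = 5, 2.0                               # assumed components altered / step size
altered = coords[0].copy()
altered[:k] += step * np.sqrt(pca.explained_variance_[:k])
transformed = pca.inverse_transform(altered.reshape(1, -1)).reshape(64, 64)
```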


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Marin Bencevic ◽  
Irena Galic ◽  
Marija Habijan ◽  
Danilo Babin
