perceptual distance
Recently Published Documents


TOTAL DOCUMENTS

96
(FIVE YEARS 35)

H-INDEX

12
(FIVE YEARS 2)

2022 ◽  
Vol 12 (2) ◽  
pp. 827
Author(s):  
Ki-Seung Lee

Moderate performance in terms of intelligibility and naturalness can be obtained using previously established silent speech interface (SSI) methods. Nevertheless, a common problem with SSI has been poor estimation of spectral detail, which results in synthesized speech signals that are rough, harsh, and unclear. In this study, harmonic enhancement (HE) was used during postprocessing to alleviate this problem by emphasizing the spectral fine structure of speech signals. To improve the subjective quality of synthesized speech, the difference between synthesized and actual speech was measured as a distance in the perceptual domain instead of with the conventional mean square error (MSE). Two deep neural networks (DNNs), connected in a cascade, were employed to separately estimate the speech spectra and the filter coefficients of HE. The DNNs were trained to incrementally and iteratively minimize both the MSE and the perceptual distance (PD). A feasibility test showed that the perceptual evaluation of speech quality (PESQ) and the short-time objective intelligibility measure (STOI) were improved by 17.8% and 2.9%, respectively, compared with previous methods. Subjective listening tests revealed that the proposed method yielded perceptually preferred results compared with those of the conventional MSE-based method.


Author(s):  
Hiroki Higuchi ◽  
Tessei Kobayashi

Letter similarity (i.e., perceptual distance) is a critical measure for understanding letter perception and literacy development. Despite its importance, measurements of letter similarity for non-alphabetic scripts are limited, and this shortage hinders the identification of what is universal and what is unique in letter perception systems across different scripts. In the present study, we provide a comprehensive matrix of letter similarity for Japanese kana letters (hiragana and katakana). We obtained the discrimination reaction times for simultaneously presented letter pairs and calculated the perceptual distance of 4,278 letter pairs by taking the inverse of the reaction time. The matrix showed significant correlations with previously obtained letter similarity for hiragana and katakana. An additional experiment showed that letter pairs for the same sounds (え–エ) produced significantly slower responses compared with those for different sounds (え–コ). However, the differences in reaction times between the same and different sound conditions were smaller than those in the sequentially presented conditions, suggesting that the matrix was partially attributable to knowledge-based factors (e.g., letter-sound knowledge). This first comprehensive matrix of letter similarity (i.e., perceptual distance) for Japanese kana letters (hiragana and katakana) will be useful for researchers interested in letter perception and literacy development.
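
The inverse-time step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `perceptual_distances`, the millisecond unit, and the two reaction times are assumptions made up for the example.

```python
def perceptual_distances(mean_rt_ms):
    """Map each letter pair to the inverse of its mean reaction time.

    mean_rt_ms: dict of (letter, letter) -> mean discrimination time in ms.
    Returns distances in 1/seconds. A slowly discriminated pair is treated
    as perceptually close, so it receives a small distance.
    """
    return {pair: 1000.0 / rt for pair, rt in mean_rt_ms.items()}


# Hypothetical reaction times, NOT data from the study.
mean_rt_ms = {
    ("え", "エ"): 800.0,  # same sound, slower to tell apart
    ("え", "コ"): 500.0,  # different sound, faster to tell apart
}
distances = perceptual_distances(mean_rt_ms)
```

With these made-up inputs, the slower pair ends up with the smaller distance, which is the direction the abstract describes.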


2022 ◽  
Vol 2161 (1) ◽  
pp. 012024
Author(s):  
Padmashree Desai ◽  
C Sujatha ◽  
Saumyajit Chakraborty ◽  
Saurav Ansuman ◽  
Sanika Bhandari ◽  
...  

Intelligent decision-making systems require the ability to forecast, foresee, and reason about future events. Video frame prediction has attracted considerable attention because of its usefulness in many computer vision applications, such as autonomous vehicles and robots. Recent deep learning advances have significantly improved video prediction performance. Nevertheless, as top-performing systems attempt to foresee even more future frames, their predictions become increasingly blurry. We developed a method for predicting a future frame from a series of prior frames that uses the Convolutional Long Short-Term Memory (ConvLSTM) model. The input video is segmented into frames, which are fed to the ConvLSTM model to extract features and forecast a future frame, which can be beneficial in a variety of applications. We used two metrics to measure the quality of the predicted frame: structural similarity index (SSIM) and perceptual distance, which help in understanding the difference between the actual frame and the predicted frame. The UCF101 data set is used for training and testing in the project. It is a collection of realistic action videos taken from YouTube, with 101 action categories for action detection. The ConvLSTM model is trained and tested on 24 categories from this dataset, and a future frame is predicted, which yields satisfactory results. We obtained SSIM as 0.95 and perceptual similarity as 24.28 for our system. The suggested work’s results are also compared to those of state-of-the-art approaches, which are shown to be superior.
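
For readers unfamiliar with SSIM, the sketch below computes a simplified, single-window version of the index over two whole frames. It is an illustration only: the abstract does not say how the authors computed SSIM, practical implementations usually average the index over local windows, and the name `global_ssim`, the `data_range` parameter, and the sample frames are assumptions for this example. The constants 0.01 and 0.03 are the commonly used defaults.

```python
def global_ssim(frame_a, frame_b, data_range=1.0):
    """Single-window SSIM between two equally sized grayscale frames.

    Frames are lists of rows of pixel values in [0, data_range].
    """
    a = [float(v) for row in frame_a for v in row]
    b = [float(v) for row in frame_b for v in row]
    if not a or len(a) != len(b):
        raise ValueError("frames must be non-empty and the same size")

    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((v - mu_a) ** 2 for v in a) / n
    var_b = sum((v - mu_b) ** 2 for v in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n

    # Small constants that stabilise the ratio when means or variances
    # are close to zero.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


# Hypothetical 2x2 frames: a gradient and its inverted copy.
actual = [[0.0, 0.25], [0.75, 1.0]]
inverted = [[1.0, 0.75], [0.25, 0.0]]
same_score = global_ssim(actual, actual)
inverted_score = global_ssim(actual, inverted)
```

An identical pair scores 1 (up to rounding), and the inverted frame scores far lower because its covariance with the original is negative.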


2021 ◽  
Author(s):  
Jean Ann Patterson

Change was a constant companion for New Zealand midwives during the 1990s. The Nurses Amendment Act 1990, which restored midwifery autonomy, was only one of a constellation of changes that saw significant restructuring of the health services in small communities. The purpose of this study was to look at the issues for a group of midwives in rural South Otago who took the opportunity to work independently and offer local women a choice of maternity care during this time. In this study, five rural midwives were interviewed and met subsequently in a focus group. The transcripts were analyzed using discourse analysis informed by a postmodern/feminist theoretical framework. In addition, the local newspapers covering the years 1990-1999 were read with a particular focus on the reports of health changes. These texts were also subjected to a discourse analysis using Lyotard's (1997) notion of language games and bell hooks's (1990) ideas around strategic positioning for the marginalised. To practise autonomously, the midwives in this study perform an intricate dance, balancing the contradictions of competing discourses. Their positioning and place of difference is tensioned primarily by a deep sense of community commitment and entanglement, and also by a feeling of physical and perceptual distance from their urban midwifery colleagues. This is underpinned by a staunch belief in women's ability to birth safely in their local area. The findings of this study suggest that the continuation of a comprehensive rural midwifery service is challenged by changes in the arrangement and funding of rural health, plus the increasing use of medical and technological intervention in childbirth. For rural midwifery to survive, this study shows that midwives need to remain flexible and alert while continuing to align themselves with women, who are their primary source of support and inspiration. At the same time, they need to forge strategic linkages and alliances, both local and national, that will allow them to move and reposition in order to continue their work and provide a realistic childbirth choice for rural women.


2021 ◽  
Vol 111 (10) ◽  
pp. 3225-3255
Author(s):  
Benjamin Hébert ◽  
Michael Woodford

We derive a new cost of information in rational inattention problems, the neighborhood-based cost functions, starting from the observation that many settings involve exogenous states with a topological structure. These cost functions are uniformly posterior separable and capture notions of perceptual distance. This second property ensures that neighborhood-based costs, unlike mutual information, make accurate predictions about behavior in perceptual experiments. We compare the implications of our neighborhood-based cost functions with those of the mutual information in a series of applications: perceptual judgments, the general environment of binary choice, regime-change games, and linear-quadratic-Gaussian settings. (JEL C70, D11, D82, D83, D91)
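
The contrast with mutual information drawn in the abstract above can be made concrete with a small sketch. It computes only the standard Shannon mutual information cost, not the paper's neighborhood-based cost; the three-state example, the function name `mutual_information`, and both signal structures are assumptions chosen for illustration.

```python
import math


def mutual_information(prior, likelihood):
    """Mutual information (in nats) between a discrete state and a signal.

    prior[i] is P(state i); likelihood[i][s] is P(signal s | state i).
    """
    n_signals = len(likelihood[0])
    marginal = [
        sum(p * row[s] for p, row in zip(prior, likelihood))
        for s in range(n_signals)
    ]
    total = 0.0
    for p, row in zip(prior, likelihood):
        for s, q in enumerate(row):
            if p > 0 and q > 0:
                total += p * q * math.log(q / marginal[s])
    return total


# Three states ordered on a line (0, 1, 2) with a uniform prior.
prior = [1 / 3, 1 / 3, 1 / 3]

# Signal A separates state 0 from states {1, 2}.
signal_a = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# Signal B separates the middle state 1 from states {0, 2}.
signal_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

cost_a = mutual_information(prior, signal_a)
cost_b = mutual_information(prior, signal_b)
```

Both signals receive the same cost, because mutual information depends only on the probabilities involved and not on which states are neighbors. That insensitivity to the ordering of states is what a cost capturing perceptual distance would have to relax.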


2021 ◽  
Author(s):  
Pierre-Yves Jonin ◽  
Julie Coloignier ◽  
Elise Bannier ◽  
Gabriel Besson

Humans can recognize thousands of visual objects after a single exposure, even against highly confusable objects, and despite viewpoint changes between learning and recognition. Memory consolidation processes like those taking place during wakeful rest contribute to such a feat, possibly by protecting the fine details of objects’ representations. However, whether rest-related consolidation promotes the viewpoint invariance of mnemonic representations for individual objects remains unexplored. Fifteen participants underwent a speeded visual recognition memory task tapping on familiarity-based recognition of individual objects, across four conditions manipulating post-encoding rest. Viewpoints of target items were modified between study and test while controlling study-test perceptual distance, and targets and lures shared the same subordinate category, making recognition independent from perceptual and conceptual fluency. Performance was very accurate, even without post-encoding rest, which did not enhance memory. However, rest uniquely made target detection immune to study-test perceptual distance. These findings suggest that very short periods of wakeful rest (down to 2-sec post-stimulus) suffice to achieve complete mnemonic viewpoint-invariance, pushing forward the strength of post-encoding rest in learning and memory. They also strongly argue for a holistic, viewpoint-invariant, mnemonic representation of visual objects.


2021 ◽  
Vol 1 ◽  
Author(s):  
Nicolás Vattuone ◽  
Thomas Wachtler ◽  
Inés Samengo

Our sensory systems transform external signals into neural activity, thereby producing percepts. We are endowed with an intuitive notion of similarity between percepts, that need not reflect the proximity of the physical properties of the corresponding external stimuli. The quantitative characterization of the geometry of percepts is therefore an endeavour that must be accomplished behaviorally. Here we characterized the geometry of color space using discrimination and matching experiments. We proposed an individually tailored metric defined in terms of the minimal chromatic difference required for each observer to differentiate a stimulus from its surround. Next, we showed that this perceptual metric was particularly adequate to describe two additional experiments, since it revealed the natural symmetry of perceptual computations. In one of the experiments, observers were required to discriminate two stimuli surrounded by a chromaticity that differed from that of the tested stimuli. In the perceptual coordinates, the change in discrimination thresholds induced by the surround followed a simple law that only depended on the perceptual distance between the surround and each of the two compared stimuli. In the other experiment, subjects were asked to match the color of two stimuli surrounded by two different chromaticities. Again, in the perceptual coordinates the induction effect produced by surrounds followed a simple, symmetric law. We conclude that the individually-tailored notion of perceptual distance reveals the symmetry of the laws governing perceptual computations.

Comment: 42 pages, 9 figures, 1 appendix. (v2) 47 pages, 10 figures, 1 appendix. (v3) Text modified after peer-review process. (v4) 34 pages, 1 appendix, 10 figures. Article accepted to be published at Mathematical Neuroscience and Applications.


2021 ◽  
Vol 4 ◽  
Author(s):  
Carson Molder ◽  
Benjamin Lowe ◽  
Justin Zhan

Deep learning models have been shown to be effective for material analysis, a subfield of computer vision, on natural images. In medicine, deep learning systems have been shown to more accurately analyze radiography images than algorithmic approaches and even experts. However, one major roadblock to applying deep learning-based material analysis on radiography images is a lack of material annotations accompanying image sets. To solve this, we first introduce an automated procedure to augment annotated radiography images into a set of material samples. Next, using a novel Siamese neural network that compares material sample pairs, called D-CNN, we demonstrate how to learn a perceptual distance metric between material categories. This system replicates the actions of human annotators by discovering attributes that encode traits that distinguish materials in radiography images. Finally, we update and apply MAC-CNN, a material recognition neural network, to demonstrate this system on a dataset of knee X-rays and brain MRIs with tumors. Experiments show that this system has strong predictive power on these radiography images, achieving 92.8% accuracy at predicting the material present in a local region of an image. Our system also draws interesting parallels between human perception of natural materials and materials in radiography images.

