temporal masking
Recently Published Documents


Total documents: 134 (five years: 10)

H-index: 17 (five years: 1)

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Apostolos Argyris ◽  
Janek Schwind ◽  
Ingo Fischer

Abstract: Despite the conceptual simplicity of hardware reservoir computing, the various implementation schemes that have been proposed so far still face a range of challenges. The conceptually simplest implementation uses a time delay approach, where one replaces the ensemble of nonlinear nodes with a single nonlinear node connected to a delayed feedback loop. This simplification comes at a price in other parts of the implementation: repetitive temporal masking sequences are required to map the input information onto the diverse states of the time delay reservoir. These sequences are commonly introduced by arbitrary waveform generators, which is an expensive approach when exploring ultra-fast processing speeds. Here we propose the physical generation of clock-free, sub-nanosecond repetitive patterns with increased intra-pattern diversity, and their use as masking sequences. To that end, we investigate numerically a semiconductor laser with a short optical feedback cavity, a well-studied dynamical system that provides a wide diversity of emitted signals. We focus on those operating conditions that lead to periodic signal generation, with multiple harmonic frequency tones and sub-nanosecond limit cycle dynamics. By tuning the strength of the different frequency tones in the microwave domain, we access a variety of repetitive patterns and sample them in order to obtain the desired masking sequences. Finally, we apply them in a time delay reservoir computing approach and test them in a nonlinear time-series prediction task. In a performance comparison with masking sequences that originate from random values, we find that only minor compromises are made while significantly reducing the instrumentation requirements of the time delay reservoir computing system.
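The masking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' laser model: a sum of harmonic sine tones stands in for the physically generated periodic pattern, and the names `periodic_mask`, `mask_input`, and `tone_amplitudes` are hypothetical. The sketch only shows how one period of a repetitive pattern is sampled into a mask and then repeated for every input sample, giving one masked value per virtual node of the delay reservoir.

```python
import math


def periodic_mask(n_nodes, tone_amplitudes):
    """Sample one period of a sum of harmonic tones at n_nodes equally spaced points.

    tone_amplitudes[k] is the assumed strength of harmonic k + 1. Plain sines
    are an illustrative stand-in for the laser output studied in the paper.
    """
    mask = []
    for i in range(n_nodes):
        phase = 2.0 * math.pi * i / n_nodes
        value = sum(a * math.sin((k + 1) * phase)
                    for k, a in enumerate(tone_amplitudes))
        mask.append(value)
    return mask


def mask_input(inputs, mask):
    """Repeat the same mask for every input sample.

    Returns one row per input sample and one column per virtual node.
    """
    return [[u * m for m in mask] for u in inputs]


if __name__ == "__main__":
    mask = periodic_mask(8, [1.0, 0.5, 0.25])
    for row in mask_input([0.3, -0.7], mask):
        print([round(x, 3) for x in row])
```

Changing the entries of `tone_amplitudes` changes the shape of the repeated pattern, which loosely mirrors the tuning of tone strengths mentioned in the abstract.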


PLoS ONE ◽  
2021 ◽  
Vol 16 (1) ◽  
pp. e0244433
Author(s):  
Eugen Kludt ◽  
Waldo Nogueira ◽  
Thomas Lenarz ◽  
Andreas Buechner

Auditory masking occurs when one sound is perceptually altered by the presence of another sound. Auditory masking in the frequency domain is known as simultaneous masking, and in the time domain it is known as temporal masking or non-simultaneous masking. This work presents a sound coding strategy that incorporates a temporal masking model to select the most relevant channels for stimulation in a cochlear implant (CI). A previous version of the strategy, termed psychoacoustic advanced combination encoder (PACE), only used a simultaneous masking model for the same purpose. For this reason, the new strategy has been termed temporal-PACE (TPACE). We hypothesized that a sound coding strategy that focuses on stimulating the auditory nerve with pulses that are as little masked as possible can improve speech intelligibility for CI users. The temporal masking model used within TPACE attenuates the simultaneous masking thresholds estimated by PACE over time. The attenuation is designed to fall exponentially with a strength determined by a single parameter, the temporal masking half-life T½. This parameter gives the time interval at which the simultaneous masking threshold is halved. The study group consisted of 24 postlingually deaf subjects with a minimum of six months of experience after CI activation. A crossover design was used to compare four variants of the new temporal masking strategy TPACE (T½ ranging between 0.4 and 1.1 ms) with the clinical MP3000 strategy, a commercial implementation of the PACE strategy, in two prospective, within-subject, repeated-measure experiments. The outcome measure was speech intelligibility in noise at 15 to 5 dB SNR. In two consecutive experiments, TPACE with a T½ of 0.5 ms obtained a speech performance increase of 11% and 10%, respectively, over the MP3000 (T½ = 0 ms).
The improved speech test scores correlated with the clinical performance of the subjects: CI users with above-average outcomes in their routine speech tests showed a higher benefit with TPACE. It seems that the consideration of short-acting temporal masking can improve speech intelligibility in CI users. The half-life with the highest average speech perception benefit (0.5 ms) corresponds to time scales that are typical for neuronal refractory behavior.
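The exponential attenuation described in this abstract, in which the masking threshold is halved after every interval T½, can be written as a short sketch. The function name `decayed_threshold` is hypothetical, and the abstract does not state whether thresholds are handled in linear or logarithmic units, so a linear threshold is assumed here. The case T½ = 0 is treated as "no temporal masking", following the abstract's description of MP3000 as T½ = 0 ms.

```python
def decayed_threshold(threshold, elapsed_ms, half_life_ms):
    """Masking threshold remaining after elapsed_ms, halved every half_life_ms.

    Assumes a linear-scale threshold; half_life_ms == 0 means the masking
    does not persist beyond the instant it was estimated.
    """
    if elapsed_ms < 0 or half_life_ms < 0:
        raise ValueError("elapsed time and half-life must be non-negative")
    if half_life_ms == 0:
        return threshold if elapsed_ms == 0 else 0.0
    return threshold * 0.5 ** (elapsed_ms / half_life_ms)


if __name__ == "__main__":
    # With a half-life of 0.5 ms the threshold halves every 0.5 ms.
    for t in (0.0, 0.5, 1.0, 1.5):
        print(t, decayed_threshold(1.0, t, 0.5))
```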


2021 ◽  
Vol 25 ◽  
pp. 233121652110161
Author(s):  
Michal Fereczkowski ◽  
Torsten Dau ◽  
Ewen N. MacDonald

While an audiogram is a useful method of characterizing hearing loss, it has been suggested that including a complementary, suprathreshold measure, for example, a measure of the status of the cochlear active mechanism, could lead to improved diagnostics and improved hearing-aid fitting in individual listeners. While several behavioral and physiological methods have been proposed to measure the cochlear-nonlinearity characteristics, evidence of a good correspondence between them is lacking, at least in the case of hearing-impaired listeners. If this lack of correspondence is due to, for example, limited reliability of one of these measures, it might be a reason for the limited evidence of the benefit of measuring peripheral compression. The aim of this study was to investigate the relation between measures of the peripheral-nonlinearity status estimated using two psychoacoustical methods (based on the notched-noise and temporal-masking curve methods) and otoacoustic emissions, on a large sample of hearing-impaired listeners. While the relation between the estimates from the notched-noise and the otoacoustic emissions experiments was found to be stronger than predicted by the audiogram alone, the relations between these two measures and the temporal-masking-curve-based measure did not show the same pattern; that is, the variance shared by either of the two measures with the temporal-masking-curve-based measure was also shared with the audiogram.


Acta Acustica ◽  
2020 ◽  
Vol 5 ◽  
pp. 1
Author(s):  
Florian Wendt ◽  
Robert Höldrich

Studies on the precedence effect are typically conducted by presenting two identical sounds simulating direct sound and specular reflection. However, when a sound is reflected from an irregular surface, it is redirected in many directions, resulting in directional and temporal diffusion. This contribution introduces a simulation of Lambertian diffusing reflections. The perceptual influences of diffusion are studied in a listening experiment; echo thresholds and masked thresholds of specular and diffuse reflections are measured. Results show that diffusion makes the reflections more easily detectable than specular reflections of the same total energy. Indications are found that this is mainly due to temporal diffusion, while the directional diffusion has little effect. Accordingly, the modeling of the echo thresholds is achieved by a temporal alignment of the experimental data based on the energy centroid of reflection responses. For the modeling of the masked threshold, the temporal masking pattern for forward masking is taken into account.
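The energy centroid mentioned in this abstract can be illustrated with a small sketch. It assumes the common definition of the centroid as the energy-weighted mean arrival time of a sampled reflection response; the abstract does not spell out the exact definition the authors use, and the function name `energy_centroid` is hypothetical.

```python
def energy_centroid(times_ms, response):
    """Energy-weighted mean time of a sampled reflection response.

    times_ms and response are equal-length sequences; energy is taken as the
    squared amplitude of each sample (an assumption of this sketch).
    """
    if len(times_ms) != len(response):
        raise ValueError("times_ms and response must have the same length")
    energies = [h * h for h in response]
    total = sum(energies)
    if total == 0:
        raise ValueError("response carries no energy")
    return sum(t * e for t, e in zip(times_ms, energies)) / total


if __name__ == "__main__":
    # A specular reflection concentrates its energy in one sample, while a
    # temporally diffuse one spreads it out and shifts the centroid later.
    times = [0.0, 1.0, 2.0, 3.0]
    print(energy_centroid(times, [1.0, 0.0, 0.0, 0.0]))
    print(energy_centroid(times, [0.5, 0.5, 0.5, 0.5]))
```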


2020 ◽  
pp. 002029402094494
Author(s):  
Ali Akbar Siddique ◽  
M Tahir Qadr ◽  
Zia Mohy-Ud-Din

Every video stream possesses temporal redundancy that depends on the amount of motion present in it. A large amount of motion in a video sequence may cause distorting artifacts, and in order to avoid them, it is possible to mask the motion or temporal activity that is not noticeable to the human eye in real time. Artifacts such as blockiness and blurriness are introduced into the video sequence as soon as it is subjected to compression, and they tend to become more intense as temporal activity increases. In this paper, an algorithm is proposed that uses a temporal masking coefficient (q) to mask the temporal activity that is unnoticeable to the human eye, in order to bring down the distortion levels. The quality of the video sequence, and thus its overall quality index, can be adjusted by varying the q parameter. Frames are extracted from the video sequence, and displacement or motion vectors are calculated from consecutive frames using a bi-directional block matching algorithm. These motion vectors are used to estimate the quantity of motion present between consecutive frames of the same scene. The video sequences used for this purpose are in H.264 format. Temporal masking is performed on a video sequence with and without the use of motion vectors. Structural similarity index and peak signal-to-noise ratio are the quality measurement tools used to assess the performance of the proposed algorithm. A bit-rate saving of 1.2% was achieved by implementing the proposed algorithm at q = 1, in contrast to the standard H.264/Advanced Video Coding.
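Of the two quality measures named in this abstract, peak signal-to-noise ratio is simple enough to sketch directly. The block below uses the standard definition, PSNR = 10 · log10(MAX² / MSE), on frames given as nested lists of pixel values. The 8-bit peak value of 255 and the function name `psnr` are assumptions of this sketch, not details taken from the paper.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are nested lists of pixel values (rows of pixels); max_value is
    the assumed peak pixel value (255 for 8-bit video).
    """
    squared_errors = [
        (r - d) ** 2
        for row_r, row_d in zip(reference, distorted)
        for r, d in zip(row_r, row_d)
    ]
    if not squared_errors:
        raise ValueError("frames must not be empty")
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    compressed = [[12, 18], [30, 41]]
    print(round(psnr(original, compressed), 2), "dB")
```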


2020 ◽  
Author(s):  
Eugen Kludt ◽  
Waldo Nogueira ◽  
Thomas Lenarz ◽  
Andreas Büchner

Objective: Evaluation of the speech intelligibility benefit of adding a temporal masking model to the MP3000 speech coding strategy. Study Design: A crossover design was used to compare variants of the new temporal masking strategy TPACE with the clinical MP3000 strategy. Setting: The study was a prospective, within-subject, repeated-measure experiment. Patients: The study group consisted of 24 postlingually deaf subjects with a minimum of six months of experience after CI activation. Intervention: Five conditions were evaluated: the original MP3000 algorithm and TPACE with four different settings of the temporal masking parameter. Main Outcome Measures: Speech perception test in Comité Consultatif International Téléphonique et Télégraphique (CCITT) noise at 15 to 5 dB SNR. Results: Average speech intelligibility improved with TPACE (T½ = 0.5 ms) by 10% to 11% compared to MP3000. Conclusion: The addition of temporal masking to the simultaneous masking model of MP3000 could improve speech intelligibility in noise.


2020 ◽  
Author(s):  
Wiebke Lamping ◽  
Tobias Goehring ◽  
Jeremy Marozeau ◽  
Robert P. Carlyon

Speech recognition in noisy environments remains a challenge for cochlear implant (CI) recipients. Unwanted charge interactions between current pulses in the same and across different electrode channels are likely to impair performance. Here we investigate the effect of reducing the number of current pulses on speech perception. This was achieved by implementing a psychoacoustic temporal-masking model where current pulses in each channel were passed through a temporal integrator to identify and remove pulses that were less likely to be perceived by the recipient. The decision criterion of the temporal integrator was varied to control the percentage of pulses removed in each condition. In experiment 1, speech in quiet was processed with a standard Continuous Interleaved Sampling (CIS) strategy and with 25, 50 and 75% of pulses removed. In experiment 2, performance was measured for speech in noise with the CIS reference and with 50 and 75% of pulses removed. Speech intelligibility in quiet revealed no significant difference between reference and test conditions. For speech in noise, results showed a significant improvement of 2.4 dB when removing 50% of pulses. With 75% of pulses removed, performance did not differ significantly from the reference, either in quiet or in noise. Further, by reducing the overall number of current pulses by 25, 50, and 75% but accounting for the increase in charge necessary to compensate for the decrease in loudness, estimated average power savings of 21.15, 40.95, and 63.45%, respectively, could be possible for this set of listeners. In conclusion, removing temporally masked pulses may improve speech perception in noise and result in substantial power savings.
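The idea of passing the pulses of one channel through a temporal integrator and dropping those unlikely to be perceived can be sketched as follows. This is an illustrative assumption, not the model used in the study: the integrator is a simple leaky accumulator, only delivered pulses contribute to it, and a pulse is removed when its amplitude falls below `criterion` times the current integrator state. The names `remove_masked_pulses`, `decay`, and `criterion` are hypothetical.

```python
def remove_masked_pulses(amplitudes, decay=0.5, criterion=0.5):
    """Zero out pulses that fall below a masking level set by earlier pulses.

    amplitudes: pulse amplitudes of one channel in time order.
    decay: fraction of the integrator state that survives to the next pulse.
    criterion: scales the integrator state into a removal threshold; larger
               values remove more pulses.
    """
    state = 0.0
    output = []
    for amplitude in amplitudes:
        state *= decay                    # leak since the previous pulse slot
        if amplitude < criterion * state:
            output.append(0.0)            # treated as masked: pulse removed
        else:
            output.append(amplitude)      # pulse delivered
            state += amplitude            # delivered pulses add to the masking
    return output


if __name__ == "__main__":
    pulses = [1.0, 0.2, 0.2, 1.0]
    kept = remove_masked_pulses(pulses)
    removed = sum(1 for p, k in zip(pulses, kept) if p != 0.0 and k == 0.0)
    print(kept, f"{100.0 * removed / len(pulses):.0f}% of pulses removed")
```

Raising `criterion` in this sketch increases the share of removed pulses, which is loosely analogous to varying the decision criterion to reach the removal percentages tested in the study.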

