formant frequencies
Recently Published Documents


TOTAL DOCUMENTS: 319 (five years: 54)
H-INDEX: 30 (five years: 2)

2021 ◽  
Author(s):  
Hyojin Kim ◽  
Viktorija Ratkute ◽  
Bastian Epp

Comodulated masking noise and binaural cues can facilitate the detection of a target sound in noise. These cues induce a decrease in detection thresholds, quantified as comodulation masking release (CMR) and binaural masking level difference (BMLD), respectively. However, their relevance to speech perception is unclear, as most studies have used artificial stimuli unlike speech. Here, we investigated their ecological validity using sounds with speech-like spectro-temporal dynamics. We evaluated the ecological validity of this grouping effect with stimuli reflecting formant changes in speech, setting three masker bands at the formant frequencies F1, F2, and F3 based on the CV combinations /gu/, /fu/, and /pu/. We found that CMR was small (< 3 dB), while BMLD was comparable to previous findings (~9 dB). In conclusion, we suggest that other features, such as spectral proximity and the number of masker bands, may play a role in facilitating frequency grouping by comodulation.
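Both masking-release measures above are defined as decreases in detection threshold relative to a reference condition. A minimal sketch, using purely hypothetical threshold values (the abstract reports only the release magnitudes, not the underlying thresholds):

```python
def masking_release(reference_threshold_db, cue_threshold_db):
    """Masking release in dB: positive when the cue lowers the detection threshold."""
    return reference_threshold_db - cue_threshold_db

# Hypothetical thresholds (dB SPL), for illustration only
cmr = masking_release(55.0, 52.5)   # comodulated vs. uncorrelated masker bands
bmld = masking_release(55.0, 46.0)  # dichotic vs. diotic presentation
print(cmr, bmld)  # 2.5 9.0
```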


2021 ◽  
Vol 15 ◽  
Author(s):  
Jonathan Melchor ◽  
José Vergara ◽  
Tonatiuh Figueroa ◽  
Isaac Morán ◽  
Luis Lemus

In social animals, identifying sounds is critical for communication. In humans, the acoustic parameters involved in speech recognition, such as the formant frequencies derived from the resonance of the supralaryngeal vocal tract, have been well documented. However, how formants contribute to recognizing learned sounds in non-human primates remains unclear. To determine this, we trained two rhesus monkeys to discriminate target and non-target sounds presented in sequences of 1–3 sounds. After training, we performed three experiments: (1) we tested the monkeys' accuracy and reaction times during the discrimination of various acoustic categories; (2) their ability to discriminate morphed sounds; and (3) their ability to identify sounds filtered to contain formant 1 (F1), formant 2 (F2), or both (F1F2). Our results indicate that macaques can learn diverse sounds and discriminate morphs and the formants F1 and F2, suggesting that information from a few acoustic parameters suffices for recognizing complex sounds. We anticipate that future neurophysiological experiments in this paradigm may help elucidate how formants contribute to the recognition of sounds.
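The formant pass-filter stimuli of experiment (3) amount to band-pass filtering around formant regions. A sketch with scipy; the band edges and toy signal here are illustrative stand-ins, not the stimuli used in the study:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def formant_band(x, fs, lo, hi, order=4):
    """Band-pass filter isolating one formant region (frequencies in Hz)."""
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfilt(sos, x)

fs = 16000
t = np.arange(fs) / fs
# Toy signal: two components standing in for F1 and F2 energy
x = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 1500 * t)
f1_only = formant_band(x, fs, 300, 800)                               # "F1 filter"
f1f2 = formant_band(x, fs, 300, 800) + formant_band(x, fs, 1200, 1800)  # "F1F2 filter"
```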


Author(s):  
Yeptain Leung ◽  
Jennifer Oates ◽  
Siew-Pang Chan ◽  
Viktória Papp

Purpose: The aim of the study was to examine associations between speaking fundamental frequency (fo), vowel formant frequencies (F), listener perceptions of speaker gender, and vocal femininity–masculinity. Method: An exploratory study examined associations between fo, F1–F3, listener perceptions of speaker gender (nominal scale), and vocal femininity–masculinity (visual analog scale). For 379 speakers of Australian English aged 18–60 years, the fo mode and F1–F3 (12 monophthongs; 36 Fs in total) were analyzed from a standard reading passage. Seventeen listeners rated speaker gender and vocal femininity–masculinity from randomized audio recordings of these speakers. Results: Model building using principal component analysis suggested that the 36 Fs could be succinctly reduced to seven principal components (PCs). Generalized structural equation modeling (with the seven PCs of F and fo as predictors) suggested that only F2 and fo predicted listener perceptions of speaker gender (male, female, unable to decide). However, listener perceptions of vocal femininity–masculinity behaved differently and were predicted by F1, F3, and the contrast between monophthongs at the extremities of the F1 acoustic vowel space, in addition to F2 and fo. Furthermore, listeners' perceptions of speaker gender also substantially influenced ratings of vocal femininity–masculinity. Conclusion: Adjusted odds ratios highlighted a substantially larger contribution of F, relative to fo, to listener perceptions of speaker gender and vocal femininity–masculinity than has previously been reported.
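The dimensionality reduction described (36 formant measures per speaker reduced to seven principal components) can be sketched with an SVD-based PCA. The data below are simulated stand-ins, not the study's measurements:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)               # center each formant variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T     # PC scores, one row per speaker
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, explained

rng = np.random.default_rng(0)
X = rng.normal(size=(379, 36))            # 379 speakers x 36 formant measures (simulated)
scores, explained = pca_reduce(X, 7)
print(scores.shape)  # (379, 7)
```

In the study the seven PC scores then serve as predictors in the structural equation model.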


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
S.V Narasimhan ◽  
W.G.S.S Karunarathne

This study documented the effects of age, gender, and vowel type on vowel space area in the Sinhala language. Three groups of participants were recruited: Group 1 comprised 20 children, Group 2 comprised 20 adults, and Group 3 comprised 20 elderly subjects. All subjects spoke the dialect of the Central Province of Sri Lanka. Words containing the three Sinhala short vowels /a/, /i/, and /u/ were recorded. Formant frequencies of the vowels were extracted and the vowel space area was constructed. The results showed that formant frequencies were significantly higher for children than for adults. Female subjects had significantly higher formant frequency values than male subjects. The effect of vowel type on the formant frequencies and vowel space area was also significant. These findings indicate that Sinhala follows the universal relationship between resonance characteristics and vocal tract constriction. Keywords: vowel space area, formant frequencies, Sinhala, vowel articulation
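Vowel space area is conventionally the area of the polygon spanned by the corner vowels in (F1, F2) space, computed with the shoelace formula. A sketch with illustrative (not measured) values for /a/, /i/, and /u/:

```python
def vowel_space_area(formants):
    """Area of the polygon spanned by (F1, F2) points, via the shoelace formula."""
    pts = list(formants)
    area = 0.0
    for (f1a, f2a), (f1b, f2b) in zip(pts, pts[1:] + pts[:1]):
        area += f1a * f2b - f1b * f2a
    return abs(area) / 2.0

# Illustrative (F1, F2) values in Hz for /a/, /i/, /u/ -- not the study's data
corners = [(750, 1300), (300, 2300), (350, 800)]
print(vowel_space_area(corners))  # 312500.0 (Hz^2)
```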


MIND Journal ◽  
2021 ◽  
Vol 5 (1) ◽  
pp. 39-53
Author(s):  
IRMA AMELIA DEWI ◽  
MUHAMMAD ICHWAN ◽  
SALMA SILFIANA

Abstract: The human voice is a natural-language modality for interacting with computers. Human voices vary from person to person, as seen in their formant, pitch, and volume characteristics. Providing good voice commands to a computer requires assessing voice quality based on formant frequencies. In this study, processing begins with pre-processing (pre-emphasis, frame blocking, and windowing), followed by formant estimation using Linear Prediction. The obtained formant values are matched against the formant values of training data stored in a database. There were 2700 test voice samples, each recorded for 1 second. The test results yielded formant values in the range 0–423 Hz for F0, 572–1678 Hz for F1, 1536–2583 Hz for F2, 2676–3384 Hz for F3, and 3519–4947 Hz for F4. Keywords: Formant, Frequency, Pitch, Linear Prediction
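The pipeline described (pre-emphasis, windowing, Linear Prediction, pole frequencies as formants) can be sketched as a generic autocorrelation-method LPC implementation; this is not the authors' code, and the synthetic test signal below is illustrative:

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def lpc_formants(x, fs, order=10):
    """Estimate formant frequencies via Linear Prediction (autocorrelation method)."""
    x = np.asarray(x, dtype=float)
    x = np.append(x[0], x[1:] - 0.97 * x[:-1])        # pre-emphasis
    x = x * np.hamming(len(x))                        # windowing
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    a = solve_toeplitz(r[:-1], r[1:])                 # LPC coefficients (normal equations)
    roots = np.roots(np.concatenate(([1.0], -a)))     # poles of the all-pole model
    roots = roots[np.imag(roots) > 0]                 # one root per conjugate pair
    freqs = np.sort(np.angle(roots) * fs / (2 * np.pi))
    return freqs[freqs > 90.0]                        # drop near-DC estimates

# Synthetic "vowel": 100 Hz pulse train through resonators at 700 Hz and 1800 Hz
fs = 8000
poles = [0.95 * np.exp(2j * np.pi * f / fs) for f in (700.0, 1800.0)]
a_true = np.real(np.poly(poles + [p.conjugate() for p in poles]))
excitation = np.zeros(2048)
excitation[::80] = 1.0                                # 100 Hz "glottal" pulses
signal = lfilter([1.0], a_true, excitation)
print(np.round(lpc_formants(signal, fs)))
```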


Author(s):  
Christopher Dromey ◽  
Michelle Richins ◽  
Tanner Low

Purpose: We examined the effect of bite block insertion (BBI) on lingual movements and formant frequencies in corner vowel and diphthong production in a sentence context. Method: Twenty young adults produced the corner vowels (/u/, /ɑ/, /æ/, /i/) and the diphthong /ɑɪ/ in sentence contexts before and after BBI. An electromagnetic articulograph measured the movements of the tongue back, middle, and front. Results: There were significant decreases in the acoustic vowel articulation index and vowel space area following BBI. The kinematic vowel articulation index decreased significantly for the back and middle of the tongue but not for the front. There were no significant acoustic changes post-BBI for the diphthong, other than a longer transition duration. Diphthong kinematic changes after BBI included smaller movements for the back and middle of the tongue, but not the front. Conclusions: BBI led to a smaller acoustic working space for the corner vowels. The adjustments made by the front of the tongue were sufficient to compensate for the BBI perturbation in the diphthong, resulting in unchanged formant trajectories. The back and middle of the tongue were likely biomechanically restricted in their displacement by the fixation of the jaw, whereas the tongue front showed greater movement flexibility.
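The acoustic vowel articulation index mentioned above is computed from corner-vowel formants; assuming the widely used formulation VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/) (an assumption here, since the abstract does not give the formula), a sketch with illustrative values:

```python
def vowel_articulation_index(f1, f2):
    """VAI from corner-vowel formants (one common formulation, assumed here).
    f1 and f2 are dicts of formant values in Hz, keyed by vowel symbol."""
    return (f2["i"] + f1["a"]) / (f1["i"] + f1["u"] + f2["u"] + f2["a"])

# Illustrative values only, not the study's measurements
f1 = {"i": 300, "a": 750, "u": 350}
f2 = {"i": 2300, "a": 1300, "u": 800}
print(round(vowel_articulation_index(f1, f2), 3))  # 1.109
```

Lower values indicate formant centralization, i.e. a smaller acoustic working space, which is the direction of change the study reports after BBI.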


2021 ◽  
pp. bmjstel-2020-000727
Author(s):  
Andrew Hall ◽  
Kosuke Kawai ◽  
Kelsey Graber ◽  
Grant Spencer ◽  
Christopher Roussin ◽  
...  

Introduction: Stress may serve as an adjunct (challenge) or hindrance (threat) to the learning process. Determining the effect of an individual's response to situational demands in either a real or simulated situation may enable optimisation of the learning environment. Studies of acoustic analysis suggest that the mean fundamental frequency and formant frequencies of the voice vary with an individual's response during stressful events. This hypothesis is examined within the otolaryngology (ORL) simulation environment to assess whether acoustic analysis could be used as a tool to determine participants' stress response and cognitive load in medical simulation. Such an assessment could lead to optimisation of the learning environment. Methodology: ORL simulation scenarios were performed to teach the participants teamwork and refine clinical skills. Each was performed in an actual operating room (OR) environment (in situ) with a multidisciplinary team consisting of ORL surgeons, OR nurses and anaesthesiologists. Ten of the scenarios were led by an ORL attending and ten were led by an ORL fellow. The vocal communication of each of the 20 individual leaders was analysed using long-term pitch analysis in the PRAAT software (autocorrelation method) to obtain the mean fundamental frequency (F0) and first four formant frequencies (F1, F2, F3 and F4). For each scenario, the leader's voice was analysed during a non-stressful portion (the WHO sign-out procedure) and compared with their voice during a stressful portion of the scenario (responding to deteriorating oxygen saturations in the manikin). Results: The mean unstressed F0 was 161.4 Hz for the male voice and 217.9 Hz for the female voice. The mean fundamental frequency of speech in the ORL fellow (lead surgeon) group increased by 34.5 Hz between the scenario's baseline and stressful portions. This was significantly different from the mean change of −0.5 Hz noted in the attending group (p=0.01). No changes were seen in F1, F2, F3 or F4. Conclusions: This study demonstrates a method of acoustic analysis of the voices of participants taking part in medical simulations. It suggests that acoustic analysis may offer a simple, non-invasive, non-intrusive adjunct for evaluating and titrating the stress response during simulation.
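The long-term pitch analysis used here relies on PRAAT's autocorrelation method. A simplified stand-in from the same family of estimators (this is not PRAAT's algorithm, which adds windowing, interpolation, and voicing decisions):

```python
import numpy as np

def estimate_f0(x, fs, fmin=75.0, fmax=500.0):
    """Crude autocorrelation F0 estimator over a plausible voice range."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    r = np.correlate(x, x, mode="full")[len(x) - 1:]   # autocorrelation, lags >= 0
    lo, hi = int(fs / fmax), int(fs / fmin)            # lag range for fmin..fmax
    lag = lo + np.argmax(r[lo:hi])                     # strongest periodicity
    return fs / lag

fs = 16000
t = np.arange(int(0.1 * fs)) / fs
x = np.sign(np.sin(2 * np.pi * 200 * t))               # 200 Hz square wave stand-in
print(round(estimate_f0(x, fs)))  # 200
```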


Author(s):  
Sharada C. Sajjan ◽  
Vijaya C

This paper presents the phonetics of the Kannada language and their classification based on time-frequency analysis. Each distinct speech sound, called a phoneme, is produced by changing the shape of the vocal tract tube. The resonances of the vocal tract tube, called formant frequencies, are responsible for producing different phonemes. It is observed that vowels (Swaragalu in Kannada) have a clear formant structure, with about 3 to 5 significant formant frequencies below 5000 Hz. They are characterized by high energy, maximum airflow, and periodicity, and are classified based on the location of their formant frequencies. Consonants (Vyanjanagalu in Kannada) are classified based on voicing, place of articulation, and manner of articulation. Time-frequency analysis reveals a total of 37 distinct phonemes in the Kannada language.
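Classifying vowels by the location of their formant frequencies can be sketched as nearest-prototype matching in (F1, F2) space. The prototype values below are generic illustrations, not Kannada measurements:

```python
import math

# Illustrative (F1, F2) prototypes in Hz -- placeholders, not measured values
PROTOTYPES = {"a": (750, 1300), "i": (300, 2300), "u": (350, 800)}

def classify_vowel(f1, f2):
    """Assign a vowel label by the nearest prototype in (F1, F2) space."""
    return min(PROTOTYPES, key=lambda v: math.dist((f1, f2), PROTOTYPES[v]))

print(classify_vowel(320, 2200))  # i
```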


2021 ◽  
Vol 4 (1) ◽  
pp. 1-12
Author(s):  
Lucas Tessaro ◽  
Cynthia Whissell

Literature from across academic disciplines has demonstrated significant links between emotional valence and language. For example, Whissell's Dictionary of Affect in Language defines three dimensions along which the emotionality of words can be described, and Ekman's theories of emotion include the perception and internalization of facial expressions. The present study seeks to expand upon these works by exploring whether holding facial expressions alters the fundamental speech properties of spoken language. Nineteen (19) participants were seated in a soundproof chamber and asked to speak a series of pseudowords containing target phonemes. The participants spoke the pseudowords while holding no facial expression, smiling, or frowning; the utterances were recorded using a high-definition microphone and analysed using the PRAAT analysis software. Analyses revealed pervasive gender differences in the frequency variables: males showed lower fundamental but higher formant frequencies than females. Significant main effects were found for the fundamental and formant frequencies, but no effects were discerned for the intensity variable. Though intricate, these results indicate an interaction between the activity of facial musculature reflecting emotional valence and the acoustic properties of speech uttered simultaneously.

