Comparative analysis of high-speed videolaryngoscopy images and sound data simultaneously acquired from rigid and flexible laryngoscope: a pilot study

Abstract: High-Speed Videoendoscopy (HSV) is becoming a robust tool for the assessment of vocal fold vibration in laboratory investigation and clinical practice. We describe the first successful application of flexible High-Speed Videoendoscopy with an innovative laser light source conducted in clinical settings. The acquired image and simultaneously recorded audio data are compared with the results obtained by means of a rigid endoscope. We demonstrate that HSV recordings with a fiber-optic laryngoscope yield consistently bright color images suitable for parametrization of vocal fold oscillation, as with HSV data obtained from a rigid laryngoscope. The comparison of period and amplitude perturbation parameters calculated from image and audio data acquired in flexible and rigid HSV recordings objectively confirms that flexible High-Speed Videoendoscopy is a more suitable method for examination of natural phonation. The HSV-based measures generated from this kymographic analysis are arguably a better representation of vocal fold vibration than acoustic analysis because their quantification is independent of vocal tract influences. This experimental study has several implications for further research on the application of HSV to the clinical assessment of the nature of glottal pathologies and their effect on vocal fold vibration.
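
The abstract does not say which period and amplitude perturbation parameters were computed. As a minimal sketch, assuming the common "local" definition (mean absolute difference between consecutive cycles, expressed as a percentage of the mean), the function name `local_perturbation` and the sample values below are illustrative and not taken from the study:

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Applied to cycle periods this is often called local jitter;
    applied to cycle peak amplitudes, local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical per-cycle measurements (not data from the study).
periods_ms = [5.0, 5.1, 4.9, 5.0, 5.0]
amplitudes = [1.0, 0.9, 1.1, 1.0, 1.0]

jitter_percent = local_perturbation(periods_ms)
shimmer_percent = local_perturbation(amplitudes)
print(f"jitter  = {jitter_percent:.2f} %")
print(f"shimmer = {shimmer_percent:.2f} %")
```

The same function can be applied to cycle sequences derived from either the image data or the audio signal, which is what makes a like-for-like comparison between the two sources possible.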

Empirical Eigenfunctions and Medial Surface Dynamics of a Human Vocal Fold

Methods of Information in Medicine ◽

10.1055/s-0038-1633981 ◽

2005 ◽

Vol 44 (03) ◽

pp. 384-391 ◽

Cited By ~ 30

Author(s):

N. Tayama ◽

D. A. Berry ◽

M. Döllinger

Keyword(s):

Vocal Fold ◽

High Speed ◽

Computational Models ◽

Imaging System ◽

Vocal Folds ◽

Sustained Oscillation ◽

Medial Surface ◽

Physical Mechanisms ◽

Vocal Fold Vibration ◽

Modes Of Vibration

Summary Objectives: The purpose of this investigation was to use an excised human larynx to substantiate physical mechanisms of sustained vocal fold oscillation over a variety of phonatory conditions. During sustained, flow-induced oscillation, dynamical data was collected from the medial surface of the vocal fold. The method of Empirical Eigenfunctions was used to analyze the data and to probe physical mechanisms of sustained oscillation. Methods: Thirty microsutures were mounted on the medial margin of a human vocal fold. Across five distinct phonatory conditions, the vocal fold was set into oscillation and imaged with a high-speed digital imaging system. The position coordinates of the sutures were extracted from the images and converted into physical coordinates. Empirical Eigenfunctions were computed from the time-varying physical coordinates, and mechanisms of sustained oscillation were explored. Results: Using the method of Empirical Eigenfunctions, physical mechanisms of sustained vocal fold oscillation were substantiated. In particular, the essential dynamics of vocal fold vibration were captured by two dominant Empirical Eigenfunctions. The largest Eigenfunction primarily captured the alternating convergent/divergent shape of the medial surface of the vocal fold, while the second largest Eigenfunction primarily captured the lateral vibrations of the vocal fold. Conclusions: The hemi-larynx setup yielded a view of the medial surface of the vocal folds, revealing the tissue vibrations which produced sound. Through the use of Empirical Eigenfunctions, the underlying modes of vibration were computed, disclosing physical mechanisms of sustained vocal fold oscillation. The investigation substantiated previous theoretical analyses and yielded significant data to help evaluate and refine computational models of vocal fold vibration.
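
Empirical eigenfunctions are commonly obtained from a singular value decomposition of the mean-subtracted data matrix. The sketch below assumes that formulation, which the abstract does not spell out, and uses synthetic data in place of the suture coordinates; the function name, the array layout (rows are time samples, columns are coordinates) and all numeric values are illustrative assumptions:

```python
import numpy as np


def empirical_eigenfunctions(X):
    """Return (spatial modes, temporal coefficients, energy fractions).

    X has one row per time sample and one column per tracked coordinate.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                # remove the time-averaged shape
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    energy = s**2 / np.sum(s**2)           # share of variance per mode
    return Vt, U * s, energy


# Synthetic example: 30 points along one axis, 200 time samples, 100 Hz.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 0.05, 200, endpoint=False)
z = np.linspace(0.0, 1.0, 30)
shape_a = np.sin(np.pi * z)                # hypothetical dominant shape
shape_b = np.cos(np.pi * z)                # hypothetical secondary shape
X = (1.0 * np.outer(np.sin(2 * np.pi * 100 * t), shape_a)
     + 0.4 * np.outer(np.cos(2 * np.pi * 100 * t), shape_b)
     + 0.01 * rng.standard_normal((200, 30)))

modes, coefficients, energy = empirical_eigenfunctions(X)
print("energy captured by the two largest modes:", energy[:2].sum())
```

When two modes account for nearly all of the variance, as in this constructed example, the motion can be described compactly by two spatial shapes and their time courses, which is the kind of result the abstract reports.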

Influence of Analyzed Sequence Length on Parameters in Laryngeal High-Speed Videoendoscopy

Applied Sciences ◽

10.3390/app8122666 ◽

2018 ◽

Vol 8 (12) ◽

pp. 2666 ◽

Cited By ~ 5

Author(s):

Patrick Schlegel ◽

Marion Semmler ◽

Melda Kunduk ◽

Michael Döllinger ◽

Christopher Bohr ◽

...

Keyword(s):

Vocal Fold ◽

High Speed ◽

Vocal Folds ◽

Frame Rate ◽

Sequence Length ◽

Influence Parameter ◽

Variability Index ◽

Perturbation Parameters ◽

Almost All ◽

Healthy Females

Laryngeal high-speed videoendoscopy (HSV) allows objective quantification of vocal fold vibratory characteristics. However, it is unknown how the analyzed sequence length affects some of the computed parameters. To examine whether varying sequence lengths influence parameter calculation, 20 HSV recordings of healthy females during sustained phonation were investigated. The clinically prevalent Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512 × 256 pixels was used to collect HSV data. The glottal area waveform (GAW), describing the increase and decrease of the area between the vocal folds during phonation, was extracted. Based on the GAW, 16 perturbation parameters were computed for sequences of 5, 10, 20, 50 and 100 consecutive cycles. Statistical analysis was performed using SPSS Statistics, version 21. Only three parameters (18.8%) were statistically significantly influenced by changing sequence lengths. Of these parameters, one changed until 10 cycles were reached, one until 20 cycles were reached and one, namely Amplitude Variability Index (AVI), changed between almost all groups of different sequence lengths. Moreover, visually observable, but not statistically significant, changes within parameters were observed. These changes were often most prominent between shorter sequence lengths. Hence, we suggest using a sequence length of at least 20 cycles and discarding the parameter AVI.
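
The effect of sequence length can be explored by evaluating one parameter on the first 5, 10, 20, 50 and 100 cycles of the same recording. The sketch below is illustrative only: it uses a plain coefficient of variation as a stand-in parameter (not one of the 16 parameters from the study, whose definitions the abstract does not give) and synthetic per-cycle values; all names are hypothetical:

```python
import math
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    return 100.0 * pstdev(values) / mean(values)


def parameter_by_sequence_length(per_cycle_values,
                                 lengths=(5, 10, 20, 50, 100),
                                 parameter=coefficient_of_variation):
    """Evaluate `parameter` on the first n cycles for each n in `lengths`.

    Lengths longer than the available sequence are skipped.
    """
    results = {}
    for n in lengths:
        if n <= len(per_cycle_values):
            results[n] = parameter(per_cycle_values[:n])
    return results


# Synthetic per-cycle peak glottal areas (arbitrary units), 100 cycles.
peak_areas = [1.0 + 0.05 * math.sin(0.7 * i) for i in range(100)]

by_length = parameter_by_sequence_length(peak_areas)
for n, value in by_length.items():
    print(f"{n:3d} cycles: CV = {value:.2f} %")
```

Comparing the values across lengths shows whether a parameter has stabilized; a parameter that keeps changing as cycles are added behaves like the AVI result described above.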

High-Speed Imaging of Vocal Fold Vibrations and Larynx Movements within Vocalizations of Different Vowels

Annals of Otology Rhinology & Laryngology ◽

10.1177/000348949610501208 ◽

1996 ◽

Vol 105 (12) ◽

pp. 975-981 ◽

Cited By ~ 16

Author(s):

Dieter Maurer ◽

Markus Hess ◽

Manfred Gross

Keyword(s):

Fundamental Frequency ◽

Digital Imaging ◽

Vocal Fold ◽

High Speed ◽

Vocal Tract ◽

Vocal Folds ◽

High Speed Imaging ◽

Filter Model ◽

Acoustic Interaction ◽

Glottal Source

Theoretic investigations of the “source-filter” model have indicated a pronounced acoustic interaction of glottal source and vocal tract. Empirical investigations of formant pattern variations apart from changes in vowel identity have demonstrated a direct relationship between the fundamental frequency and the patterns. As a consequence of both findings, independence of phonation and articulation may be limited in the speech process. Within the present study, possible interdependence of phonation and phoneme was investigated: vocal fold vibrations and larynx position for vocalizations of different vowels in a healthy man and woman were examined by high-speed light-intensified digital imaging. We found 1) different movements of the vocal folds for vocalizations of different vowel identities within one speaker and at similar fundamental frequency, and 2) constant larynx position within vocalization of one vowel identity, but different positions for vocalizations of different vowel identities. A possible relationship between the vocal fold vibrations and the phoneme is discussed.

Measurement of Vocal Fold Collision Forces During Phonation

Journal of Speech Language and Hearing Research ◽

10.1044/1092-4388(2005/039) ◽

2005 ◽

Vol 48 (3) ◽

pp. 567-576 ◽

Cited By ~ 20

Author(s):

Heather E. Gunter ◽

Robert D. Howe ◽

Steven M. Zeitels ◽

James B. Kobler ◽

Robert E. Hillman

Keyword(s):

Vocal Fold ◽

Vocal Tract ◽

Tissue Injury ◽

Vocal Folds ◽

Force Sensor ◽

Vocal Fold Vibration ◽

Low Profile ◽

Order Of Magnitude ◽

Collision Force

Forces applied to vocal fold tissue as the vocal folds collide may cause tissue injury that manifests as benign organic lesions. A novel method for measuring this quantity in humans in vivo uses a low-profile force sensor that extends along the length and depth of the glottis. Sensor design facilitates its placement and stabilization so that phonation can be initiated and maintained while it is in place, with minimal interference in vocal fold vibration. In 2 individuals with 1 vibrating vocal fold and 1 nonvibrating vocal fold, peak collision force correlates more strongly with voice intensity than pitch. Vocal fold collision forces in 1 individual with 2 vibrating vocal folds are of the same order of magnitude as in previous studies. Correlations among peak collision force, voice intensity, and pitch were indeterminate in this participant because of the small number of data points. Sensor modifications are proposed so that it can be used to reliably estimate collision force in individuals with 2 vibrating vocal folds and with changing vocal tract conformations.

High-Speed Imaging to Study an Auto-Oscillating Vocal Fold Replica for Different Initial Conditions

International Journal of Applied Mechanics ◽

10.1142/s1758825117500648 ◽

2017 ◽

Vol 09 (05) ◽

pp. 1750064 ◽

Cited By ~ 2

Author(s):

A. Van Hirtum ◽

X. Pelorson

Keyword(s):

Vocal Fold ◽

High Speed ◽

Initial Conditions ◽

Vocal Folds ◽

High Speed Imaging ◽

Human Voice ◽

Manual Intervention ◽

Geometrical Features ◽

Upstream Pressure

Experiments on deformable mechanical vocal fold replicas are important in physical studies of human voice production to understand the underlying fluid–structure interaction. To date, most experiments are performed for constant initial conditions with respect to structural as well as geometrical features. Varying those conditions requires manual intervention, which might affect reproducibility and hence the quality of experimental results. In this work, a setup is described which allows setting elastic and geometrical initial conditions in an automated way for a deformable vocal fold replica. High-speed imaging is integrated in the setup in order to decorrelate elastic and geometrical features. This way, reproducible, accurate and systematic measurements can be performed for prescribed initial conditions of glottal area, mean upstream pressure and vocal fold elasticity. Moreover, quantification of geometrical features during auto-oscillation is shown to contribute to the experimental characterization and understanding.

Reinke's Edema: Phonatory Mechanisms and Management Strategies

Annals of Otology Rhinology & Laryngology ◽

10.1177/000348949710600701 ◽

1997 ◽

Vol 106 (7) ◽

pp. 533-543 ◽

Cited By ~ 68

Author(s):

Steven M. Zeitels ◽

Glenn W. Bunting ◽

Robert E. Hillman ◽

Traci Vaughn

Keyword(s):

Lamina Propria ◽

Fundamental Frequency ◽

Vocal Fold ◽

Management Strategies ◽

Vocal Folds ◽

Vocal Fold Vibration ◽

Reinke's Edema ◽

Subglottal Pressure ◽

Almost All ◽

Superficial Lamina

Reinke's edema (RE) has been associated typically with smoking and sometimes with vocal abuse, but aspects of the pathophysiology of RE remain unclear. To gain new insights into phonatory mechanisms associated with RE pathophysiology, we used an integrated battery of objective vocal function tests to analyze 20 patients (19 women) who underwent phonomicrosurgical resection. Preoperative stroboscopic examinations demonstrated that the superficial lamina propria is distended primarily on the superior vocal fold surface. Acoustically, these individuals have an abnormally low average speaking fundamental frequency (123 Hz), and they generate abnormally high average subglottal pressures (9.7 cm H2O). The presence of elevated aerodynamic driving pressures reflects difficulties in producing vocal fold vibration that are most likely the result of mass loading associated with RE, and possibly vocal hyperfunction. Furthermore, it is hypothesized that in the environment of chronic glottal mucositis secondary to smoking and reflux, the cephalad force on the vocal folds by the subglottal driving pressure contributes to the superior distention of the superficial lamina propria. Surgical reduction of the volume of the superficial lamina propria resulted in a significant elevation in fundamental frequency (154 Hz) and improvement in perturbation measures. In almost all instances, both the clinician and the patient perceived the voice as improved. However, these patients continued to generate elevated subglottal pressure (probably a sign of persistent hyperfunction) that was accompanied by visually observed supraglottal strain despite the normal-sized vocal folds. This finding suggests that persistent hyperfunctional vocal behaviors may contribute to postsurgical RE recurrence if therapeutic strategies are not instituted to modify such behavior.

Effects of Voice Therapy as Objectively Evaluated by Digitized Laryngeal Stroboscopic Imaging

Annals of Otology Rhinology & Laryngology ◽

10.1177/000348940211101007 ◽

2002 ◽

Vol 111 (10) ◽

pp. 902-908 ◽

Cited By ~ 22

Author(s):

Renée Speyer ◽

Pieter A. Kempen ◽

George Wieneke ◽

Willem Kersing ◽

Elham Ghazi Hosseini ◽

...

Keyword(s):

Vocal Fold ◽

Vocal Folds ◽

Lesion Size ◽

Voice Therapy ◽

Benign Lesions ◽

Objective Measurements ◽

Significant Parameter ◽

Surface Areas ◽

Vocal Fold Vibration ◽

Maximal Opening

Objective measurements derived from digitized laryngeal stroboscopic images were used to demonstrate changes in vocal fold vibration and in the size of benign lesions after 3 months of voice therapy. Forty chronically dysphonic patients were studied. By means of a rigid stroboscope, pretreatment and posttreatment recordings were made of the vocal folds at rest and under stroboscopic light during phonation. From each recording, images of the positions at rest and during vibration at maximal opening and at maximal closure were digitized. The surface areas of any lesions and of the glottal gap were independently measured in the digitized images by 2 experienced laryngologists. Referential distances were determined in order to compensate for discrepancies in magnification in the various recordings. After 3 months of voice therapy, significant improvement in lesion size and degree of maximal closure during vibration could be demonstrated in about 50% of the patients. The degree of maximal opening did not prove to be a significant parameter.
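
The abstract does not give the formula used to compensate for magnification. One plausible approach, shown here purely as an assumption, divides each pixel area by the square of a referential distance measured in the same image, since area scales with the square of linear magnification. The function name and all numbers are hypothetical:

```python
def normalized_area(area_px2, reference_px):
    """Express an area in units of the squared referential distance.

    Dividing by reference_px ** 2 cancels the linear magnification,
    so values from differently magnified recordings become comparable.
    """
    if reference_px <= 0:
        raise ValueError("referential distance must be positive")
    return area_px2 / reference_px ** 2


# Hypothetical measurements: the posttreatment image is magnified 1.5x.
pre = normalized_area(area_px2=400.0, reference_px=100.0)
post = normalized_area(area_px2=450.0, reference_px=150.0)
relative_change = (post - pre) / pre

print(f"pre  = {pre:.3f}")
print(f"post = {post:.3f}")
print(f"relative change = {relative_change:+.0%}")
```

In this constructed case the raw pixel area grows from 400 to 450, yet the normalized lesion area halves, which illustrates why uncorrected pixel counts from different recordings cannot be compared directly.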

Videostrobolaryngoscopy of Mucus Layer during Vocal Fold Vibration in Patients with Laryngeal Tension-Fatigue Syndrome

Annals of Otology Rhinology & Laryngology ◽

10.1177/000348940211100610 ◽

2002 ◽

Vol 111 (6) ◽

pp. 537-541 ◽

Cited By ~ 12

Author(s):

Tzu-Yu Hsiao ◽

Chia-Ming Liu ◽

Kai-Nan Lin

Keyword(s):

Mechanical Properties ◽

Vocal Fold ◽

Rough Surfaces ◽

Voice Disorders ◽

Vocal Folds ◽

Voice Disorder ◽

Mucus Layer ◽

Vocal Fold Vibration ◽

Fatigue Syndrome ◽

Functional Dysphonia

The mucus layer on the vocal folds was examined by videostrobolaryngoscopy in patients with laryngeal tension-fatigue syndrome, a chronic functional dysphonia due to vocal abuse and misuse. Besides the findings in previous reports (such as abnormal glottal closure, phase or amplitude asymmetry, and the irregular mucosal wave), the vocal folds during vibration had an uneven mucus surface. The occurrence of an uneven mucus layer on vocal folds was significantly greater in subjects with this voice disorder (83% or 250 of 301 patients in this series) than in those without voice disorders (18.5% or 5 of 27). The increase of mucus viscosity, mucus aggregation, and the formation of rough surfaces on the vocal folds alter the mechanical properties that contribute to vibration of the cover of the vocal folds, and thereby worsen the symptoms of dysphonia in patients with laryngeal tension-fatigue syndrome.

A Deep Learning Enhanced Novel Software Tool for Laryngeal Dynamics Analysis

Journal of Speech Language and Hearing Research ◽

10.1044/2021_jslhr-20-00498 ◽

2021 ◽

pp. 1-15

Author(s):

Andreas M. Kist ◽

Pablo Gómez ◽

Denis Dubrovskiy ◽

Patrick Schlegel ◽

Melda Kunduk ◽

...

Keyword(s):

Neural Networks ◽

Quantitative Analysis ◽

High Speed ◽

Voice Disorders ◽

Vocal Folds ◽

Video Data ◽

Audio Data ◽

Fully Automatic ◽

Video And Audio

Purpose: High-speed videoendoscopy (HSV) is an emerging endoscopy technique for assessing and diagnosing voice disorders, but it is barely used in the clinic because of the lack of dedicated software to analyze the data. HSV allows quantification of vocal fold oscillations by segmenting the glottal area. This challenging task has been tackled by various studies; however, the proposed approaches are mostly limited and not suitable for daily clinical routine. Method: We developed user-friendly software in C# that allows the editing, motion correction, segmentation, and quantitative analysis of HSV data. We further provide pretrained deep neural networks for fully automatic glottis segmentation. Results: We freely provide our software Glottis Analysis Tools (GAT). Using GAT, we provide a general threshold-based region growing platform that enables the user to analyze data from various sources, such as in vivo recordings, ex vivo recordings, and high-speed footage of artificial vocal folds. Additionally, especially for in vivo recordings, we provide three robust neural networks at various speed and quality settings to allow the fully automatic glottis segmentation needed for application by untrained personnel. GAT further evaluates video and audio data in parallel and is able to extract various features from the video data, among others the glottal area waveform, that is, the changing glottal area over time. In total, GAT provides 79 unique quantitative analysis parameters for video- and audio-based signals. Many of these parameters have already been shown to reflect voice disorders, highlighting the clinical importance and usefulness of the GAT software. Conclusion: GAT is a unique tool to process HSV and audio data to determine quantitative, clinically relevant parameters for research, diagnosis, and treatment of laryngeal disorders. Supplemental Material: https://doi.org/10.23641/asha.14575533
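
Threshold-based region growing and the glottal area waveform can be illustrated with a generic sketch. This is not GAT's implementation (GAT is written in C# and its internals are not described in the abstract); the function names, the 4-connectivity, the fixed seed and the tiny frames are assumptions made for illustration:

```python
from collections import deque


def region_grow(frame, seed, threshold):
    """Return the set of pixels connected to `seed` with intensity <= threshold.

    `frame` is a list of rows of grayscale values; the glottis is assumed
    to be darker than the surrounding tissue. Uses 4-connectivity.
    """
    rows, cols = len(frame), len(frame[0])
    r0, c0 = seed
    if frame[r0][c0] > threshold:
        return set()                       # seed is not inside a dark region
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and frame[nr][nc] <= threshold):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def glottal_area_waveform(frames, seed, threshold):
    """Segmented area in pixels for each frame, in frame order."""
    return [len(region_grow(frame, seed, threshold)) for frame in frames]


# Two tiny hypothetical frames: an open glottis, then a closed one.
open_frame = [
    [200, 200, 200, 200, 200],
    [200,  30,  20, 200,  10],   # the dark pixel at column 4 is not connected
    [200,  25,  15, 200, 200],
    [200, 200, 200, 200, 200],
]
closed_frame = [[200] * 5 for _ in range(4)]

gaw = glottal_area_waveform([open_frame, closed_frame], seed=(1, 2), threshold=50)
print(gaw)
```

Growing from a seed rather than thresholding the whole frame keeps unrelated dark pixels, such as shadows elsewhere in the image, out of the measured area.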
