Voice Pathology Identification using Deep Neural Networks

2019 ◽  
Vol 8 (4) ◽  
pp. 7447-7450

The human voice apparatus is a complex biological mechanism capable of changing pitch and volume. Internal or external factors frequently damage the vocal cords and change the quality of the voice or alter its modulation. The effects are reflected in the expression of speech and in how well the speaker's message is understood. It is therefore important to examine the problem at the early stages of voice change and to overcome it. Machine learning (ML) plays a major role in identifying whether a voice is pathological or normal. Voice features are extracted using the Mel-frequency Cepstral Coefficients (MFCC) method and passed to a Convolutional Neural Network (CNN) that identifies the category of the voice.
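To make the feature-extraction step concrete, here is a minimal sketch of how MFCCs can be computed for a single audio frame using only the Python standard library. The abstract does not give the authors' settings, so the frame length (256 samples), sample rate (8000 Hz), number of mel filters (20), number of coefficients (13), the Hamming window, and all function names below are illustrative assumptions. The CNN classifier is not shown.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    # Naive DFT power spectrum (bins 0..N/2) of a Hamming-windowed frame.
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mel_max = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    # Power spectrum -> mel filterbank energies -> log -> DCT-II.
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_energies = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energies))
            for c in range(n_coeffs)]

# Usage: one synthetic 200 Hz frame (a stand-in for real voice data).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
```

In a pipeline of the kind the abstract describes, the coefficient vectors of successive frames would typically be stacked into a two-dimensional array and fed to the CNN. In practice, an audio library would replace the naive DFT used here.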

1899 ◽  
Vol 22 ◽  
pp. 71-87
Author(s):  
John G. M'Kendrick

The quality of the human voice depends on the same laws as those determining the quality, klang-tint, or timbre of the tones produced by any musical instrument. Tones of a mixed character, that is to say, composed of a fundamental and partials, are produced by the vibrations of the true vocal cords, and certain of those partials are strengthened by the resonance of the air in the air-passages, and in the pharyngeal and oral cavities. So strongly may certain of these partials be reinforced, as to obscure or hide the fundamental tone, and give a peculiar character to the sound. These, however, are only general statements, and there are still many difficulties in the way of a true interpretation of voice-tones. In the first place, we observe that we may sing a scale, using one sound for each note, such as la, la, la, etc. Or, by putting the mouth in a certain position, we can pronounce the so-called vowels, a, e, i, o, u (ou as the u in prune), uttering the sounds ah, ā, ē, o, ou. As we do so, we notice that each sound appears to the ear to have a pitch of its own, different from that of the others.


2005 ◽  
Vol 13 (2) ◽  
pp. 76-78 ◽  
Author(s):  
Slobodan Mitrovic ◽  
Ljiljana Jovancevic

The voice of patients indicated for surgical treatment of dysphonia is already damaged before the operation. Patients usually try to solve the problem, which exists at the level of the glottis, by compensatory mechanisms. The quality of voice after interventions on the larynx depends on the type and extent of resection, the disturbance of physiological phonation mechanisms, and the ability to establish optimal phonation automatism. Damage to the laryngeal structure, especially its glottic part with the vocal cords at its center, whether the cords are merely fibrous or partially or totally absent, leads to the development of substitutive phonation mechanisms. The most frequent substitutive mechanisms are vestibular, ventricular, and chordoventricular phonation. There are some variations of these phonation mechanisms, which are conditioned not only by the applied surgical technique; since they are also individual characteristics, they can be the consequence of the applied rehabilitation methods. The diagnosis of voice condition before and after the oncosurgical procedure is made by laryngostroboscopy, subjective acoustic analysis of voice, and objective acoustic analysis of voice (sonography or computer analysis of the acoustic signal). Most laryngeal carcinomas appear in the glottic region, so the function of phonation imposes itself as the objective parameter for measuring quality of life after oncosurgery of the larynx. That is why, in order of priority, it comes just behind the principle of "oncologic radicalism". Phonation, as the most complex laryngeal function, seems to have secondary importance. All known operative techniques, especially partial resections, have the preservation of phonation as their goal.


Author(s):  
Shashidhar S. Suligavi ◽  
Shoeb Alam

Background: Disorders of the voice commonly affect a person's quality of life. The objective of the study was to determine the incidence and features of vocal cord disorders in patients presenting to the OPD with hoarseness of voice. Methods: A study comprising 45 cases of hoarseness of voice was carried out in the department of otorhinolaryngology at SNMC Bagalkot between January 2018 and June 2019 to evaluate disorders causing a change in voice. A total of 45 patients presented to the OPD; indirect laryngoscopy was performed on each patient and the findings were confirmed by flexible fibreoptic examination. Results: The age of the patients ranged from 8 to 75 years. There was a slight male predominance in the study. Housewives (29%) constituted the single largest group, followed by farmers (22%), teachers and labourers. Duration of symptoms ranged from 6 days to 15 years, with 64% of patients presenting after more than 3 months. Voice abuse was the single largest precipitating factor, followed by tobacco and smoking along with gastrolaryngeal reflux. 78% had a single habit and 22% had multiple habits. Conclusions: The largest number of patients were in the infectious group, followed by benign lesions and laryngeal palsy.


2019 ◽  
Vol 128 (12) ◽  
pp. 1104-1110 ◽  
Author(s):  
Rudolf Reiter ◽  
Adrienne Heyduck ◽  
Thomas Karl Hoffmann ◽  
Sibylle Brosch ◽  
Maria Anna Buchberger ◽  
...  

Objectives: This study is set to analyze clinicopathological factors predicting the recovery of unilateral vocal fold paralysis (UVP) in patients after thyroid gland surgery. The quality of voice was additionally assessed in these patients. Methods: The charts and videolaryngostroboscopy (VLS) examinations of 84 consecutive patients with a complete UVP after surgery of the thyroid gland were retrospectively reviewed. Patients were divided into 2 groups: patients who fully recovered from vocal fold paralysis and those who failed to recover after a follow-up of 12 months. The quality of voice was analyzed among other things by determining the Voice Handicap Index (VHI). Results: The UVP fully recovered in 52 of 84 (61.9%) patients. Positive mucosal waves (pMWs) on the paralyzed side, a minimal glottic gap <3 mm seen at the first postoperative VLS, age ≤50 years, and surgery duration ≤120 minutes were associated factors for a complete recovery of nerve function. The voice parameters improved independently from recovery of the paralysis in 90% of the patients. Conclusions: For patients with a poor prognosis of a UVP, early intervention may be beneficial. Thus, predicting factors for a full recovery of vocal fold motion would be a valuable tool. In our cohort, about 60% of recoveries could have been predicted using the above-mentioned parameters. Good quality of voice was independently reached in 90% of the cases.


2003 ◽  
Vol 131 (1-2) ◽  
pp. 40-42 ◽  
Author(s):  
Slobodan Mitrovic

The goal of psychoacoustic, or subjective, voice analysis in a phoniatrician's everyday work is to describe a subjective experience based on the physical parameters created in the process of phonation. The work was a clinical prospective study, and the sample consisted of 80 people of both sexes: 40 with benign tumors and pseudotumors of the vocal cords and 40 with malignant tumors of the vocal cords. All the patients underwent otorhinolaryngological and phoniatric examination. The subjective acoustic analysis was done with the patients pronouncing the numbers from 1 to 10 in the comfortable zone. Afterwards, the quality of the voices was rated on the RBH scale. The subjective acoustic analysis found roughness in the voices of 87.50% of patients in the first group, and the most frequent value was Mod=3 (intense roughness), in 62.50% of patients. Hoarseness was present in 90.00% of cases, with the most frequent value Mod=2 (moderate hoarseness), in 55.00% of patients. In the second group, roughness was present in the voices of 70.00% of patients, most often intense (Mod=3), in 30.00% of patients. Hoarseness was present in 95.00% of cases, moderate (Mod=2) in 45.00% and intense in 35.00%. The t-test showed a statistically significant difference between the first and second groups in the strength of roughness determined by the subjective acoustic analysis (p<0.01). The difference in the strength of hoarseness between the first and second groups was also statistically significant (p<0.01). All growths on the vocal cords, irrespective of their nature, change the characteristics of the voice, most of all its clearness. In cases of vocal cord tumors, subjective acoustic analysis, i.e. the perception of the psychophysiological characteristics of the voice, allows the human ear to register pathological phenomena of the voice, but it cannot determine the character of the growth on the vocal cords.


Author(s):  
Sophie K. Scott

The networks of cortical and subcortical fields that contribute to speech production have benefitted from many years of detailed study, and have been used as a framework for human volitional vocal production more generally. In this article, I will argue that we need to consider speech production as an expression of the human voice in a more general sense. I will also argue that the neural control of the voice can and should be considered to be a flexible system, into which more right hemispheric networks are differentially recruited, based on the factors that are modulating vocal production. I will explore how this flexible network is recruited to express aspects of non-verbal information in the voice, such as identity and social traits. Finally, I will argue that we need to widen out the kinds of vocal behaviours that we explore, if we want to understand the neural underpinnings of the true range of sound-making capabilities of the human voice. This article is part of the theme issue ‘Voice modulation: from origin and mechanism to social impact (Part II)’.


2018 ◽  
Vol 7 (2) ◽  
pp. 205-225
Author(s):  
Melanie Strumbl

This article proposes that music and sound do not possess discursive meaning in a hermeneutical sense per se but should be recognized as a performative process. Consequently, music does not necessarily have meaning; rather, its essence is the moving of affects, a process of doing. The realization of music as a body in movement that affects other bodies is crucial to the understanding of the symbiosis of voice, sound, and body. Reading Jean-Luc Nancy's Listening enables a way of thinking about rhythm and timbre, sound, resonance and noise, voice and instrument, and ultimately song altogether that — connected with affect studies — might show the affectivity of resonating bodies and voices. Furthermore, Roland Barthes's essay The Grain of the Voice is an oft-cited and clairvoyant analysis of vocal sound that can be fruitfully combined with Nancy's philosophical treatise on the act of listening. Finally, notions of the affective turn will be linked with post-structuralist hermeneutics of the sonic quality of voice and sound. Against this theoretical backdrop, an endeavor to tackle the specific affective quality of voice and timbre should be made. Spectral analysis serves as an analytical tool to demystify the aesthetic appeal of singers like Janis Joplin, whose renditions are perceived as very emotional and authentic, due to her unique timbre and unique style of singing. Ultimately, the article is aiming towards discovering congruencies between aesthetic judgements on specific vocal artists and the sonic visualization of interpretation and vocal qualities thereof. Firstly, combining those two methodological approaches makes possible an analysis of the affective quality, the jouissance of an artist's mesmerizing voice and their aesthetic charm and, secondly, proves useful for cultural historians in terms of approaching an aesthetic phenomenon that has had relevance throughout the history of popular music.


2019 ◽  
Vol 26 (1) ◽  
pp. e100075 ◽  
Author(s):  
Emily Couvillon Alagha ◽  
Rachel Renee Helbing

Objective: To assess the quality and accuracy of the voice assistants (VAs), Amazon Alexa, Siri and Google Assistant, in answering consumer health questions about vaccine safety and use. Methods: Responses of each VA to 54 questions related to vaccination were scored using a rubric designed to assess the accuracy of each answer provided through audio output and the quality of the source supporting each answer. Results: Out of a total of 6 possible points, Siri averaged 5.16 points, Google Assistant averaged 5.10 points and Alexa averaged 0.98 points. Google Assistant and Siri understood voice queries accurately and provided users with links to authoritative sources about vaccination. Alexa understood fewer voice queries and did not draw answers from the same sources that were used by Google Assistant and Siri. Conclusions: Those involved in patient education should be aware of the high variability of results between VAs. Developers and health technology experts should also push for greater usability and transparency about information partnerships as the health information delivery capabilities of these devices expand in the future.


2021 ◽  
Vol 36 (2) ◽  
pp. 65-97
Author(s):  
Lisa Åkervall

This essay takes the auto-tuned viral video “Can't Hug Every Cat” as a point of entry for a broader analysis of how modulation decisively shapes politics, aesthetics, and gendering in contemporary digital ecologies. It uncovers how the exaggerated exhibitions of feminine vocal modulation in “Can't Hug Every Cat” entangle with generational feminist anxieties over gendered forms of articulation such as “sexy baby voice” and “upspeak.” It argues that the problematic of the modulated voice is both technologically and thematically central to political, technological, aesthetic, and gendered genealogies of media-technical modulation. The modulated voice given such extraordinary staging in “Can't Hug Every Cat” is therefore restored to the longer history of voice modulation, which is itself closely tied to the rise of control societies and digital media. In this perspective, techniques of voice modulation and social modulation are tandem technologies. The voice modulation that has figured prominently in media cultures in recent decades, from the music of Cher to T-Pain and beyond, is not merely a consequence of digital media and control societies but is also integral to their conditions of possibility. In this light, the rise of technologies for the modulation of the human voice since the nineteenth century is intertwined with the rise of new economic, political, and medical systems of control.


2003 ◽  
Vol 42 (147) ◽  
pp. 133-136
Author(s):  
Toran KC ◽  
S Shrestha

Medialization technique has remained a mainstay for the treatment of glottal insufficiency. This form of laryngeal framework surgery not only improves the quality of voice but also protects the lungs from aspiration. We present six patients who underwent vocal cord medialization surgery. Of the six patients, only one required revision surgery. Since this form of surgery is performed under local anesthesia, the quality of the voice can be assessed per-operatively. This kind of surgery appears to be still new in the Nepalese context. Key Words: Isshiki Thyroplasty, Laryngoplasty, Laryngeal framework surgery, vocal cord paralysis, Voice disorders, glottal insufficiency.

