Applying Multitask Deep Learning to Emotion Recognition in Speech

2021 ◽  
Vol 25 (1) ◽  
pp. 82-109
Author(s):  
A. V. Ryabinov ◽  
M. Yu. Uzdiaev ◽  
I. V. Vatamaniuk

Purpose of research. Emotions play a key role in the regulation of human behaviour. Solving the problem of automatic emotion recognition makes it possible to increase the effectiveness of a whole range of digital systems, such as security systems, human-machine interfaces, and e-commerce systems. At the same time, modern approaches to recognizing emotions in speech remain inefficient. This work studies automatic recognition of emotions in speech using machine learning methods. Methods. The article describes and tests an approach to automatic recognition of emotions in speech based on multitask learning of deep convolutional neural networks of the AlexNet and VGG architectures, with automatic selection of the weight coefficient for each task when calculating the final loss value during training. All the models were trained on a sample of the IEMOCAP dataset with four emotional categories: ‘anger’, ‘happiness’, ‘neutral emotion’, and ‘sadness’. Log-mel spectrograms of utterances, processed by a specialized algorithm, are used as input data. Results. The models were evaluated on numerical metrics: the share of correctly recognized instances (accuracy), precision, recall, and F-measure. For all of these metrics, the proposed model improved the quality of emotion recognition in comparison with the two baseline single-task models as well as with known solutions. This result is achieved through automatic weighting of the loss values from individual tasks when forming the final error value during training. Conclusion. The resulting improvement in the quality of emotion recognition in comparison with known solutions confirms the feasibility of applying multitask learning to increase the accuracy of emotion recognition models. The developed approach makes it possible to achieve a uniform and simultaneous reduction of the errors of individual tasks, and is used in the field of speech emotion recognition for the first time.
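
The abstract does not say which scheme is used to weight the task losses automatically. As an illustration only, the sketch below assumes one common choice, homoscedastic-uncertainty weighting, in which each task i has a learnable value s_i and the total loss is the sum of exp(-s_i) * L_i + s_i. The two task losses, the learning rate, and all function names are hypothetical, and the losses are held fixed here, whereas in real training they would change together with the network weights.

```python
import math


def weighted_total_loss(task_losses, log_vars):
    """Total loss: sum over tasks of exp(-s_i) * L_i + s_i (assumed scheme)."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def log_var_gradients(task_losses, log_vars):
    """Derivative of the total loss with respect to each s_i."""
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]


def task_weights(log_vars):
    """Effective weight exp(-s_i) applied to each task loss."""
    return [math.exp(-s) for s in log_vars]


# Hypothetical, fixed per-task loss values (e.g. a main and an auxiliary task).
task_losses = [2.0, 0.5]
log_vars = [0.0, 0.0]      # s_i = 0 means every task starts with weight 1
learning_rate = 0.1        # illustrative value

# Plain gradient descent on the s_i only.
for _ in range(500):
    grads = log_var_gradients(task_losses, log_vars)
    log_vars = [s - learning_rate * g for s, g in zip(log_vars, grads)]

weights = task_weights(log_vars)
print([round(w, 3) for w in weights])
```

With fixed losses, each weight settles at the reciprocal of its task loss, so every weighted term contributes equally to the total. That is one way to read the "uniform and simultaneous reduction of errors" the abstract describes, though the authors' actual mechanism may differ.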

2019 ◽  
pp. 19-23
Author(s):  
I. M. Sinyaeva

This article investigates modern marketing communications, emphasizing the importance of a communication management model aimed at improving an organization's market performance. In examining the transformation of the national economy, the article notes the trends toward concentration of industrial and financial capital, digitalization, and customer focus in national business. The author highlights the dominance of Internet advertising in the modern communications mix: in 2018 it overtook TV advertising in sales for the first time. The scientific novelty lies in the proposed model for managing a modern complex of marketing communications, which will allow management to improve the quality of customer service. The author also stresses the importance of activating feedback in the proposed model so that management decisions can be adjusted, which serves as a guarantor of demand formation, the organization's financial stability, and its image.


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Laith H. Baniata ◽  
Seyoung Park ◽  
Seong-Bae Park

In this research article, we study the problem of employing a neural machine translation model to translate Arabic dialects to Modern Standard Arabic. The proposed solution is inspired by the recently proposed recurrent neural network-based encoder-decoder model, which frames machine translation as a sequence learning problem. We propose a multitask learning (MTL) model in which one decoder is shared among all language pairs and every source language has a separate encoder. The proposed model can be applied to limited volumes of data as well as to extensive amounts of data. Experiments show that the proposed MTL model yields a higher quality of translation than individually trained models.
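
The parameter-sharing layout described above (one encoder per source language, a single decoder shared by all pairs) can be sketched without any neural network code. The example below only tracks which component a training step would update. The class names, the dialect labels, and the alternating training schedule are all hypothetical and are not taken from the paper.

```python
class Component:
    """Stand-in for a trainable module; counts how often it is updated."""

    def __init__(self, name):
        self.name = name
        self.updates = 0


class MultitaskTranslator:
    """One encoder per source language, one decoder shared by every pair."""

    def __init__(self, source_languages):
        self.encoders = {lang: Component("encoder_" + lang) for lang in source_languages}
        self.decoder = Component("shared_decoder")

    def route(self, source_language):
        """Pick the language-specific encoder and the shared decoder."""
        return self.encoders[source_language], self.decoder

    def train_step(self, source_language):
        """A batch from one language updates its own encoder and the decoder."""
        encoder, decoder = self.route(source_language)
        encoder.updates += 1
        decoder.updates += 1


# Hypothetical dialect labels and batch order.
model = MultitaskTranslator(["dialect_a", "dialect_b"])
for language in ["dialect_a", "dialect_b", "dialect_a"]:
    model.train_step(language)

print(model.decoder.updates, {k: v.updates for k, v in model.encoders.items()})
```

The shared decoder receives an update from every batch regardless of source language, which is why a low-resource pair can benefit from data in the other pairs.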


2010 ◽  
Vol 3 (1) ◽  
pp. 8-16 ◽  
Author(s):  
Atma Ram Singh ◽  
Debapriya Choudhury ◽  
Gora Ghosh ◽  
Sunil K. Srivastava

For the first time, an additive for a conducting polymer membrane has been prepared from an Indian coal. Samla coal from the Raniganj coalfield, having 80% carbon on a dmmf basis, was first demineralised chemically and then degraded with an oxidizing agent, after which the resulting oxy coal was solubilized in a polar organic solvent. The soluble part of the coal was dried, and the solid residue, along with polyvinyl alcohol as a neutral base material, was used to make the additive for the conducting polymer membrane in the form of a coloured thin film. The quality of the resulting film/membrane is extremely good, and its conductance is in the range of 10⁻³ to 10⁻⁴ S cm⁻¹. Because Mazumdar's proposed model of coal structure does not explain the increase in carboxylic acid functional groups on oxidation from 1.2% to 18.3%, an anomaly with respect to the 9-10 position of phenanthrene is observed and reported here, and a correction is proposed that explains the experimental results.


2020 ◽  
Vol 5 (2) ◽  
pp. 463-478
Author(s):  
Elizabeth Crais ◽  
Melody Harrison Savage

Purpose: The shortage of doctor of philosophy (PhD)–level applicants to fill academic and research positions in communication sciences and disorders (CSD) programs calls for a detailed examination of current CSD PhD educational practices and the generation of creative solutions. The intended purposes of the article are to encourage CSD faculty to examine their own PhD program practices and consider the perspectives of recent CSD PhD graduates in determining the need for possible modifications. Method: The article describes the results of a survey of 240 CSD PhD graduates and their perceptions of the challenges and facilitators to completing a PhD degree; the quality of their preparation in research, teaching, and job readiness; and ways to improve PhD education. Results: Two primary themes emerged from the data highlighting the need for “matchmaking.” The first time point of needed matchmaking is prior to entry among students, mentors, and expectations as well as between aspects of the program that can lead to students' success and graduation. The second important matchmaking need is between the actual PhD preparation and the realities of the graduates' career expectations, and those placed on graduates by their employers. Conclusions: Within both themes, graduates' perspectives and suggestions to help guide future doctoral preparation are highlighted. The graduates' recommendations could be used by CSD PhD program faculty to enhance the quality of their program and the likelihood of student success and completion. Supplemental Material: https://doi.org/10.23641/asha.11991480


Author(s):  
A. V. Ponomarev

Introduction: Large-scale human-computer systems involving people of various skills and motivation in information processing are currently used in a wide spectrum of applications. An acute problem in such systems is assessing the expected quality of each contributor; for example, in order to penalize incompetent or inaccurate ones and to promote diligent ones. Purpose: To develop a method of assessing the expected contributor’s quality in community tagging systems. This method should only use generally unreliable and incomplete information provided by contributors (with ground truth tags unknown). Results: A mathematical model is proposed for community image tagging (including the model of a contributor), along with a method of assessing the expected contributor’s quality. The method is based on comparing tag sets provided by different contributors for the same images, being a modification of the pairwise comparison method with the preference relation replaced by a special domination characteristic. Expected contributors’ quality is evaluated as a positive eigenvector of a pairwise domination characteristic matrix. Community tagging simulation has confirmed that the proposed method makes it possible to adequately estimate the expected quality of community tagging system contributors (provided that the contributors' behavior fits the proposed model). Practical relevance: The obtained results can be used in the development of systems based on coordinated efforts of a community (primarily, community tagging systems).
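
The step of evaluating quality "as a positive eigenvector" of a pairwise matrix can be illustrated with power iteration. The abstract does not define the domination characteristic, so the matrix below is made-up data in which entry [i][j] is assumed to express how strongly contributor i dominates contributor j. The function name and the normalization (scores sum to 1) are likewise assumptions.

```python
def principal_eigenvector(matrix, iterations=200, tolerance=1e-12):
    """Power iteration for the dominant eigenvector of a positive square matrix.

    For a matrix with strictly positive entries the dominant eigenvector is
    positive (Perron-Frobenius), so it can be normalized to sum to 1.
    """
    size = len(matrix)
    vector = [1.0 / size] * size
    for _ in range(iterations):
        product = [
            sum(matrix[row][col] * vector[col] for col in range(size))
            for row in range(size)
        ]
        total = sum(product)
        next_vector = [value / total for value in product]
        if max(abs(a - b) for a, b in zip(next_vector, vector)) < tolerance:
            return next_vector
        vector = next_vector
    return vector


# Hypothetical domination values for three contributors (not from the paper).
domination = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

quality = principal_eigenvector(domination)
print([round(score, 4) for score in quality])
```

In this toy matrix the first contributor dominates the other two, so it receives the largest score. The ranking comes from the matrix alone, with no ground-truth tags involved.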


Author(s):  
Mohamad Hossein Pourhanifeh ◽  
Kazem Abbaszadeh-Goudarzi ◽  
Mohammad Goodarzi ◽  
Sara G.M. Piccirillo ◽  
Alimohammad Shafiee ◽  
...  

Melanoma is the most life-threatening and aggressive class of skin malignancies. The incidence of melanoma has steadily increased. Metastatic melanoma is highly resistant to standard anti-melanoma treatments such as chemotherapy, and the 5-year survival rate of patients with metastatic melanoma is less than 10%. The contributing role of apoptosis, angiogenesis and autophagy in the pathophysiology of melanoma has been previously demonstrated. Thus, it is extremely urgent to search for complementary therapeutic approaches that could enhance the quality of life of subjects and reduce treatment resistance and adverse effects. Resveratrol, a polyphenol present in grapes and some other plants, has anti-cancer properties due to its function as an apoptosis inducer in tumor cells and as an anti-angiogenic agent that prevents metastasis. However, more clinical trials should be conducted to prove resveratrol efficacy. Herein, for the first time, we summarize current knowledge of the anti-cancer activities of resveratrol in melanoma.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Domenico Cuda ◽  
Sara Ghiselli ◽  
Alessandra Murri

Background: The prevalence of hearing loss increases with age; it is estimated at 40–50% in people over 75 years of age. Recent studies agree that a decline in hearing threshold contributes to deterioration in sociality, sensitivity, cognition, and quality of life for elderly subjects. The aim of the study presented in this paper is to verify whether rehabilitation with hearing aids (HA) in a cohort of older people with hearing impairment who are first-time HA users improves both speech perception in a noisy environment over time and overall health-related quality of life. Methods: This monocentric, prospective, repeated-measurements, single-subject, clinical observational study is to recruit 100 older adults (≥ 65 years) who are first-time HA recipients. The evaluation protocol is designed to analyze changes in specific measurement tools a year after the first HA usage in comparison with the evaluation before HA fitting. Evaluations will consist of multiparametric details collected through self-report questionnaires completed by the recipients and a series of commonly used audiometric measures and geriatric assessment tools. The primary indicator of changes in speech perception in noise will be the Italian version of the Oldenburg Satz (OLSA) test, whereas the indicators of changes in overall quality of life will be the Assessment of Quality of Life (AQoL) and Hearing Handicap Inventory for the Elderly (HHIE) questionnaires. The Montreal Cognitive Assessment (MoCA) will help in screening the cognitive state of the subjects. Discussion: The protocol is designed to make use of measurement tools that have already been applied to the hearing-impaired population in order to compare the effects of HA rehabilitation in older adults immediately before first HA usage (Pre) and after 1 year of experience (Post). This broad approach will lead to a greater understanding of how useful hearing influences the quality of life in older individuals, and therefore improves the potential for healthy aging. The data are to be analyzed by using an intrasubject endpoint comparison. Outcomes will be described and analyzed in detail. Trial registration: This research was retrospectively registered under no. NCT04333043 at ClinicalTrials.gov (http://www.clinicaltrials.gov/) on 26 March 2020. This research has been registered with the Ethics Committee of the Area Vasta Emilia Nord under number 104, date of approval 17/07/2017.


2021 ◽  
Vol 11 (6) ◽  
pp. 2838
Author(s):  
Nikitha Johnsirani Venkatesan ◽  
Dong Ryeol Shin ◽  
Choon Sung Nam

In the pharmaceutical field, early detection of lung nodules is indispensable for increasing patient survival. We can enhance the quality of medical images by intensifying the radiation dose. However, a high radiation dose provokes cancer, which forces experts to use limited radiation. Using abrupt radiation generates noise in CT scans. We propose an optimal Convolutional Neural Network model in which Gaussian noise is removed for better classification and increased training accuracy. Experiments on the LUNA16 dataset (160 GB) show that our proposed method exhibits superior results. Classification accuracy, specificity, sensitivity, precision, recall, F1 measure, and area under the ROC curve (AUC) are used as evaluation metrics. We compared the performance of our proposed model on several platforms, namely Apache Spark, GPU, and CPU, to reduce the training time without compromising accuracy. Our results show that Apache Spark, integrated with a deep learning framework, is suitable for parallel training computation with high accuracy.
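
Most of the evaluation metrics listed above follow directly from the four counts of a binary confusion matrix. The sketch below uses the standard definitions. The counts are invented for illustration and are not results from the paper, and the function assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes tp + fn, tn + fp and tp + fp are all greater than zero.
    """
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Invented counts: 100 nodule scans and 100 non-nodule scans.
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(name, round(value, 4))
```

Sensitivity and recall are the same quantity, so the abstract's list names it twice. AUC is omitted because it requires the classifier's scores, not just the four counts.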

