Surface Electromyography–Based Recognition, Synthesis, and Perception of Prosodic Subvocal Speech

Author(s):  
Jennifer M. Vojtech ◽  
Michael D. Chan ◽  
Bhawna Shiwani ◽  
Serge H. Roy ◽  
James T. Heaton ◽  
...  

Purpose: This study aimed to evaluate a novel communication system designed to translate surface electromyographic (sEMG) signals from articulatory muscles into speech using a personalized, digital voice. The system was evaluated for word recognition, prosodic classification, and listener perception of synthesized speech. Method: sEMG signals were recorded from the face and neck as speakers with (n = 4) and without (n = 4) laryngectomy subvocally recited (silently mouthed) a speech corpus comprising 750 phrases (150 phrases with variable phrase-level stress). Corpus tokens were then translated into speech via personalized voice synthesis (n = 8 synthetic voices) and compared against phrases produced by each speaker when using their typical mode of communication (n = 4 natural voices, n = 4 electrolaryngeal [EL] voices). Naïve listeners (n = 12) evaluated synthetic, natural, and EL speech for acceptability and intelligibility in a visual sort-and-rate task, as well as phrasal stress discriminability via a classification mechanism. Results: Recorded sEMG signals were processed to translate sEMG muscle activity into lexical content and categorize variations in phrase-level stress, achieving a mean accuracy of 96.3% (SD = 3.10%) and 91.2% (SD = 4.46%), respectively. Synthetic speech was significantly higher in acceptability and intelligibility than EL speech, also leading to greater phrasal stress classification accuracy, whereas natural speech was rated as the most acceptable and intelligible, with the greatest phrasal stress classification accuracy. Conclusion: This proof-of-concept study establishes the feasibility of using subvocal sEMG-based alternative communication not only for lexical recognition but also for prosodic communication in healthy individuals, as well as those living with vocal impairments and residual articulatory function. Supplemental Material: https://doi.org/10.23641/asha.14558481

Author(s):  
Konstantinos Travlos

Abstract: I argue that insulation via managerial coordination is a key element in any explanation of the formation of political regions among states. Its key role is as a tool for maintaining intra-regional pacific relations in the face of diffusion and contagion processes that result from continued security linkages with excluded extra-regional states. To explore these dynamics, I propose a reconceptualization of managerial coordination based on the basic framework concept mapping tool. This clarifies what managerial coordination does as a dimension of insulation. It also necessitates a revamp of the scale of interstate managerial coordination as an instrument for measuring the intensity of collective intentionality toward insulation among the members of a region. I then map the region concept of durable security complex (DSC) as the scope for the enactment of managerial coordination, based on a review of existing region concepts in the new regionalist literature. I then conduct an idiographic proof-of-concept exercise on three DSCs in the presence or absence of managerial coordination. These are the Scandinavian states, the South Asian regional security complex, and the South American Northern Tier local hierarchy. The exercise provides indicators for a number of theoretical propositions worthy of future evaluation.


2018 ◽  
Vol 11 (5) ◽  
pp. 135
Author(s):  
Carmen Echazarreta Soler ◽  
Albert Costa Marcé

Economic crises have mainly affected the more vulnerable social sectors and created losses of freedom and inequality. Currently, most media are controlled by a relatively small group of companies around the world. In the face of this situation, networked society has accelerated the development of alternative communication models, which act as loudspeakers for citizens’ voices. The aim of this study is to describe the main features of the new forms of citizen expression, communication and cooperation, such as social networks, review sites, citizen journalism and the collaborative economy. It is concluded that in the face of these new challenges it is essential to continue to develop ethical principles of self-regulation to ensure the accuracy and thoroughness of new forms of communication on the Net.


2019 ◽  
Vol 2019 ◽  
pp. 1-9 ◽  
Author(s):  
Muhammad Tayab Khan ◽  
Hafeez Anwar ◽  
Farman Ullah ◽  
Ata Ur Rehman ◽  
Rehmat Ullah ◽  
...  

We propose drowsiness detection in real-time surveillance videos by determining whether a person’s eyes are open or closed. As a first step, the face of the subject is detected in the image. In the detected face, the eyes are localized and filtered with an extended Sobel operator to detect the curvature of the eyelids. Once the curves are detected, their concavity is used to tell whether the eyelids are closed or open: a concave-upward curve means the eyelid is closed, whereas a concave-downward curve means the eye is open. The proposed method is also implemented in hardware so that it can be used in real-time scenarios, such as driver drowsiness detection. The method was evaluated on three image datasets. Images in the first dataset have a uniform background, and the method achieved classification accuracy of up to 95% on this dataset. A second, benchmark dataset has significant variations due to face deformations; on this dataset, the method achieved classification accuracy of 70%. A real-time video dataset of people driving a car was also used, where the proposed method achieved 95% accuracy, showing its feasibility for use in real-time scenarios.
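The concavity test described in this abstract can be illustrated with a small sketch. It assumes the eyelid edge has already been extracted as a list of (x, y) points, and it fits a parabola y = a·x² + b·x + c by least squares, using the sign of `a` as the concavity. The abstract does not say how the authors measure concavity, so the quadratic fit, the function names, and the tolerance `eps` are all assumptions made for illustration. The points are taken with y increasing upward; pixel row indices would need to be negated first.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def leading_coefficient(points):
    """Least-squares 'a' in y = a*x^2 + b*x + c (hypothetical helper)."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three edge points")
    sx = sum(x for x, _ in points)
    sx2 = sum(x ** 2 for x, _ in points)
    sx3 = sum(x ** 3 for x, _ in points)
    sx4 = sum(x ** 4 for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sx2y = sum(x * x * y for x, y in points)
    # Normal equations of the quadratic fit, solved for 'a' by Cramer's rule.
    m = [[sx4, sx3, sx2], [sx3, sx2, sx], [sx2, sx, n]]
    d = _det3(m)
    if d == 0:
        raise ValueError("points do not determine a parabola")
    ma = [[sx2y, sx3, sx2], [sxy, sx2, sx], [sy, sx, n]]
    return _det3(ma) / d


def classify_eyelid(points, eps=1e-9):
    """Map concavity to eye state, following the rule stated in the abstract.

    Assumes y grows upward (Cartesian); negate pixel rows before calling.
    """
    a = leading_coefficient(points)
    if a > eps:
        return "closed"       # concave upward
    if a < -eps:
        return "open"         # concave downward
    return "undetermined"     # nearly straight edge
```

For example, `classify_eyelid([(x, x * x) for x in range(-2, 3)])` yields `"closed"` under this convention.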


2019 ◽  
Vol 14 (2) ◽  
pp. 158-164 ◽  
Author(s):  
G. Emayavaramban ◽  
A. Amudha ◽  
T. Rajendran ◽  
M. Sivaramkumar ◽  
K. Balachandar ◽  
...  

Background: Identifying user suitability plays a vital role in various modalities such as neuromuscular system research, rehabilitation engineering and movement biomechanics. This paper analyzes user suitability based on neural networks (NN), subjects, age groups and gender for a surface electromyogram (sEMG) pattern recognition system that controls a myoelectric hand. Six parametric feature extraction algorithms are used to extract features from the sEMG signals: AR (Autoregressive) Burg, AR Yule Walker, AR Covariance, AR Modified Covariance, Levinson Durbin Recursion and Linear Prediction Coefficient. The sEMG signals are modeled using a Cascade Forward Back propagation Neural Network (CFBNN) and a Pattern Recognition Neural Network. Methods: sEMG signals generated from the forearm muscles of the participants are collected through an sEMG acquisition system. Based on the sEMG signals, the type of movement attempted by the user is identified in the sEMG recognition module using signal processing, feature extraction and machine learning techniques. The information about the identified movement is passed to a microcontroller, in which a controller commands the prosthetic hand to emulate the identified movement. Results: Among the six feature extraction algorithms and two neural network models used in the study, the maximum classification accuracy of 95.13% was obtained using AR Burg with the Pattern Recognition Neural Network. This suggests that the Pattern Recognition Neural Network is best suited for this study, as this neural network model is specifically designed for pattern-matching problems. Moreover, it has a simple architecture and low computational complexity. AR Burg is found to be the best feature extraction technique in this study due to its high resolution for short data records and its ability to always produce a stable model.
In all the neural network models, the maximum classification accuracy is obtained for subject 10 as a result of his better muscle fitness and his maximum involvement in training sessions. Subjects in the age group of 26-30 years are best suited for the study due to their better muscle contractions. Better muscle fatigue resistance contributed to the better performance of female subjects compared to male subjects. From the single trial analysis, it can be observed that the hand close movement achieved the best recognition rate for all neural network models. Conclusion: In this paper, a study was conducted to identify user suitability for designing a hand prosthesis. Data were collected from ten subjects for twelve tasks related to finger movements. The suitability of the user was identified using two neural networks with six parametric features. From the results, it was concluded that fit women aged 26-30 years who exercise regularly are best suited for developing the HMI of a prosthetic hand. A Pattern Recognition Neural Network with AR Burg features using extension movements will be a better way to design the HMI. However, signal acquisition based on a wireless method is worth considering for the future.
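Two of the six feature extractors named in this abstract, AR Yule Walker and Levinson Durbin Recursion, can be sketched together: the Yule-Walker equations relate the AR coefficients to the signal's autocorrelation, and the Levinson-Durbin recursion solves them efficiently. The sketch below is not the AR Burg method that performed best in the study, and it is not the authors' code. The function names, the biased autocorrelation estimator, and the model order are assumptions.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of a sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for an AR model of the given order.

    Returns (a, err), where the model is x[n] ~ sum(a[k-1] * x[n-k]) for
    k = 1..order, and err is the final prediction-error power.
    """
    if r[0] == 0:
        raise ValueError("zero-power signal")
    a = []
    err = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for stage m.
        acc = r[m] - sum(a[j] * r[m - 1 - j] for j in range(m - 1))
        k = acc / err
        # Update the previous coefficients, then append the new one.
        a = [a[j] - k * a[m - 2 - j] for j in range(m - 1)] + [k]
        err *= (1.0 - k * k)
    return a, err


def ar_features(signal, order=4):
    """AR coefficients of one sEMG window, used as a feature vector."""
    coeffs, _ = levinson_durbin(autocorr(signal, order), order)
    return coeffs
```

In a pipeline like the one the abstract outlines, `ar_features` would be applied to each windowed sEMG channel, and the coefficients concatenated before being passed to the classifier.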


2014 ◽  
Vol 631-632 ◽  
pp. 474-477
Author(s):  
Hui Yun Xiong ◽  
Juan Zhao

Image recognition has been a research hotspot in the field of machine learning. This paper proposes a cascade algorithm based on SVM and AdaBoost. The algorithm first preprocesses the selected samples and segments each image into regions using a fixed-size window, then extracts Haar-like rectangular features using the integral image method, and finally uses an AdaBoost cascade classifier together with SVM training for classification. Face recognition experiments show that the AdaBoost cascade of SVMs improves classification accuracy and markedly reduces the error rate.
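The integral image technique mentioned for Haar-like feature extraction can be sketched as follows. The integral image stores cumulative sums, so the sum over any rectangle needs only four lookups, and a two-rectangle Haar-like feature is the difference of two such sums. The function names and the choice of a horizontal two-rectangle feature are illustrative assumptions; the abstract does not specify which Haar-like templates were used.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[r][c] = sum of img[:r][:c]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, using four integral-image lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Left half minus right half of a window (width is assumed even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

Because the table is built once per image, every candidate window and feature position reuses it, which is what makes exhaustive Haar-like feature evaluation affordable.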


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5385
Author(s):  
Tianyang Zhong ◽  
Donglin Li ◽  
Jianhui Wang ◽  
Jiacan Xu ◽  
Zida An ◽  
...  

Surface electromyogram (sEMG) signals have been used in human motion intention recognition, which has significant application prospects in rehabilitation medicine and cognitive science. However, some valuable dynamic information on upper-limb motions is lost during feature extraction from sEMG signals, only a small variety of rehabilitation movements can be distinguished, and classification accuracy is easily affected. To address these problems, a multiscale time–frequency information fusion representation method (MTFIFR) is first proposed to obtain the time–frequency features of multichannel sEMG signals. Next, a multiple feature fusion network (MFFN) is designed to strengthen feature extraction. Finally, a deep belief network (DBN) is introduced as the classification model of the MFFN to boost generalization performance across more types of upper-limb movements. In the experiments, 12 kinds of upper-limb rehabilitation actions were recognized using four sEMG sensors. The maximum identification accuracy was 86.10% and the average classification accuracy of the proposed MFFN was 73.49%, indicating that the time–frequency representation approach combined with the MFFN outperforms traditional machine learning and convolutional neural network approaches.


Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 672 ◽  
Author(s):  
Lin Chen ◽  
Jianting Fu ◽  
Yuheng Wu ◽  
Haochen Li ◽  
Bin Zheng

By training a deep neural network model, the hidden features in surface electromyography (sEMG) signals can be extracted, and human motion intention can be predicted from the analysis of sEMG. However, recently proposed models often have a large number of parameters. We therefore designed a compact convolutional neural network (CNN) model that not only improves classification accuracy but also reduces the number of parameters. The proposed model was validated on the Ninapro DB5 Dataset and the Myo Dataset, where it achieved good gesture recognition accuracy.
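Since the abstract's central claim concerns parameter count, a short sketch of how the parameters of a CNN are tallied may help. The formulas are the standard ones for convolutional and fully connected layers. The layer shapes in the usage note are hypothetical, because the abstract does not describe the architecture.

```python
def conv2d_params(in_channels, out_channels, kernel_h, kernel_w, bias=True):
    """Trainable parameters of a standard 2D convolution layer."""
    per_filter = in_channels * kernel_h * kernel_w + (1 if bias else 0)
    return out_channels * per_filter


def dense_params(in_features, out_features, bias=True):
    """Trainable parameters of a fully connected layer."""
    return out_features * (in_features + (1 if bias else 0))


def total_params(layer_counts):
    """Sum the per-layer counts of a model described as a list of ints."""
    return sum(layer_counts)
```

As an illustration with made-up shapes, `total_params([conv2d_params(1, 16, 3, 3), dense_params(128, 10)])` adds 160 convolutional and 1290 dense parameters. Most of the count sits in the dense layer here, which is one reason compact designs often shrink or replace fully connected layers.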


Sensors ◽  
2019 ◽  
Vol 19 (10) ◽  
pp. 2370 ◽  
Author(s):  
Hyun-Joon Yoo ◽  
Hyeong-jun Park ◽  
Boreom Lee

Surface electromyography (sEMG) signals comprise electrophysiological information related to muscle activity. As this signal is easy to record, it is used to control several myoelectric prosthetic devices. Several studies have been conducted to process sEMG signals more efficiently. However, research on optimal algorithms and electrode placements for the processing of sEMG signals is still inconclusive. In addition, very few studies have focused on minimizing the number of electrodes. In this study, we investigated the most effective method for myoelectric signal classification with a small number of electrodes. A total of 23 subjects participated in the study, and sEMG data for 14 different hand movements were acquired from targeted muscles and untargeted muscles. The study compared the classification accuracy of the sEMG data using discriminative feature-oriented dictionary learning (DFDL) and other conventional classifiers. DFDL demonstrated the highest classification accuracy among the classifiers, and its advantage became more apparent as the number of channels decreased. The targeted method was superior to the untargeted method, particularly when classifying sEMG signals with DFDL. Therefore, it was concluded that the combination of the targeted method and the DFDL algorithm could classify myoelectric signals more effectively with a minimal number of channels.


2021 ◽  
Vol 25 (6) ◽  
pp. 1603-1627
Author(s):  
Xiao Yao ◽  
Zhengyan Sheng ◽  
Min Gu ◽  
Haibin Wang ◽  
Ning Xu ◽  
...  

To improve the robustness of speech recognition systems, this study attempts to classify stressed speech caused by psychological stress under multitasking workloads. Because stressed speech is transient and ambiguous, stress characteristics are not present in every segment of speech labeled as stressed. In this paper, we propose a multi-feature fusion model based on the attention mechanism to measure the importance of segments for stress classification. Through the attention mechanism, each speech frame is weighted to reflect its correlation with the actual stressed state, and features characterizing the stressed speech are fused across multiple channels to classify the speech under stress. The proposed model further applies SpecAugment to the feature spectrum for data augmentation, to address the small sample size of stressed speech data. In the experiments, we compared the proposed model with traditional methods on the CASIA Chinese emotion corpus and the Fujitsu stressed speech corpus, and the results show that the proposed model performs better in speaker-independent stress classification. Transfer learning is also performed for speaker-dependent classification of stressed speech, and it improves performance. Compared with traditional methods, the attention mechanism shows an advantage for continuous speech under stress in authentic contexts.
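The frame-weighting idea in this abstract can be sketched as attention pooling: each frame receives a relevance score, the scores are normalized with a softmax, and the frame features are averaged with those weights. The abstract does not say how the scores are produced or how the channels are fused, so this sketch takes the scores as given inputs and covers only the weighting step. All names are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, scores):
    """Weighted average of per-frame feature vectors.

    frames: list of equal-length feature vectors, one per speech frame.
    scores: one relevance score per frame (in a trained model these would
            come from a learned scoring layer; here they are plain inputs).
    Returns (pooled_vector, weights).
    """
    if len(frames) != len(scores):
        raise ValueError("need exactly one score per frame")
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights
```

With equal scores the pooling reduces to a plain mean over frames, while a single large score lets one frame dominate the utterance-level representation. That is the behavior wanted when only some segments of a labeled utterance actually carry stress.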


2020 ◽  
Vol 4 (2) ◽  
pp. 25
Author(s):  
Amin Derakhshan ◽  
Mohammad Mikaeili ◽  
Tom Gedeon ◽  
Ali Motie Nasrabadi

Facial thermal imaging is a non-contact technology which can be useful for ubiquitous deceptive anxiety recognition. To date, studies investigating this technology have produced equivocal results in classification accuracy and finding the most correlated regions on the face. This study was conducted using our dataset with 41 subjects using two different protocols and three modalities (thermal, GSR and PPG). We selected and tracked five regions of interest (ROI) on each facial thermal imprint including periorbital, forehead, cheek, perinasal and chin that were mostly used in previous papers. By employing six statistical features, four feature reduction techniques and three classifiers, we attempted to identify the ROIs which are mostly associated with activation of the sympathetic nervous system to increase the final classification accuracy rate. The results of linear classification models show significant improvement of classification accuracy by using ROC feature selection method. We achieved 90.1% and 74.7% accuracy rate for thermal features in mock crime and best friend scenarios, respectively. Our experimental results show that perinasal and cheek areas have greater discriminatory power in comparison with other ROIs on the face.

