Speech Interfaces
Recently Published Documents


TOTAL DOCUMENTS: 82 (FIVE YEARS: 16)

H-INDEX: 10 (FIVE YEARS: 2)

Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 649
Author(s):  
David Ferreira ◽  
Samuel Silva ◽  
Francisco Curado ◽  
António Teixeira

Speech is our most natural and efficient form of communication and offers strong potential to improve how we interact with machines. However, speech communication can be limited by environmental (e.g., ambient noise), contextual (e.g., need for privacy), or health conditions (e.g., laryngectomy) that rule out audible speech. In this regard, silent speech interfaces (SSI) have been proposed as an alternative, relying on technologies that do not require the production of an acoustic signal (e.g., electromyography and video). Unfortunately, despite their abundance, many still face limitations for everyday use, e.g., being intrusive or non-portable, or raising technical (e.g., lighting conditions for video) or privacy concerns. To address this need, this article explores contactless continuous-wave radar and assesses its potential for SSI development. A corpus of 13 European Portuguese words was acquired from four speakers, three of whom enrolled in a second acquisition session three months later. For the speaker-dependent models, trained and tested with data from each speaker using 5-fold cross-validation, average accuracies of 84.50% and 88.00% were obtained with the Bagging (BAG) and Linear Regression (LR) classifiers, respectively. Additionally, recognition accuracies of 81.79% and 81.80% were achieved for the session-independent and speaker-independent experiments, respectively, establishing promising grounds for further exploring this technology for silent speech recognition.
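The speaker-dependent evaluation described above is a standard classification-with-cross-validation setup. Below is a minimal, hedged sketch of that pattern in Python with scikit-learn: the radar feature matrix and word labels are synthetic placeholders (the paper's features are not reproduced), and "LR" is implemented here as LogisticRegression, which is an assumption about how the abstract's "Linear Regression (LR)" classifier is realized.

    # Sketch: 5-fold cross-validation of Bagging and "LR" classifiers on
    # placeholder radar features, loosely following the abstract's setup.
    import numpy as np
    from sklearn.ensemble import BaggingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_utterances, n_features, n_words = 260, 64, 13   # 13-word corpus (assumed sizes)
    X = rng.normal(size=(n_utterances, n_features))   # placeholder radar features
    y = rng.integers(0, n_words, size=n_utterances)   # placeholder word labels

    for name, clf in [("BAG", BaggingClassifier(n_estimators=50, random_state=0)),
                      ("LR", LogisticRegression(max_iter=1000))]:
        scores = cross_val_score(clf, X, y, cv=5)      # 5-fold cross-validation
        print(f"{name}: mean accuracy = {scores.mean():.2%}")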


Electronics ◽  
2021 ◽  
Vol 10 (19) ◽  
pp. 2371
Author(s):  
Minho Kim ◽  
Youngim Jung ◽  
Hyuk-Chul Kwon

Speech processing technology has great potential in the medical field to provide beneficial solutions for both patients and doctors. Speech interfaces, represented by speech synthesis and speech recognition, can be used to transcribe medical documents, control medical devices, correct speech and hearing impairments, and assist the visually impaired. However, accurate prediction of prosody phrase boundaries is essential for natural speech synthesis. This study proposes a method to build a reliable learning corpus for training deep-learning-based prosody boundary prediction models. In addition, we present a way to derive a rule-based model that predicts prosody boundaries from the constructed corpus and use its output to train a deep-learning-based model. As a result, we built a coherent corpus even though many annotators participated in its development: the estimated pairwise agreement of the corpus annotations is between 0.7477 and 0.7916, and the kappa coefficient (K) is between 0.7057 and 0.7569. The deep-learning-based model trained on the rules obtained from the corpus achieved a prediction accuracy of 78.57% for the three-level prosody phrase boundary and 87.33% for the two-level prosody phrase boundary.
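The agreement figures quoted above (pairwise agreement and kappa) measure how consistently annotators labeled prosody boundaries. The following is a minimal sketch of computing both measures for two hypothetical annotators; the label sequences are invented placeholders and do not reproduce the paper's corpus or its exact agreement procedure.

    # Sketch: inter-annotator agreement on prosody-boundary labels
    # (0 = no boundary, 1 = minor boundary, 2 = major boundary).
    from sklearn.metrics import cohen_kappa_score

    annotator_a = [0, 1, 0, 2, 1, 0, 0, 2, 1, 0]
    annotator_b = [0, 1, 0, 2, 0, 0, 1, 2, 1, 0]

    pairwise = sum(a == b for a, b in zip(annotator_a, annotator_b)) / len(annotator_a)
    kappa = cohen_kappa_score(annotator_a, annotator_b)
    print(f"pairwise agreement = {pairwise:.4f}, kappa = {kappa:.4f}")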


2021 ◽  
Author(s):  
Beiming Cao ◽  
Nordine Sebkhi ◽  
Arpan Bhavsar ◽  
Omer T. Inan ◽  
Robin Samlan ◽  
...  

2021 ◽  
Author(s):  
Amin Honarmandi Shandiz ◽  
László Tóth ◽  
Gábor Gosztolya ◽  
Alexandra Markó ◽  
Tamás Gábor Csapó

2021 ◽  
Author(s):  
Inma Hernaez ◽  
Jose Andrés González-López ◽  
Eva Navas ◽  
Jose Luis Pérez Córdoba ◽  
Ibon Saratxaga ◽  
...  

Author(s):  
Leigh Clark ◽  
Benjamin R. Cowan ◽  
Abi Roper ◽  
Stephen Lindsay ◽  
Owen Sheers
Keyword(s):  

Author(s):  
Hans-Christian Schmitz ◽  
Frank Kurth ◽  
Kevin Wilkinghoff ◽  
Uwe Müllerschkowski ◽  
Christian Karrasch ◽  
...  
Keyword(s):  
