Speech Identification
Recently Published Documents

TOTAL DOCUMENTS: 144 (five years: 36)
H-INDEX: 21 (five years: 2)

Author(s): Youssef Elfahm, Nesrine Abajaddi, Badia Mounir, Laila Elmaazouzi, Ilham Mounir, et al.

Many technology systems use voice-recognition applications to transcribe a speaker's speech into text. One of the most complex tasks in speech identification is determining which acoustic cues should be used to classify sounds. This study presents an approach for characterizing Arabic fricative consonants in two groups (sibilant and non-sibilant). From an acoustic point of view, the approach analyzes the energy distribution across frequency bands within a consonant-vowel syllable. From a practical point of view, the technique was implemented in MATLAB and tested on a corpus built in our laboratory. The results show that the percentage energy distribution in a speech signal is a very powerful parameter for classifying Arabic fricatives. We obtained an accuracy of 92% for the non-sibilant consonants /f, χ, ɣ, ʕ, ħ, h/, 84% for the sibilants /s, sˤ, z, ʒ, ʃ/, and 89% overall. Compared with other algorithms based on neural networks and support vector machines (SVM), our classification system achieved a higher classification rate.
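The band-energy cue described above can be sketched in a few lines. This is a pure-Python toy, not the paper's implementation: the 4 kHz band split and the 50% threshold are illustrative assumptions, and the input is already a magnitude spectrum rather than a raw waveform.

```python
def band_energy_fractions(spectrum, sample_rate, band_edges):
    """Percentage of total spectral energy falling in each frequency band.

    `spectrum` is a list of magnitude values for bins covering
    0..sample_rate/2 Hz; `band_edges` is a list of (low_hz, high_hz) pairs.
    """
    n = len(spectrum)
    bin_hz = (sample_rate / 2) / n          # width of one frequency bin
    total = sum(m * m for m in spectrum) or 1.0
    fractions = []
    for lo, hi in band_edges:
        lo_bin = int(lo / bin_hz)
        hi_bin = min(int(hi / bin_hz), n)
        energy = sum(m * m for m in spectrum[lo_bin:hi_bin])
        fractions.append(100.0 * energy / total)
    return fractions

def is_sibilant(spectrum, sample_rate, high_band=(4000, 8000), threshold=50.0):
    """Toy rule: call the fricative sibilant if most energy sits high.

    Sibilants (/s, z, ʃ, .../) concentrate energy at high frequencies,
    which is the intuition behind the percentage-energy parameter.
    """
    (high_pct,) = band_energy_fractions(spectrum, sample_rate, [high_band])
    return high_pct > threshold
```

In practice the spectrum would come from an FFT of the fricative portion of the consonant-vowel syllable, and the band edges and threshold would be tuned on labelled data.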


Author(s): Edward Ombui, Lawrence Muchemi, Peter Wagacha

This study examines the problem of hate speech identification in codeswitched text from social media using a natural language processing approach. It explores different features in training nine models and empirically evaluates their predictiveness in identifying hate speech in a ~50k human-annotated dataset. The study espouses a novel, hierarchical approach that employs Latent Dirichlet Allocation to generate topic models, which in turn help build a high-level psychosocial feature set we abbreviate PDC. PDC groups similar-meaning words into word families, which is significant for capturing codeswitching during the preprocessing stage for supervised learning models. The high-level PDC features are based on a hate speech annotation framework [1] that is largely informed by the duplex theory of hate [2]. Results from frequency-based models using the PDC feature on a dataset comprising tweets generated during the 2012 and 2017 presidential elections in Kenya indicate an f-score of 83% (precision: 81%, recall: 85%) in identifying hate speech. The study is significant in that it publicly shares a unique codeswitched dataset for hate speech that is valuable for comparative studies. Secondly, it provides a methodology for building a novel PDC feature set to identify nuanced forms of hate speech, camouflaged in codeswitched data, which conventional methods could not adequately identify.
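The word-family preprocessing step might look like the following sketch. The family labels and the English/Swahili surface forms here are invented placeholders, not the paper's actual PDC vocabulary (which is derived from LDA topic models); the point is only the mechanism of collapsing codeswitched variants onto one feature.

```python
# Hypothetical word families: each maps codeswitched surface forms to one
# canonical family label, so a frequency-based model counts them as a
# single feature regardless of the language a tweet switches into.
WORD_FAMILIES = {
    "violence": {"fight", "pigana", "vita"},
    "exclusion": {"fukuza", "waende", "leave"},
}

# Invert to a lookup table: surface form -> family label.
FORM_TO_FAMILY = {
    form: family
    for family, forms in WORD_FAMILIES.items()
    for form in forms
}

def to_family_tokens(tokens):
    """Replace each known surface form with its family label; keep the rest."""
    return [FORM_TO_FAMILY.get(t.lower(), t.lower()) for t in tokens]
```

After this mapping, a bag-of-words model sees "pigana" and "fight" as the same high-level feature, which is how grouping mitigates data sparsity in codeswitched text.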


Sensors, 2021, Vol. 21 (23), p. 7859
Author(s): Fernando H. Calderón, Namrita Balani, Jherez Taylor, Melvyn Peignon, Yen-Hao Huang, et al.

The permanent transition to online activity has brought with it a surge in hate speech discourse. This has prompted increased calls for automatic detection methods, most of which currently rely on a dictionary of hate speech words and supervised classification. This approach often falls short when dealing with newer words and phrases produced by online extremist communities. These code words are used with the aim of evading automatic detection by such systems. Code words are frequently used with benign meanings in regular discourse; for instance, "skypes, googles, bing, yahoos" are all examples of words that have acquired a hidden hate speech meaning. Such overlap presents a challenge to the traditional keyword approach of collecting data that is specific to hate speech. In this work, we first introduce a word embedding model that learns the hidden hate speech meaning of words. With this insight into code words, we develop a classifier that leverages linguistic patterns to reduce the impact of individual words. The proposed method was evaluated across three different datasets to test its generalizability. The empirical results show that the linguistic patterns approach outperforms the baselines and enables further analysis of hate speech expressions.
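One way to operationalize "a word embedding model that learns the hidden meaning" is to compare a word's nearest neighbours in embeddings trained on a general corpus versus a community corpus. The sketch below uses tiny hand-made 3-d vectors in place of real trained embeddings, and the divergence rule is an illustrative assumption, not the paper's classifier.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(word, embeddings, k=1):
    """The k nearest neighbours of `word` by cosine similarity."""
    target = embeddings[word]
    scored = [(cosine(target, vec), w)
              for w, vec in embeddings.items() if w != word]
    return [w for _, w in sorted(scored, reverse=True)[:k]]

# Toy vectors standing in for embeddings trained on a general corpus
# versus an extremist-community corpus (all values invented).
general = {"skypes": [0.9, 0.1, 0.0], "calls": [0.85, 0.2, 0.0],
           "video": [0.8, 0.15, 0.1], "slur": [0.0, 0.1, 0.9]}
community = {"skypes": [0.1, 0.1, 0.9], "calls": [0.85, 0.2, 0.0],
             "video": [0.8, 0.15, 0.1], "slur": [0.0, 0.1, 0.95]}

def looks_like_code_word(word):
    """Flag `word` if its nearest neighbour differs between the two spaces."""
    return nearest(word, general) != nearest(word, community)
```

In the toy data, "skypes" sits near "calls" in the general space but near "slur" in the community space, so it is flagged; "video" keeps the same neighbours and is not.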


Author(s): Edward Ombui, Lawrence Muchemi, Peter Wagacha

Presidential campaign periods are a major trigger event for hate speech on social media in almost every country. A systematic review of previous studies indicates inadequate publicly available annotated datasets and hardly any evidence of theoretical underpinning for the annotation schemes used for hate speech identification. This situation stifles the development of empirically useful data for research, especially in supervised machine learning. This paper describes the methodology used to develop a multidimensional hate speech framework based on the components of the duplex theory of hate [1]: distance, passion, commitment to hate, and hate as a story. Subsequently, an annotation scheme based on the framework was used to annotate a random sample of ~51k tweets drawn from the ~400k tweets collected during the August and October 2017 presidential campaign period in Kenya. This resulted in a gold-standard codeswitched dataset that can be used for comparative and empirical studies in supervised machine learning. Classifiers trained on this dataset could provide real-time monitoring of hate speech spikes on social media and inform data-driven decision-making by relevant government security agencies.


2021, Vol. 11 (2), pp. 179-191
Author(s): Bianca Bastos Cordeiro, Marcos Roberto Banhara, Carlos Maurício Cardeal Mendes, Fabiana Danieli, Ariane Laplante-Lévesque, et al.

The Oticon Medical Neuro cochlear implant system includes the modes Opti Omni and Speech Omni, the latter providing beamforming (i.e., directional selectivity) in the high frequencies. Two studies compared sentence identification scores of adult cochlear implant users with Opti Omni and Speech Omni. In Study 1, a double-blind longitudinal crossover study, 12 new users trialed Opti Omni or Speech Omni (random allocation) for three months, and their sentence identification in quiet and in noise (+10 dB signal-to-noise ratio) with the trialed mode was measured. The same procedure was repeated for the second mode. In Study 2, a single-blind study, 11 experienced users performed a speech identification task in quiet and at signal-to-noise ratios ranging from −3 to +18 dB with Opti Omni and Speech Omni. The Study 1 scores in quiet and in noise were significantly better with Speech Omni than with Opti Omni. Study 2 scores were significantly better with Speech Omni than with Opti Omni at +6 and +9 dB signal-to-noise ratios. Beamforming in the high frequencies, as implemented in Speech Omni, leads to improved speech identification in medium levels of background noise, where cochlear implant users spend most of their day.
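Speech-in-noise tests like those in Study 2 are typically built by scaling a noise track so the mixture hits an exact target signal-to-noise ratio. A minimal sketch of that mixing step (not the studies' actual test software) follows.

```python
import math

def rms(samples):
    """Root-mean-square level of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the mixture has the requested signal-to-noise
    ratio in dB, then add it to `speech` sample by sample.

    SNR(dB) = 20 * log10(rms(speech) / rms(scaled_noise)), so the
    required noise gain is rms(speech) / (rms(noise) * 10^(SNR/20)).
    """
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]
```

Sweeping `snr_db` from −3 to +18 in 3 dB steps would reproduce the kind of test conditions described above.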


Author(s): Bezoui Mouaz, Cherif Walid, Beni-Hssane Abderrahim, Elmoutaouakkil Abdelmajid

Arabic dialects differ substantially from Modern Standard Arabic and from each other in terms of phonology, morphology, lexical choice, and syntax. This makes identifying dialects from speech a very difficult task. In this paper, we introduce a speech recognition system that automatically identifies the gender of the speaker, the emphatic letter pronounced, and the diacritic of that emphatic letter, given a sample of the speaker's speech. First, we examined the performance of hidden Markov models (HMM) as a single classifier applied to the samples of our data corpus. Then we evaluated our proposed approach, KNN-DT, a hybridization of two classifiers: decision trees (DT) and k-nearest neighbors (KNN). Both models are applied individually to the data corpus to recognize the emphatic letter of the sound, its diacritic, and the gender of the speaker. This hybridization proved quite interesting: it improved speech recognition accuracy by more than 10% compared to state-of-the-art approaches.
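A classifier hybrid in the spirit of KNN-DT can be sketched as two simple models that are combined by agreement. Everything below is a toy: the one-split "stump" stands in for the decision tree, the feature values and labels are invented, and the disagreement rule (fall back to the single closest training sample) is an assumption, not the paper's method.

```python
def knn_predict(train, query, k=3):
    """Plain k-nearest-neighbours majority vote (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in train
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

def stump_predict(query, feature, threshold, below, above):
    """A one-split decision tree (stump), standing in for the DT model."""
    return below if query[feature] <= threshold else above

def hybrid_predict(train, query):
    """Toy KNN-DT combination: accept agreement, otherwise trust the
    single closest training sample (k=1)."""
    knn = knn_predict(train, query, k=3)
    dt = stump_predict(query, feature=0, threshold=0.5,
                       below="plain", above="emphatic")
    if knn == dt:
        return knn
    return knn_predict(train, query, k=1)
```

In a real system the feature vectors would be acoustic features (e.g., MFCCs) extracted from the speech samples, and the tree would be learned rather than hand-set.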

