Metaheuristic adapted convolutional neural network for Telugu speaker diarization

2021, pp. 1-17
Author(s): Sethuram V, Ande Prasad, R. Rajeswara Rao

Speaker diarization plays a pivotal role in speech technology. It partitions an input audio stream into homogeneous segments according to speaker identity. Diarization improves the readability of automatic transcriptions because it structures the audio stream into speaker turns and often provides the true speaker identity. This work introduces a novel speaker diarization approach with three major phases: feature extraction, speech activity detection (SAD), and speaker segmentation and clustering. First, Mel-frequency cepstral coefficient (MFCC) features are extracted from the collected Telugu-language input audio stream. Next, SAD removes the music and silence signals. The remaining speech signals are then segmented by individual speaker. The segmented signals are passed to the speaker clustering process, which uses an optimized convolutional neural network (CNN). To improve the clustering, the weights and activation function of the CNN are fine-tuned by a new Self Adaptive Sea Lion Algorithm (SA-SLnO). Finally, a comparative analysis is carried out to demonstrate the superiority of the proposed approach. The accuracy of the proposed method is 0.8073, which is reported as 5.255, 2.45%, and 0.075 superior to the existing works.
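The abstract does not say how the SAD phase is implemented. As a rough illustration of what that phase does, the following is a minimal sketch of an energy-threshold detector, a simple stand-in technique and not the authors' method. The function names, the frame length, hop size, and threshold, and the synthetic test signal are all assumptions chosen for the example.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; an incomplete tail is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def detect_activity(samples, frame_len=400, hop=160, threshold=0.01):
    """Return one boolean per frame: True where the energy exceeds the threshold.

    The defaults correspond to 25 ms frames with a 10 ms hop at 16 kHz;
    these values and the threshold are illustrative assumptions.
    """
    return [short_time_energy(f) > threshold
            for f in frame_signal(samples, frame_len, hop)]

def active_runs(flags):
    """Merge consecutive active frames into (start_frame, end_frame) pairs.

    end_frame is exclusive.
    """
    runs, start = [], None
    for i, active in enumerate(flags):
        if active and start is None:
            start = i
        elif not active and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs

# Synthetic input (an assumption for the demo): silence, a 440 Hz tone, silence.
sample_rate = 16000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
signal = silence + tone + silence

flags = detect_activity(signal)
runs = active_runs(flags)
print(runs)
```

An energy threshold separates silence from sound but cannot tell music from speech, so a detector that also removes music, as the abstract describes, would need additional features or a trained classifier.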

2021, Vol 3 (3), pp. 178-193
Author(s): B. Vivekanandam

The invention of the first vaccine also gave rise to anti-vaccination views. Vaccine reluctance may be exacerbated by the growing reliance on social media as a source of health information. During the COVID-19 pandemic, the verification of non-vaccinators by means of biometric characteristics has received greater attention, especially in areas such as vaccination monitoring and other emergency medical services. The traditional digital camera utilizes medium-resolution images for commercial applications in a regulated or contact-based environment with user participation, while the latter uses high-resolution latent palmprints. This study attempts, for the first time, to use convolutional neural networks (CNNs) for contactless recognition. For COVID-19 vaccine identification with the CNN technique, this work uses the contactless palmprint method. Further, the study uses the PalmNet convolutional neural network structure to address the problem. First, the region of interest (ROI) of the palmprint is extracted from the input image based on the geometric shape of the print. After image registration, the ROI is fed into the convolutional neural network as input. The softmax activation function is then used in training the network so that the optimal learning rate and hyperparameters can be chosen for the given learning scenario. Finally, the neural networks of the deep learning platform are compared and summarized.
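The abstract mentions a softmax activation but gives no details of the output layer. As a minimal sketch of what softmax does in a classifier, the code below turns raw class scores into probabilities and computes the cross-entropy loss typically paired with it during training. The scores, the three-class setup, and the function names are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_index])

# Hypothetical scores for three enrolled palmprint identities.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)

# The predicted identity is the class with the highest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
loss_if_class_0 = cross_entropy(probs, 0)
print(predicted, probs, loss_if_class_0)
```

The loss is small when the network assigns high probability to the correct identity and grows as that probability falls, which is the signal a training procedure minimizes.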

