Spoken Language Identification Using Deep Learning

Spoken language identification (SLID) is the task of detecting the language of an audio clip from an unknown speaker, regardless of the speaker's gender, manner of speaking, or age. The central challenge is to find features that distinguish languages clearly and efficiently. The model converts audio files into spectrogram images and applies a convolutional neural network (CNN) to extract the features needed for classification. The objective is to distinguish among English, French, Spanish, German, Estonian, Tamil, Mandarin, Turkish, Chinese, Arabic, Hindi, Indonesian, Portuguese, Japanese, Latin, Dutch, Pushto, Romanian, Korean, Russian, Swedish, Thai, and Urdu. An experiment was conducted on audio files from the Kaggle dataset named spoken language identification. The audio files consist of utterances, each with a fixed duration of 10 seconds. The dataset is split into training and test sets. Preliminary results give an overall accuracy of 98%. More extensive testing shows an overall accuracy of 88%.
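The audio-to-spectrogram step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 16 kHz sample rate, the 400-sample frame, the 160-sample hop, and the synthetic 1 kHz tone standing in for a 10-second utterance are all assumptions chosen for the example.

```python
import numpy as np

FRAME_LEN = 400  # assumed frame length (25 ms at 16 kHz)
HOP = 160        # assumed hop size (10 ms at 16 kHz)

def log_spectrogram(signal, frame_len=FRAME_LEN, hop=HOP):
    """Return a log-magnitude spectrogram of shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # A small constant avoids log(0) on silent bins.
    return np.log(magnitude + 1e-10)

# Synthetic stand-in for a 10-second utterance: a 1 kHz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(10 * sample_rate) / sample_rate
clip = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(clip)
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / FRAME_LEN
print(spec.shape, peak_hz)
```

The resulting 2D array is what would be rendered as an image and passed to a CNN; the network itself is omitted because the abstract does not describe its architecture.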

Deep Learning Based Indian Sign Language Words Identification System

10.3233/apc210272 ◽

2021 ◽

Author(s):

P. Golda Jeyasheeli ◽

N. Indumathi

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Sign Language ◽

Language Identification ◽

Identification System ◽

Test Set ◽

The People ◽

Indian Sign Language ◽

Smart Wearable

About 1 percent of the Indian population is deaf and mute. Deaf and mute people use gestures to interact with each other. Most hearing people do not understand these gestures, which makes interaction with deaf and mute people hard. To help hearing people understand the signs, an automated sign language identification system is proposed. A smart wearable hand device is designed by attaching different sensors to gloves worn while performing the gestures. Each gesture produces unique sensor values, which are collected as Excel data. The characteristics of the movements are extracted and categorized with the aid of a convolutional neural network (CNN). The data from the test set is then identified by the CNN according to this classification. The objective of this system is to bridge the interaction gap between people who are deaf or hard of hearing and the rest of society.

Spoken Language Identification with Deep Convolutional Neural Network and Data Augmentation

2020 28th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu49456.2020.9302425 ◽

2020 ◽

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Data Augmentation ◽

Spoken Language ◽

Language Identification ◽

Deep Convolutional Neural Network

Covid Classification Using Audio Data

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38675 ◽

2021 ◽

Vol 9 (10) ◽

pp. 1633-1637

Author(s):

Adwait Patil

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Binary Classification ◽

Image Data ◽

Audio Classification ◽

Mel Frequency Cepstral Coefficients ◽

Audio Data ◽

Cepstral Coefficients ◽

Audio Files

Abstract: The coronavirus outbreak has affected the entire world adversely. This project was developed to help the general public assess their chances of being COVID positive using only a coughing sound and basic patient data. Audio classification is one of the most interesting applications of deep learning. Like image data, audio data is stored in the form of bits; to understand and analyze it, we use Mel frequency cepstral coefficients (MFCCs), which make it possible to feed the audio to our neural network. We use Coughvid, a crowdsourced dataset consisting of 27000 audio files and metadata for the same number of patients. A 1D Convolutional Neural Network (CNN) processes the audio and metadata. Future scope for this project is a model that rates how likely it is that a person is infected, instead of binary classification. Keywords: Audio classification, Mel frequency cepstral coefficients, Convolutional neural network, deep learning, Coughvid
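To make the MFCC step concrete, here is a rough sketch of how such coefficients are commonly computed: framed power spectrum, triangular mel filterbank, logarithm, then a DCT-II. The frame size, hop, 26 filters, 13 coefficients, and the random-noise input are assumptions for illustration; the abstract does not state the settings used with Coughvid.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            bank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            bank[m - 1, k] = (right - k) / (right - center)
    return bank

def mfcc(signal, sample_rate, n_fft=512, hop=256, n_filters=26, n_coeffs=13):
    """Return MFCCs of shape (n_frames, n_coeffs)."""
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sample_rate).T + 1e-10)
    # Unnormalised DCT-II basis; it decorrelates the log filterbank energies.
    n = np.arange(n_filters)
    basis = np.cos(np.pi / n_filters * (n[None, :] + 0.5) * np.arange(n_coeffs)[:, None])
    return log_mel @ basis.T

# One second of random noise at 16 kHz as a placeholder for a cough recording.
rng = np.random.default_rng(0)
features = mfcc(rng.standard_normal(16000), sample_rate=16000)
print(features.shape)
```

Each row is one time frame, so the output is a sequence of short feature vectors, which is the kind of input a 1D CNN convolves over along the time axis.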

Convolutional Neural Network of Atomic Surface Structures to Predict Binding Energies for High-Throughput Screening of Catalysts

10.26434/chemrxiv.8150666.v1 ◽

2019 ◽

Author(s):

Seoin Back ◽

Junwoong Yoon ◽

Nianhan Tian ◽

Wen Zhong ◽

Kevin Tran ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

High Throughput ◽

High Throughput Screening ◽

Binding Energies ◽

Surface Structures ◽

Voronoi Polyhedra ◽

Atomic Surface

We present an application of deep-learning convolutional neural network of atomic surface structures using atomic and Voronoi polyhedra-based neighbor information to predict adsorbate binding energies for the application in catalysis.

A Review for Investigation Studies That are Done for Improving Image Processing Classification Based on Convolutional Neural Network (CNN) That is Architectural of Deep Learning

International Congress on Human-Computer Interaction, Optimization and Robotic Applications Proceedings ◽

10.36287/setsci.4.5.007 ◽

2019 ◽

Author(s):

Mustafa Tüfekçi ◽

Fatih Karpat

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network

Effectiveness of transfer learning for enhancing tumor classification with a convolutional neural network on frozen sections

Scientific Reports ◽

10.1038/s41598-020-78129-0 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Young-Gon Kim ◽

Sungchul Kim ◽

Cristina Eunbee Cho ◽

In Hye Song ◽

Hee Jin Lee ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Transfer Learning ◽

Frozen Section ◽

Medical Center ◽

External Validation ◽

Model Performance ◽

Classification Model ◽

Training Dataset

Abstract: Fast and accurate confirmation of metastasis on the frozen tissue section of intraoperative sentinel lymph node biopsy is an essential tool for critical surgical decisions. However, accurate diagnosis by pathologists is difficult within the time limitations. Training a robust and accurate deep learning model is also difficult owing to the limited number of frozen datasets with high quality labels. To overcome these issues, we validated the effectiveness of transfer learning from CAMELYON16 to improve performance of the convolutional neural network (CNN)-based classification model on our frozen dataset (N = 297) from Asan Medical Center (AMC). Among the 297 whole slide images (WSIs), 157 and 40 WSIs were used to train deep learning models with different dataset ratios at 2, 4, 8, 20, 40, and 100%. The remaining 100 WSIs were used to validate model performance in terms of patch- and slide-level classification. An additional 228 WSIs from Seoul National University Bundang Hospital (SNUBH) were used as an external validation. Three initial weights, i.e., scratch-based (random initialization), ImageNet-based, and CAMELYON16-based models, were used to validate their effectiveness in external validation. In the patch-level classification results on the AMC dataset, CAMELYON16-based models trained with a small dataset (up to 40%, i.e., 62 WSIs) showed a significantly higher area under the curve (AUC) of 0.929 than those of the scratch- and ImageNet-based models at 0.897 and 0.919, respectively, while CAMELYON16-based and ImageNet-based models trained with 100% of the training dataset showed comparable AUCs at 0.944 and 0.943, respectively. For the external validation, CAMELYON16-based models showed higher AUCs than those of the scratch- and ImageNet-based models. The feasibility of transfer learning to enhance model performance was validated in the case of frozen section datasets with limited numbers.
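The models in this abstract are compared by area under the ROC curve (AUC). As a reminder of what that number measures, the sketch below computes AUC directly from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The labels and scores are made-up toy values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy patch-level predictions (hypothetical): 1 = metastasis, 0 = normal.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; for large patch sets a rank-based implementation would be the usual choice.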

Matching Large Baseline Oblique Stereo Images Using an End-to-End Convolutional Neural Network

Remote Sensing ◽

10.3390/rs13020274 ◽

2021 ◽

Vol 13 (2) ◽

pp. 274

Author(s):

Guobiao Yao ◽

Alper Yilmaz ◽

Li Zhang ◽

Fei Meng ◽

Haibin Ai ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Least Square ◽

Affine Invariant ◽

Stereo Images ◽

Distance Ratio ◽

Matching Algorithm ◽

End To End

The available stereo matching algorithms produce a large number of false positive matches or only a few true positives across oblique stereo images with a large baseline. This undesired result is due to the complex perspective deformation and radiometric distortion across the images. To address this problem, we propose a novel affine invariant feature matching algorithm with subpixel accuracy based on an end-to-end convolutional neural network (CNN). In our method, we adopt and modify a Hessian affine network, which we refer to as IHesAffNet, to obtain affine invariant Hessian regions using a deep learning framework. To improve the correlation between corresponding features, we introduce an empirical weighted loss function (EWLF) based on the negative samples using K nearest neighbors, and then generate highly discriminative deep learning-based descriptors with our multiple hard network structure (MTHardNets). Following this step, the conjugate features are produced by using the Euclidean distance ratio as the matching metric, and the accuracy of the matches is optimized through the deep learning transform based least square matching (DLT-LSM). Finally, experiments on large baseline oblique stereo images acquired by ground close-range and unmanned aerial vehicle (UAV) platforms verify the effectiveness of the proposed approach, and comprehensive comparisons demonstrate that our matching algorithm outperforms the state-of-the-art methods in terms of accuracy, distribution and correct ratio. The main contributions of this article are: (i) our proposed MTHardNets can generate high quality descriptors; and (ii) the IHesAffNet can produce substantial affine invariant corresponding features with reliable transform parameters.
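The Euclidean distance ratio mentioned above as the matching metric can be illustrated with a generic nearest-neighbour ratio test. This is a sketch of the general technique only, not the paper's implementation: the 0.8 threshold, the function name, and the 2D toy descriptors are assumptions.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, max_ratio=0.8):
    """Return (i, j) pairs where desc_b[j] is the nearest neighbour of desc_a[i]
    and is clearly closer than the second-nearest neighbour.
    desc_b must contain at least two descriptors."""
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        nearest, second = np.argsort(row)[:2]
        # Accept only unambiguous matches: nearest distance well below the runner-up.
        if row[nearest] < max_ratio * row[second]:
            matches.append((i, int(nearest)))
    return matches

# Toy descriptors (hypothetical). The third query is equidistant from two
# candidates, so the ratio test rejects it as ambiguous.
desc_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
desc_b = np.array([[0.9, 0.1], [0.1, 0.9], [1.0, 1.0], [0.0, 0.0]])
matches = ratio_test_matches(desc_a, desc_b)
print(matches)
```

Lowering `max_ratio` trades recall for precision: fewer matches survive, but those that do are less likely to be false positives.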

Research on optimization of deep learning algorithm based on convolutional neural network

Journal of Physics Conference Series ◽

10.1088/1742-6596/1848/1/012038 ◽

2021 ◽

Vol 1848 (1) ◽

pp. 012038

Author(s):

Luo Yiyue ◽

Fan Yu ◽

Chen Xianjun

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Learning Algorithm ◽

Deep Learning Algorithm

Deep Learning Algorithm Trained with COVID-19 Pneumonia Also Identifies Immune Checkpoint Inhibitor Therapy-Related Pneumonitis

Cancers ◽

10.3390/cancers13040652 ◽

2021 ◽

Vol 13 (4) ◽

pp. 652 ◽

Cited By ~ 2

Author(s):

Carlo Augusto Mallio ◽

Andrea Napolitano ◽

Gennaro Castiello ◽

Francesco Maria Giordano ◽

Pasquale D'Alessio ◽

...

Keyword(s):

Neural Network ◽

Computed Tomography ◽

Deep Learning ◽

Convolutional Neural Network ◽

Immune Checkpoint ◽

Immune Checkpoint Inhibitor ◽

Learning Algorithm ◽

Checkpoint Inhibitor ◽

Deep Convolutional Neural Network ◽

Deep Learning Algorithm

Background: Coronavirus disease 2019 (COVID-19) pneumonia and immune checkpoint inhibitor (ICI) therapy-related pneumonitis share common features. The aim of this study was to determine on chest computed tomography (CT) images whether a deep convolutional neural network algorithm is able to solve the challenge of differential diagnosis between COVID-19 pneumonia and ICI therapy-related pneumonitis. Methods: We enrolled three groups: a pneumonia-free group (n = 30), a COVID-19 group (n = 34), and a group of patients with ICI therapy-related pneumonitis (n = 21). Computed tomography images were analyzed with an artificial intelligence (AI) algorithm based on a deep convolutional neural network structure. Statistical analysis included the Mann–Whitney U test (significance threshold at p < 0.05) and the receiver operating characteristic curve (ROC curve). Results: The algorithm showed low specificity in distinguishing COVID-19 from ICI therapy-related pneumonitis (sensitivity 97.1%, specificity 14.3%, area under the curve (AUC) = 0.62). ICI therapy-related pneumonitis was identified by the AI when compared to pneumonia-free controls (sensitivity = 85.7%, specificity 100%, AUC = 0.97). Conclusions: The deep learning algorithm is not able to distinguish between COVID-19 pneumonia and ICI therapy-related pneumonitis. Awareness must be increased among clinicians about imaging similarities between COVID-19 and ICI therapy-related pneumonitis. ICI therapy-related pneumonitis can be applied as a challenge population for cross-validation to test the robustness of AI models used to analyze interstitial pneumonias of variable etiology.
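The sensitivity and specificity figures in this abstract follow from simple ratios of confusion-matrix counts. The sketch below reproduces them under an explicit assumption: the per-cell counts are not given in the abstract and are back-computed here from the reported group sizes (34, 21, 30) and percentages, so they should be read as one consistent reconstruction rather than reported data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that the classifier flags as positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of actual negatives that the classifier flags as negative."""
    return true_neg / (true_neg + false_pos)

# ASSUMED counts, back-computed from group sizes and reported percentages.
# COVID-19 (n = 34, positive class) vs ICI pneumonitis (n = 21, negative class).
covid_vs_ici = {
    "sensitivity": sensitivity(true_pos=33, false_neg=1),   # 33 / 34
    "specificity": specificity(true_neg=3, false_pos=18),   # 3 / 21
}
# ICI pneumonitis (n = 21, positive class) vs pneumonia-free controls (n = 30).
ici_vs_controls = {
    "sensitivity": sensitivity(true_pos=18, false_neg=3),   # 18 / 21
    "specificity": specificity(true_neg=30, false_pos=0),   # 30 / 30
}

for name, result in [("COVID vs ICI", covid_vs_ici), ("ICI vs controls", ici_vs_controls)]:
    print(name, {k: round(100 * v, 1) for k, v in result.items()})
```

Under these counts, a specificity of 14.3% means 18 of the 21 ICI pneumonitis cases would have been labeled as COVID-19, which is the basis for the conclusion that the algorithm cannot separate the two conditions.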

Multiclass Spoken Language Identification for Indian Languages using Deep Learning

2020 IEEE Bombay Section Signature Conference (IBSSC) ◽

10.1109/ibssc51096.2020.9332161 ◽

2020 ◽

Author(s):

Lakshmana Rao Arla ◽

Sridevi Bonthu ◽

Abhinav Dayal

Keyword(s):

Deep Learning ◽

Spoken Language ◽

Language Identification ◽

Indian Languages
