Grapheme-to-Phoneme Conversion with Convolutional Neural Networks

2019 ◽  
Vol 9 (6) ◽  
pp. 1143 ◽  
Author(s):  
Sevinj Yolchuyeva ◽  
Géza Németh ◽  
Bálint Gyires-Tóth

Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciations for words based on their written form. It plays an essential role in natural language processing, text-to-speech synthesis and automatic speech recognition systems. In this paper, we investigate convolutional neural networks (CNNs) for G2P conversion. We propose a novel CNN-based sequence-to-sequence (seq2seq) architecture for G2P conversion. Our approach includes an end-to-end CNN G2P conversion with residual connections and, furthermore, a model that uses a convolutional neural network (with and without residual connections) as an encoder and a Bi-LSTM as a decoder. We compare our approach with state-of-the-art methods, including Encoder-Decoder LSTM and Encoder-Decoder Bi-LSTM. Training and inference times, as well as phoneme and word error rates, were evaluated on the public CMUDict dataset for US English, and the best-performing convolutional neural network-based architecture was also evaluated on the NetTalk dataset. Our method approaches the accuracy of previous state-of-the-art results in terms of phoneme error rate.
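
The phoneme and word error rates mentioned in this abstract are commonly computed from the edit distance between predicted and reference phoneme sequences. The following is a minimal sketch of that common definition, not the authors' evaluation script. It assumes one reference pronunciation per word (CMUDict can list several), and the function names and the two example words are illustrative.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(refs: List[Sequence[str]],
                hyps: List[Sequence[str]]) -> Tuple[float, float]:
    """Return (PER, WER) as fractions over paired references and hypotheses."""
    total_phonemes = sum(len(r) for r in refs)
    phoneme_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    # A word counts as wrong if its predicted pronunciation differs at all.
    word_errors = sum(1 for r, h in zip(refs, hyps) if list(r) != list(h))
    return phoneme_errors / total_phonemes, word_errors / len(refs)


# Illustrative data: the second word has one substituted phoneme.
refs = [["HH", "AH", "L", "OW"], ["W", "ER", "L", "D"]]
hyps = [["HH", "AH", "L", "OW"], ["W", "AO", "L", "D"]]
per, wer = error_rates(refs, hyps)
print(f"PER={per:.3f} WER={wer:.3f}")
```

With one wrong phoneme out of eight and one wrong word out of two, the sketch reports a PER of 0.125 and a WER of 0.5.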

Mathematics ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. 189
Author(s):  
Feng Liu ◽  
Xuan Zhou ◽  
Xuehu Yan ◽  
Yuliang Lu ◽  
Shudong Wang

Steganalysis is a method to detect whether objects contain secret messages. With the popularity of deep learning, steganalytic schemes based on convolutional neural networks (CNNs) have become the chief method of combating steganography in recent years. However, the diversity of filters has not been fully utilized in current research. This paper constructs a new, effective network with diverse filter modules (DFMs) and squeeze-and-excitation modules (SEMs), which can better capture the embedding artifacts. As the essential parts, the DFMs combine convolution filters at three different scales to process information diversely, and the SEMs enhance the effective channels output by the DFMs. The experiments show that our CNN is effective against content-adaptive steganographic schemes with different payloads, such as the S-UNIWARD and WOW algorithms. Moreover, several state-of-the-art methods are compared with our approach to demonstrate its performance.
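
For orientation, a squeeze-and-excitation block in its commonly used form can be written as below. This is the generic formulation and is only assumed to resemble the SEMs in this paper. The symbols are not taken from the abstract: $x_c$ is channel $c$ of an $H \times W$ feature map, $W_1$ and $W_2$ are learned weight matrices, $\delta$ is ReLU and $\sigma$ is the sigmoid.

```latex
\begin{aligned}
z_c &= \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} x_c(i, j)
      && \text{(squeeze: global average pooling)} \\
s   &= \sigma\bigl(W_2\, \delta(W_1 z)\bigr)
      && \text{(excitation: per-channel gates in } (0, 1)\text{)} \\
\tilde{x}_c &= s_c \, x_c
      && \text{(rescale each channel by its gate)}
\end{aligned}
```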


2017 ◽  
Vol 17 (5) ◽  
pp. 1110-1128 ◽  
Author(s):  
Deegan J Atha ◽  
Mohammad R Jahanshahi

Corrosion is a major defect in structural systems that has a significant economic impact and can pose safety risks if left untended. Currently, an inspector visually assesses the condition of a structure to identify corrosion. This approach is time-consuming, tedious, and subjective. Robotic systems, such as unmanned aerial vehicles, paired with computer vision algorithms have the potential to perform autonomous damage detection that can significantly decrease inspection time and lead to more frequent and objective inspections. This study evaluates the use of convolutional neural networks for corrosion detection. A convolutional neural network learns the appropriate classification features that were hand-engineered in traditional algorithms. Eliminating the dependence on prior knowledge and human effort in designing features is a major advantage of convolutional neural networks. This article presents different convolutional neural network-based approaches for corrosion assessment on metallic surfaces. The effects of different color spaces, sliding window sizes, and convolutional neural network architectures are discussed. To this end, the performance of two pretrained state-of-the-art convolutional neural network architectures as well as two proposed convolutional neural network architectures is evaluated, and it is shown that convolutional neural networks outperform state-of-the-art vision-based corrosion detection approaches that are based on texture and color analysis using a simple multilayered perceptron network. Furthermore, it is shown that one of the proposed convolutional neural networks significantly reduces computational time compared with state-of-the-art pretrained convolutional neural networks while maintaining comparable performance for corrosion detection.
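
The sliding-window idea mentioned in this abstract can be sketched as follows: the image is cut into overlapping square patches, and each patch would then be passed to a classifier. The image size, window size, and stride below are hypothetical values chosen for illustration and are not taken from the study.

```python
from typing import Iterator, Tuple


def sliding_windows(height: int, width: int,
                    window: int, stride: int) -> Iterator[Tuple[int, int]]:
    """Yield (top, left) corners of square windows that fit fully in the image."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield top, left


# Hypothetical 256x384 image, 128-pixel windows, 50% overlap.
corners = list(sliding_windows(height=256, width=384, window=128, stride=64))
print(len(corners), corners[0], corners[-1])
```

A smaller window localizes corrosion more finely but gives the classifier less context per patch, which is one reason the window size is worth studying.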


Mathematics ◽  
2020 ◽  
Vol 8 (6) ◽  
pp. 936 ◽  
Author(s):  
Nebojsa Bacanin ◽  
Timea Bezdan ◽  
Eva Tuba ◽  
Ivana Strumberger ◽  
Milan Tuba

Convolutional neural networks have a broad spectrum of practical applications in computer vision. Currently, much of the data comes from images, and it is crucial to have an efficient technique for processing these large amounts of data. Convolutional neural networks have proven to be very successful in tackling image processing tasks. However, the design of a network structure for a given problem entails fine-tuning of the hyperparameters in order to achieve better accuracy. This process takes much time and requires effort and domain expertise. Designing a convolutional neural network architecture represents a typical NP-hard optimization problem, and some frameworks for generating network structures for specific image classification tasks have been proposed. To address this issue, in this paper, we propose the hybridized monarch butterfly optimization algorithm. Based on the observed deficiencies of the original monarch butterfly optimization approach, we performed hybridization with two other state-of-the-art swarm intelligence algorithms. The proposed hybrid algorithm was first tested on a set of standard unconstrained benchmark instances and was later adapted for a convolutional neural network design problem. Comparative analysis with other state-of-the-art methods and algorithms, as well as with the original monarch butterfly optimization implementation, was performed for both groups of simulations. Experimental results showed that our proposed method obtained higher classification accuracy than other approaches whose results were published in the recent computer science literature.


2021 ◽  
Author(s):  
Richardson Santiago Teles Menezes ◽  
Angelo Marcelino Cordeiro ◽  
Rafael Magalhães ◽  
Helton Maia

In this paper, state-of-the-art architectures of Convolutional Neural Networks (CNNs) are explained and compared on the task of authorship classification of famous paintings. The chosen CNN architectures were VGG-16, VGG-19, Residual Neural Networks (ResNet), and Xception. The dataset used is available on the Kaggle website under the title “Best Artworks of All Time”. Weighted classes were created for each artist with more than 200 paintings in the dataset in order to represent and classify each artist’s style. The experiments resulted in an accuracy of up to 95% for the Xception architecture with an average F1-score of 0.87, and 92% accuracy with an average F1-score of 0.83 for ResNet in its 50-layer configuration, while neither of the VGG architectures produced satisfactory results for the same number of epochs, achieving at most 60% accuracy.


Author(s):  
Tushar Goyal

Image recognition plays a foundational role in the field of computer vision, and there has been extensive research to develop state-of-the-art techniques, especially using Convolutional Neural Networks (CNNs). This paper studies several CNNs, heavily inspired by highly popular state-of-the-art CNNs and designed from scratch specifically for the CIFAR-10 dataset, and presents a fair comparison between them.


Author(s):  
Sachin B. Jadhav

Plant pathologists desire soft computing technology for accurate and reliable diagnosis of plant diseases. In this study, we propose an efficient soybean disease identification method based on a transfer learning approach using pre-trained convolutional neural networks (CNNs) such as AlexNet, GoogleNet, VGG16, ResNet101, and DenseNet201. The proposed convolutional neural networks were trained using 1,200 PlantVillage images of diseased and healthy soybean leaves to distinguish three soybean diseases from healthy leaves. Pre-trained CNNs were used to enable a fast and easy system implementation in practice. We used a five-fold cross-validation strategy to analyze the performance of the networks. In this study, we used the pre-trained convolutional neural networks as feature extractors and classifiers. With the proposed approach, the pre-trained AlexNet, GoogleNet, VGG16, ResNet101, and DenseNet201 networks achieve accuracies of 95%, 96.4%, 96.4%, 92.1%, and 93.6%, respectively. The experimental results for the identification of soybean diseases indicated that the proposed network models achieve the highest accuracy.
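
The five-fold cross-validation strategy named in this abstract can be sketched as below: the 1,200 images are partitioned into five folds, and each fold serves once as the validation set while the other four are used for training. This is a plain (non-stratified) split written with the standard library only. Whether the study stratified by class is not stated, and the function name and seed are illustrative.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 5,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # fixed seed for repeatability
    folds = [indices[i::k] for i in range(k)]   # k near-equal disjoint folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


splits = list(k_fold_indices(1200, k=5))
for fold_number, (train, validation) in enumerate(splits, start=1):
    print(fold_number, len(train), len(validation))
```

Each fold here holds 240 validation images and 960 training images. The reported accuracy would typically be the average over the five validation folds.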


2021 ◽  
Author(s):  
Shima Baniadamdizaj ◽  
Mohammadreza Soheili ◽  
Azadeh Mansouri

Today, integrating information from digital and paper documents is very important for efficient knowledge management. This requires the document to be localized in the image. Several methods have been proposed to solve this problem; however, they are based on traditional image processing techniques that are not robust to extreme viewpoints and backgrounds. Deep Convolutional Neural Networks (CNNs), on the other hand, have proven to be extremely robust to variations in background and viewing angle for object detection and classification tasks. We propose a new use of Neural Networks (NNs) for the document localization problem. The proposed method can even localize documents that do not have a perfectly rectangular shape. We also used a newly collected dataset that contains more challenging tasks and is closer to what a careless user would produce. The results are reported for three different categories of images, and our proposed method achieves 83% on average. The results are compared with the most popular document localization methods and mobile applications.


2018 ◽  
Vol 7 (3.1) ◽  
pp. 13
Author(s):  
Raveendra K ◽  
R Vinoth Kanna

Automatic logo-based document image retrieval is an essential and widely used method in feature extraction applications. In this paper, the architecture of the Convolutional Neural Network (CNN) is explained in detail with pictorial representations so that the complex CNN process can be understood in a simplified way. The main objective of this paper is to use the CNN effectively in automatic logo-based document image retrieval.


2021 ◽  
Vol 2089 (1) ◽  
pp. 012013
Author(s):  
Priyadarshini Chatterjee ◽  
Dutta Sushama Rani

Automated diagnosis of diseases has gained many advantages and much potential in recent years. In particular, automated screening of cancers has helped clinicians over time. A clinician's diagnosis is sometimes biased, and automated detection can help them reach a proper conclusion. Automated screening is implemented using either artificial neural networks or convolutional neural networks. Because artificial neural networks are slow in computation, convolutional neural networks have gained considerable importance in recent years. It is also seen that convolutional neural network architectures require a smaller number of datasets, which gives them an edge over artificial neural networks. Convolutional neural networks are used for both segmentation and classification. Image segmentation is one of the important steps in the model used for any kind of image analysis. This paper surveys various convolutional neural networks that are used for medical image analysis.


2021 ◽  
Vol 5 (2) ◽  
pp. 312-318
Author(s):  
Rima Dias Ramadhani ◽  
Afandi Nur Aziz Thohari ◽  
Condro Kartiko ◽  
Apri Junaidi ◽  
Tri Ginanjar Laksana ◽  
...  

Waste consists of goods or materials that have no value in the scope of production; in some cases it is disposed of carelessly and can damage the environment. In 2019 the Indonesian government recorded 66-67 million tons of waste, up from 64 million tons the previous year. Waste is differentiated by type into organic and inorganic waste. In computer science, the type of waste can be detected using a camera and the Convolutional Neural Network (CNN) method, a type of neural network that takes images as input. The input is used to train the CNN architecture so that it produces output that recognizes the object in the input image. This study optimizes the use of the CNN method to obtain accurate results in identifying types of waste. Optimization is done by adding several hyperparameters to the CNN architecture. With the added hyperparameters, the accuracy is 91.2%; without them, the accuracy is only 67.6%. Three hyperparameters are used to increase the accuracy of the model: dropout, padding, and stride. Dropout is increased by 20% to reduce overfitting during training, whereas padding and stride are used to speed up the model training process.
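
How padding and stride change the size of a convolution layer's output, and therefore the amount of computation in later layers, can be illustrated with the standard output-size formula floor((n + 2p - k) / s) + 1. The sketch below is generic. The 224-pixel input and 3-pixel kernel are hypothetical values and are not taken from the study.

```python
def conv_output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one spatial dimension of a convolution layer."""
    return (n + 2 * padding - kernel) // stride + 1


# Hypothetical 224-pixel input with a 3x3 kernel.
no_padding = conv_output_size(224, kernel=3)                       # shrinks by 2
same_padding = conv_output_size(224, kernel=3, padding=1)          # size kept
strided = conv_output_size(224, kernel=3, stride=2, padding=1)     # size halved
print(no_padding, same_padding, strided)
```

A stride of 2 halves each spatial dimension, so the next layer processes roughly a quarter as many positions, which is consistent with the abstract's remark that stride speeds up training.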

