Neural Arabic Text Diacritization: State-of-the-Art Results and a Novel Approach for Arabic NLP Downstream Tasks

Author(s):  
Ali Fadel ◽  
Ibraheem Tuffaha ◽  
Mahmoud Al-Ayyoub

In this work, we present several deep learning models for the automatic diacritization of Arabic text. Our models are built using two main approaches, viz. Feed-Forward Neural Network (FFNN) and Recurrent Neural Network (RNN), with several enhancements such as 100-hot encoding, embeddings, Conditional Random Field (CRF), and Block-Normalized Gradient (BNG). The models are tested on the only freely available benchmark dataset and the results show that our models are either better or on par with other models even those requiring human-crafted language-dependent post-processing steps, unlike ours. Moreover, we show how diacritics in Arabic can be used to enhance the models of downstream NLP tasks such as Machine Translation (MT) and Sentiment Analysis (SA) by proposing novel Translation over Diacritization (ToD) and Sentiment over Diacritization (SoD) approaches.

2020 ◽  
Vol 34 (03) ◽  
pp. 2594-2601
Author(s):  
Arjun Akula ◽  
Shuai Wang ◽  
Song-Chun Zhu

We present CoCoX (short for Conceptual and Counterfactual Explanations), a model for explaining decisions made by a deep convolutional neural network (CNN). In Cognitive Psychology, the factors (or semantic-level features) that humans zoom in on when they imagine an alternative to a model prediction are often referred to as fault-lines. Motivated by this, our CoCoX model explains decisions made by a CNN using fault-lines. Specifically, given an input image I for which a CNN classification model M predicts class cpred, our fault-line based explanation identifies the minimal semantic-level features (e.g., stripes on zebra, pointed ears of dog), referred to as explainable concepts, that need to be added to or deleted from I in order to alter the classification category of I by M to another specified class calt. We argue that, due to the conceptual and counterfactual nature of fault-lines, our CoCoX explanations are practical and more natural for both expert and non-expert users to understand the internal workings of complex deep learning models. Extensive quantitative and qualitative experiments verify our hypotheses, showing that CoCoX significantly outperforms the state-of-the-art explainable AI models. Our implementation is available at https://github.com/arjunakula/CoCoX


2021 ◽  
Author(s):  
Noor Ahmad ◽  
Muhammad Aminu ◽  
Mohd Halim Mohd Noor

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19 and transfer learning is the most common approach. However, majority of the pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning the models on Covid-19 images which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet which requires a relatively smaller number of parameters. CovidNet accepts grayscale images as inputs and is suitable for training with limited training dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.


Sensors ◽  
2019 ◽  
Vol 19 (5) ◽  
pp. 1227 ◽  
Author(s):  
Jincheol Lee ◽  
Seungbin Roh ◽  
Johyun Shin ◽  
Keemin Sohn

Space mean speed cannot be directly measured in the field, although it is a basic parameter that is used to evaluate traffic conditions. An end-to-end convolutional neural network (CNN) was adopted to measure the space mean speed based solely on two consecutive road images. However, tagging images with labels (=true space mean speeds) by manually positioning and tracking every vehicle on road images is a formidable task. The present study was focused on naïve animation images provided by a traffic simulator, because these contain perfect information concerning vehicle movement to attain labels. The animation images, however, seem far-removed from actual photos taken in the field. A cycle-consistent adversarial network (CycleGAN) bridged the reality gap by mapping the animation images into seemingly realistic images that could not be distinguished from real photos. A CNN model trained on the synthesized images was tested on real photos that had been manually labeled. The test performance was comparable to those of state-of-the-art motion-capture technologies. The proposed method showed that deep-learning models to measure the space mean speed could be trained without the need for time-consuming manual annotation.


Author(s):  
Yasir Hussain ◽  
Zhiqiu Huang ◽  
Yu Zhou ◽  
Senzhang Wang

In recent years, deep learning models have shown great potential in source code modeling and analysis. Generally, deep learning-based approaches are problem-specific and data-hungry. A challenging issue of these approaches is that they require training from scratch for a different related problem. In this work, we propose a transfer learning-based approach that significantly improves the performance of deep learning-based source code models. In contrast to traditional learning paradigms, transfer learning can transfer the knowledge learned in solving one problem into another related problem. First, we present two recurrent neural network-based models RNN and GRU for the purpose of transfer learning in the domain of source code modeling. Next, via transfer learning, these pre-trained (RNN and GRU) models are used as feature extractors. Then, these extracted features are combined into attention learner for different downstream tasks. The attention learner leverages from the learned knowledge of pre-trained models and fine-tunes them for a specific downstream task. We evaluate the performance of the proposed approach with extensive experiments with the source code suggestion task. The results indicate that the proposed approach outperforms the state-of-the-art models in terms of accuracy, precision, recall and F-measure without training the models from scratch.


2019 ◽  
Vol 9 (11) ◽  
pp. 2347 ◽  
Author(s):  
Hannah Kim ◽  
Young-Seob Jeong

As the number of textual data is exponentially increasing, it becomes more important to develop models to analyze the text data automatically. The texts may contain various labels such as gender, age, country, sentiment, and so forth. Using such labels may bring benefits to some industrial fields, so many studies of text classification have appeared. Recently, the Convolutional Neural Network (CNN) has been adopted for the task of text classification and has shown quite successful results. In this paper, we propose convolutional neural networks for the task of sentiment classification. Through experiments with three well-known datasets, we show that employing consecutive convolutional layers is effective for relatively longer texts, and our networks are better than other state-of-the-art deep learning models.


2004 ◽  
Vol 14 (05) ◽  
pp. 329-335 ◽  
Author(s):  
LIANG TIAN ◽  
AFZEL NOORE

A support vector machine (SVM) modeling approach for short-term load forecasting is proposed. The SVM learning scheme is applied to the power load data, forcing the network to learn the inherent internal temporal property of power load sequence. We also study the performance when other related input variables such as temperature and humidity are considered. The performance of our proposed SVM modeling approach has been tested and compared with feed-forward neural network and cosine radial basis function neural network approaches. Numerical results show that the SVM approach yields better generalization capability and lower prediction error compared to those neural network approaches.


2021 ◽  
Author(s):  
Noor Ahmad ◽  
Muhammad Aminu ◽  
Mohd Halim Mohd Noor

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19 and transfer learning is the most common approach. However, majority of the pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning the models on Covid-19 images which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet which requires a relatively smaller number of parameters. CovidNet accepts grayscale images as inputs and is suitable for training with limited training dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.


Author(s):  
Yao Ni ◽  
Dandan Song ◽  
Xi Zhang ◽  
Hao Wu ◽  
Lejian Liao

Generative adversarial networks (GANs) have shown impressive results, however, the generator and the discriminator are optimized in finite parameter space which means their performance still need to be improved. In this paper, we propose a novel approach of adversarial training between one generator and an exponential number of critics which are sampled from the original discriminative neural network via dropout. As discrepancy between outputs of different sub-networks of a same sample can measure the consistency of these critics, we encourage the critics to be consistent to real samples and inconsistent to generated samples during training, while the generator is trained to generate consistent samples for different critics. Experimental results demonstrate that our method can obtain state-of-the-art Inception scores of 9.17 and 10.02 on supervised CIFAR-10 and unsupervised STL-10 image generation tasks, respectively, as well as achieve competitive semi-supervised classification results on several benchmarks. Importantly, we demonstrate that our method can maintain stability in training and alleviate mode collapse.


2021 ◽  
Vol 2042 (1) ◽  
pp. 012002
Author(s):  
Roberto Castello ◽  
Alina Walch ◽  
Raphaël Attias ◽  
Riccardo Cadei ◽  
Shasha Jiang ◽  
...  

Abstract The integration of solar technology in the built environment is realized mainly through rooftop-installed panels. In this paper, we leverage state-of-the-art Machine Learning and computer vision techniques applied on overhead images to provide a geo-localization of the available rooftop surfaces for solar panel installation. We further exploit a 3D building database to associate them to the corresponding roof geometries by means of a geospatial post-processing approach. The stand-alone Convolutional Neural Network used to segment suitable rooftop areas reaches an intersection over union of 64% and an accuracy of 93%, while a post-processing step using building database improves the rejection of false positives. The model is applied to a case study area in the canton of Geneva and the results are compared with another recent method used in the literature to derive the realistic available area.


ScienceAsia ◽  
2012 ◽  
Vol 38 (4) ◽  
pp. 386 ◽  
Author(s):  
Kittipong Ekkachai ◽  
Kanokvate Tungpimolrut ◽  
Itthisek Nilkhamhang

Sign in / Sign up

Export Citation Format

Share Document