Darknet on OpenCL: A Multi-platform Tool for Object Detection and Classification

Author(s):  
Piotr Sowa ◽  
Jacek Izydorczyk

The article's goal is to overview the challenges and problems on the way from state-of-the-art CUDA-accelerated neural network code to multi-platform GPU code. For this purpose, the authors describe the journey of porting the fully featured, CUDA-accelerated Darknet engine available on GitHub to OpenCL. The article presents lessons learned and the techniques that were put in place to make this port happen. There are a few other implementations on GitHub that leverage the OpenCL standard, and a few of them have also tried to port Darknet. Darknet is a well-known convolutional neural network (CNN) framework. The authors of this article investigated all aspects of the porting and achieved a fully featured Darknet engine on OpenCL. The effort was focused not only on classification with the YOLO1, YOLO2, and YOLO3 CNN models; it also covered other aspects, such as training neural networks and benchmarking to look for weak points in the implementation. The GPU computing code substantially improves Darknet's computation time compared to the standard CPU version by using otherwise underused hardware in existing systems. Because the port is OpenCL-based, it is practically hardware independent. In this article, the authors compare computation and training performance against the existing CUDA-based Darknet engine on various computers, including single-board computers, and across different CNN use cases. The authors found that the OpenCL version can compute as fast as the CUDA version but is slower in memory transfers between RAM (CPU memory) and VRAM (GPU memory); this depends only on the quality of the OpenCL implementation. Moreover, the looser hardware requirements of the OpenCL Darknet can broaden the applications of DNNs, especially in energy-sensitive applications of Artificial Intelligence (AI) and Machine Learning (ML).
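To illustrate where such a RAM-to-VRAM bottleneck is measured, here is a minimal, hedged sketch (not taken from the ported Darknet code) that times host-to-device transfer, kernel execution, and device-to-host transfer with pyopencl; the kernel, buffer size, and timing scheme are illustrative assumptions.

```python
# Minimal sketch: separate timing of upload, kernel, and download with pyopencl,
# the kind of breakdown used to locate a RAM<->VRAM transfer bottleneck.
import time
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags

src = """
__kernel void scale(__global float *x, const float a) {
    int i = get_global_id(0);
    x[i] = a * x[i];
}
"""
prog = cl.Program(ctx, src).build()

host = np.random.rand(1 << 24).astype(np.float32)       # ~64 MB of data
buf = cl.Buffer(ctx, mf.READ_WRITE, size=host.nbytes)   # device buffer (VRAM)

t0 = time.perf_counter()
cl.enqueue_copy(queue, buf, host); queue.finish()        # RAM -> VRAM upload
t1 = time.perf_counter()
prog.scale(queue, host.shape, None, buf, np.float32(0.5)); queue.finish()  # compute
t2 = time.perf_counter()
cl.enqueue_copy(queue, host, buf); queue.finish()        # VRAM -> RAM download
t3 = time.perf_counter()

print(f"upload {t1-t0:.4f}s  kernel {t2-t1:.4f}s  download {t3-t2:.4f}s")
```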

Author(s):  
Yun-Peng Liu ◽  
Ning Xu ◽  
Yu Zhang ◽  
Xin Geng

The performance of deep neural networks (DNNs) crucially relies on the quality of labeling. In some situations, labels are easily corrupted and therefore become noisy. Thus, designing algorithms that deal with noisy labels is of great importance for learning robust DNNs. However, it is difficult to distinguish between clean labels and noisy labels, which becomes the bottleneck of many methods. To address the problem, this paper proposes a novel method named Label Distribution based Confidence Estimation (LDCE). LDCE estimates the confidence of the observed labels based on label distribution. The boundary between clean labels and noisy labels then becomes clear according to the confidence scores. To verify the effectiveness of the method, LDCE is combined with an existing learning algorithm to train robust DNNs. Experiments on both synthetic and real-world datasets substantiate the superiority of the proposed algorithm against state-of-the-art methods.
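As a purely illustrative sketch (the paper's actual label-distribution-based estimator is not detailed in this abstract), the following shows the generic pattern of scoring each observed label by the model's confidence in it and training only on samples that pass a threshold; the threshold and helper names are assumptions.

```python
# Generic confidence-based noisy-label filtering, not the LDCE algorithm itself.
import torch
import torch.nn.functional as F

def label_confidence(logits: torch.Tensor, observed_labels: torch.Tensor) -> torch.Tensor:
    """Confidence of each observed label = predicted probability mass on it."""
    probs = F.softmax(logits, dim=1)                        # (N, C) label distribution
    return probs.gather(1, observed_labels.view(-1, 1)).squeeze(1)

def filtered_cross_entropy(logits, observed_labels, threshold=0.5):
    """Train only on samples whose observed label looks clean."""
    conf = label_confidence(logits.detach(), observed_labels)
    clean = conf >= threshold                               # boundary between clean/noisy
    if clean.sum() == 0:
        return logits.sum() * 0.0                           # nothing trusted in this batch
    return F.cross_entropy(logits[clean], observed_labels[clean])
```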


2020 ◽  
Vol 34 (01) ◽  
pp. 303-311 ◽  
Author(s):  
Sicheng Zhao ◽  
Yunsheng Ma ◽  
Yang Gu ◽  
Jufeng Yang ◽  
Tengfei Xing ◽  
...  

Emotion recognition in user-generated videos plays an important role in human-centered computing. Existing methods mainly employ a traditional two-stage shallow pipeline, i.e., extracting visual and/or audio features and then training classifiers. In this paper, we propose to recognize video emotions in an end-to-end manner based on convolutional neural networks (CNNs). Specifically, we develop a deep Visual-Audio Attention Network (VAANet), a novel architecture that integrates spatial, channel-wise, and temporal attentions into a visual 3D CNN and temporal attentions into an audio 2D CNN. Further, we design a special classification loss, i.e., a polarity-consistent cross-entropy loss, based on the polarity-emotion hierarchy constraint to guide the attention generation. Extensive experiments conducted on the challenging VideoEmotion-8 and Ekman-6 datasets demonstrate that the proposed VAANet outperforms the state-of-the-art approaches for video emotion recognition. Our source code is released at: https://github.com/maysonma/VAANet.
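A hedged sketch of what a polarity-consistent cross-entropy loss could look like follows; the emotion-to-polarity mapping and the penalty form are assumptions for illustration, and the exact formulation used by VAANet is in the released source code.

```python
# Sketch only: cross-entropy up-weighted when the predicted emotion's polarity
# contradicts the ground-truth polarity (hypothetical 8-class mapping).
import torch
import torch.nn.functional as F

# Hypothetical mapping: emotion class index -> polarity (1 = positive, 0 = negative).
POLARITY = torch.tensor([1, 1, 1, 1, 0, 0, 0, 0])

def polarity_consistent_ce(logits, targets, penalty=0.5):
    pol = POLARITY.to(logits.device)
    ce = F.cross_entropy(logits, targets, reduction="none")   # per-sample CE
    pred = logits.argmax(dim=1)
    mismatch = (pol[pred] != pol[targets]).float()            # 1 if polarity violated
    return (ce * (1.0 + penalty * mismatch)).mean()           # up-weight violations
```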


2017 ◽  
Vol 108 (1) ◽  
pp. 13-25 ◽  
Author(s):  
Parnia Bahar ◽  
Tamer Alkhouli ◽  
Jan-Thorsten Peter ◽  
Christopher Jan-Steffen Brix ◽  
Hermann Ney

Training neural networks is a non-convex, high-dimensional optimization problem. In this paper, we provide a comparative study of the most popular stochastic optimization techniques used to train neural networks. We evaluate the methods in terms of convergence speed, translation quality, and training stability. In addition, we investigate combinations that seek to improve optimization in terms of these aspects. We train state-of-the-art attention-based models and apply them to perform neural machine translation. We demonstrate our results on two tasks: WMT 2016 En→Ro and WMT 2015 De→En.
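As a generic illustration (not the paper's NMT setup), the sketch below trains the same small model with different stochastic optimizers so their convergence can be compared; the model, data, and hyperparameters are placeholders.

```python
# Compare stochastic optimizers on an identical model and data budget.
import torch
from torch import nn

def train_with(optimizer_name, steps=200, lr=1e-3):
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    optimizers = {
        "sgd": torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9),
        "adam": torch.optim.Adam(model.parameters(), lr=lr),
        "adadelta": torch.optim.Adadelta(model.parameters()),
    }
    opt = optimizers[optimizer_name]
    x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

# Final training loss after the same number of updates, per optimizer.
for name in ("sgd", "adam", "adadelta"):
    print(name, train_with(name))
```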


2018 ◽  
pp. 99-103
Author(s):  
D. S. Kolesnikov ◽  
D. A. Kuznetsov

State-of-the-art convolutional neural networks provide high accuracy in solving a wide range of problems. Usually this is achieved by significantly increasing their computational complexity and by representing the network parameters as single-precision floating-point numbers. However, due to limited resources, applying such networks in real time in embedded systems and mobile applications is problematic. One way to solve this problem is to reduce the bit depth of the data and use integer arithmetic, which requires quantizing the network parameters. When performing quantization, it is necessary to ensure minimal loss of recognition accuracy. The article proposes using an optimal uniform quantizer with an adaptive step, where the step depends on the distribution function of the quantized parameters; this reduces the effect of the quantization error on recognition accuracy. Approaches to improving the quality of quantization are also described. The proposed quantization method is evaluated on the CIFAR-10 dataset. It is shown that the optimal uniform quantizer with an 8-bit representation of the network parameters achieves the accuracy of the initially trained network.
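As a sketch of the idea, the following implements a symmetric uniform quantizer whose step adapts to the parameter distribution via percentile clipping; the percentile rule is an assumption standing in for the paper's optimal step derived from the distribution function.

```python
# Adaptive-step 8-bit uniform quantization of network weights (illustrative only).
import numpy as np

def quantize_uniform(weights: np.ndarray, bits: int = 8, clip_pct: float = 99.9):
    """Symmetric uniform quantization with a step adapted to the weight distribution."""
    max_abs = np.percentile(np.abs(weights), clip_pct)    # ignore extreme outliers
    levels = 2 ** (bits - 1) - 1                          # 127 for 8-bit signed
    step = max_abs / levels                               # adaptive quantization step
    q = np.clip(np.round(weights / step), -levels, levels).astype(np.int8)
    return q, step                                        # integers + scale for dequant

def dequantize(q: np.ndarray, step: float) -> np.ndarray:
    return q.astype(np.float32) * step

w = np.random.normal(0.0, 0.05, size=10_000).astype(np.float32)
q, step = quantize_uniform(w)
print("max abs error:", np.abs(w - dequantize(q, step)).max())
```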


2020 ◽  
Vol 12 (7) ◽  
pp. 117
Author(s):  
Salvatore Graziani ◽  
Maria Gabriella Xibilia

The introduction of new topologies and training procedures to deep neural networks has sparked renewed interest in the field of neural computation. The use of deep structures has significantly improved the state of the art in many applications, such as computer vision, speech and text processing, medical applications, and IoT (Internet of Things). The probability of a successful outcome from a neural network is linked to the selection of an appropriate network architecture and training algorithm. Accordingly, much of the recent research on neural networks is devoted to the study and proposal of novel architectures, including solutions tailored to specific problems. The papers of this Special Issue make significant contributions to the above-mentioned fields by merging theoretical aspects and relevant applications. Twelve papers are collected in the issue, addressing many relevant aspects of the topic.


2007 ◽  
Vol 16 (1) ◽  
pp. 65-83 ◽  
Author(s):  
Barry G Silverman ◽  
Michael Johns ◽  
Ransom Weaver ◽  
Josh Mosley

This paper describes initial efforts at providing some of the technological advances of the videogame genres in a coherent, accessible format to teams of educators. By providing these capabilities inside an interactive drama generator, we believe that the full potential of educational games may eventually be realized. Sections 1 and 2 postulate three goals for reaching that objective: a toolset for interactive drama authoring, ways to insulate authors from game engines, and reusable digital casts to facilitate composability. Sections 3 and 4 present progress on those tools and an in-depth case study that made use of the resulting toolset to create a large interactive drama. We close with lessons learned to date and a look at the remaining challenges: the unpleasant reality that state-of-the-art tools are not yet able to boost the productivity of edutainment authors.


Molecules ◽  
2021 ◽  
Vol 26 (5) ◽  
pp. 1285
Author(s):  
Alfonso T. García-Sosa

Substances that can modify the androgen receptor (AR) pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems, leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases of experimental data on the effects of chemicals in humans, chimpanzees, and rats have been used to build machine-learning classifiers and regressors and to evaluate them on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined, physicochemically relevant features developed for this work outperforming graph convolutional networks, random forests, and large featurizations. The results show that these user-provided structure-, ligand-, and statistics-based features and specific DNNs provided the best results, as determined by AUC (0.87), MCC (0.47), and other metrics, as well as by the interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work, compared to evaluation-set MCC = 0.2036 and training-set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving the assessment and design of compounds. Source code and data are available on GitHub.
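For reference, the reported metrics can be computed as in the following sketch; the features, model, and data here are placeholders rather than the descriptors or DNN architecture used in the paper.

```python
# Computing AUC and MCC for a binary activity classifier (placeholder data/model).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import matthews_corrcoef, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                     # placeholder physicochemical features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]
print("AUC:", roc_auc_score(y_te, proba))
print("MCC:", matthews_corrcoef(y_te, clf.predict(X_te)))
```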


Author(s):  
Mohammad Amimul Ihsan Aquil ◽  
Wan Hussain Wan Ishak

<span id="docs-internal-guid-01580d49-7fff-6f2a-70d1-7893ec0a6e14"><span>Plant diseases are a major cause of destruction and death of most plants and especially trees. However, with the help of early detection, this issue can be solved and treated appropriately. A timely and accurate diagnosis is critical in maintaining the quality of crops. Recent innovations in the field of deep learning (DL), especially in convolutional neural networks (CNNs) have achieved great breakthroughs across different applications such as the classification of plant diseases. This study aims to evaluate scratch and pre-trained CNNs in the classification of tomato plant diseases by comparing some of the state-of-the-art architectures including densely connected convolutional network (Densenet) 120, residual network (ResNet) 101, ResNet 50, ReseNet 30, ResNet 18, squeezenet and Vgg.net. The comparison was then evaluated using a multiclass statistical analysis based on the F-Score, specificity, sensitivity, precision, and accuracy. The dataset used for the experiments was drawn from 9 classes of tomato diseases and a healthy class from PlantVillage. The findings show that the pretrained Densenet-120 performed excellently with 99.68% precision, 99.84% F-1 score, and 99.81% accuracy, which is higher compared to its non-trained based model showing the effectiveness of using a combination of a CNN model with fine-tuning adjustment in classifying crop diseases.</span></span>


2021 ◽  
Vol 2021 (Digital humanities in...) ◽  
Author(s):  
Jean-Baptiste Camps ◽  
Simon Gabay ◽  
Paul Fièvre ◽  
Thibault Clérice ◽  
Florian Cafiero

This paper describes the process of building an annotated corpus and training models for classical French literature, with a focus on theatre, and particularly comedies in verse. It was originally developed as a preliminary step to the stylometric analyses presented in Cafiero and Camps [2019]. The use of a recent lemmatiser based on neural networks and a CRF tagger allows us to achieve accuracies beyond the current state of the art on the in-domain test and proves to be robust in out-of-domain tests, i.e., on texts up to 20th-century novels.


Author(s):  
Haifeng Qian

This paper proposes a new generative model called the neural belief reasoner (NBR). It differs from previous models in that it specifies a belief function rather than a probability distribution. Its implementation consists of neural networks, fuzzy-set operations, and belief-function operations, and query-answering, sample-generation, and training algorithms are presented. This paper studies NBR in two tasks. The first is a synthetic unsupervised-learning task, which demonstrates NBR's ability to perform multi-hop reasoning, reasoning with uncertainty, and reasoning about conflicting information. The second is supervised learning: a robust MNIST classifier for 4 and 9, the most challenging pair of digits. This classifier needs no adversarial training, and it substantially exceeds the state of the art in adversarial robustness as measured by the L2 metric, while at the same time maintaining 99.1% accuracy on natural images.

