Synthesis of Neural Network Architecture for Recognition of Sea-Going Ship Images

2020 ◽  
Vol 24 (1) ◽  
pp. 130-143
Author(s):  
D. I. Konarev ◽  
A. A. Gulamov

Purpose of research. The current task is to monitor ships using video surveillance cameras installed along the canal, which is important for the information and communication support of navigation on the Moscow Canal. The main subtask is the direct recognition of ships in an image or video; implementing a neural network is a promising approach. Methods. Various neural networks are described. Images of ships serve as the input data for the network. The training sample uses the CIFAR-10 dataset. The network is built and trained using the Keras and TensorFlow machine learning libraries. Results. The application of convolutional artificial neural networks to image recognition problems is described, along with the advantages of this architecture when working with images. The choice of the Python language for the neural network implementation is justified. The main machine learning libraries used, TensorFlow and Keras, are described. An experiment was conducted to train convolutional neural networks with different architectures using the Google Colaboratory service. The effectiveness of each architecture was evaluated as the percentage of correctly recognized patterns in the test sample, and conclusions were drawn about how the parameters of a convolutional neural network influence its effectiveness. Conclusion. A network with a single convolutional layer in each cascade showed insufficient results, so three-cascade networks with two and three convolutional layers per cascade were used. Expanding the feature maps has the greatest impact on recognition accuracy; increasing the number of cascades has a less noticeable effect, and increasing the number of convolutional layers in each cascade does not always improve the accuracy of the neural network. During the study, a three-cascade network with two convolutional layers in each cascade and 128 feature maps was identified as the optimal neural network architecture under the described conditions. Checking this architecture on random images of ships confirmed the correctness of the choice.
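As a rough sketch of the architecture found optimal above (three cascades of two 3×3 convolutional layers each, ending at 128 feature maps, on 32×32 CIFAR-10 inputs), one can tally its trainable parameters. The intermediate widths of 32 and 64 feature maps and the final 10-way dense classifier are assumptions for illustration, not details taken from the abstract:

```python
# Parameter tally for a three-cascade CNN: two 3x3 conv layers per cascade,
# 2x2 pooling after each cascade, 32x32x3 inputs (CIFAR-10).
def conv_params(c_in, c_out, k=3):
    # weights (k*k*c_in per output map) plus one bias per output map
    return k * k * c_in * c_out + c_out

widths = [3, 32, 32, 64, 64, 128, 128]  # channels before/after each conv
total = sum(conv_params(a, b) for a, b in zip(widths, widths[1:]))

side = 32 // 2 ** 3           # three 2x2 poolings: 32 -> 16 -> 8 -> 4
flat = side * side * widths[-1]
total += flat * 10 + 10       # assumed 10-way dense classifier

print(total)
```

Under these assumptions the model has roughly 3 × 10⁵ trainable parameters, with about half of them in the final 128-map cascade.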

IoT ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 222-235
Author(s):  
Guillaume Coiffier ◽  
Ghouthi Boukli Hacene ◽  
Vincent Gripon

Deep Neural Networks are state-of-the-art in a large number of machine learning challenges. However, to reach the best performance they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations is performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its fullest, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsampling and shortcut connections ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with less than 40 k parameters in total, 74.3% on CIFAR-100 with less than 600 k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.
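A minimal NumPy sketch of the core idea — a single convolution defined once and applied recursively with a residual connection and a ReLU, so that the parameter count is independent of depth — using assumed toy sizes (8 channels, 3×3 kernel). This is an illustration of the recursion, not the authors' implementation:

```python
import numpy as np

def conv2d_same(x, w):
    """Naive 'same' convolution. x: (C, H, W); w: (C_out, C, 3, 3)."""
    c_out = w.shape[0]
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, wd))
    for o in range(c_out):
        for i in range(h):
            for j in range(wd):
                out[o, i, j] = np.sum(xp[:, i:i + 3, j:j + 3] * w[o])
    return out

def thrifty_forward(x, w, steps):
    # The same weights w are reused at every step: effective depth grows,
    # the parameter budget does not.
    for _ in range(steps):
        x = np.maximum(x + conv2d_same(x, w), 0.0)  # residual + ReLU
    return x

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8, 3, 3)) * 0.1
x = rng.normal(size=(8, 16, 16))
y = thrifty_forward(x, w, steps=5)
```

Running five steps or fifty changes the compute cost but not the 8 × 8 × 3 × 3 = 576 shared weights, which is exactly the compute-for-parameters trade-off the abstract describes.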


Author(s):  
А.И. Паршин ◽  
М.Н. Аралов ◽  
В.Ф. Барабанов ◽  
Н.И. Гребенникова

The image recognition task is one of the most difficult in machine learning, requiring both deep knowledge and large time and computational resources from the researcher. In the case of nonlinear and complex data, various architectures of deep neural networks are used, but the choice of neural network remains a difficult problem. The main architectures used everywhere are convolutional neural networks (CNN), recurrent neural networks (RNN), and deep neural networks (DNN). Based on recurrent neural networks (RNNs), Long Short-Term Memory networks (LSTMs) and Gated Recurrent Unit networks (GRUs) were developed. Each neural network architecture has its own structure, its own customizable and trainable parameters, and its own advantages and disadvantages. By combining different types of neural networks, one can significantly improve the quality of prediction in various machine learning problems. Considering that the choice of the optimal network architecture and its parameters is an extremely difficult task, one of the methods for constructing neural network architectures based on a combination of convolutional, recurrent, and deep neural networks is considered. We show that such architectures are superior to classical machine learning algorithms.


In recent years, huge amounts of data in the form of images have been created and accumulated at extraordinary rates. This data, with its high volume and velocity, presents the problem of finding practical and effective ways to classify it for analysis. Existing classification systems cannot keep up with the demands and difficulties of accurately classifying such data. In this paper, we built a Convolutional Neural Network (CNN), one of the most powerful and popular machine learning tools used in image recognition systems, to classify images from CIFAR-10, one of the most widely used image datasets. This paper also gives a thorough overview of the workings of our CNN architecture, its parameters, and its difficulties.
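One recurring piece of bookkeeping when designing any such CNN is the spatial size of each feature map. A small helper implementing the standard output-size formula (a generic aid, not code from the paper):

```python
def conv_out_size(n, kernel, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer:
    floor((n + 2*padding - kernel) / stride) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

# On a 32x32 CIFAR-10 input: a 3x3 'same' convolution keeps 32,
# while a 2x2 stride-2 pooling halves it to 16.
same_conv = conv_out_size(32, 3, padding=1)
pooled = conv_out_size(32, 2, stride=2)
```

Chaining this function layer by layer gives the flattened feature count that the final dense classifier must accept.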


2017 ◽  
Vol 115 (2) ◽  
pp. 254-259 ◽  
Author(s):  
Daniël M. Pelt ◽  
James A. Sethian

Deep convolutional neural networks have been successfully applied to many image-processing problems in recent works. Popular network architectures often add additional operations and connections to the standard architecture to enable training deeper networks. To achieve accurate results in practice, a large number of trainable parameters are often required. Here, we introduce a network architecture based on using dilated convolutions to capture features at different image scales and densely connecting all feature maps with each other. The resulting architecture is able to achieve accurate results with relatively few parameters and consists of a single set of operations, making it easier to implement, train, and apply in practice, and automatically adapts to different problems. We compare results of the proposed network architecture with popular existing architectures for several segmentation problems, showing that the proposed architecture is able to achieve accurate results with fewer parameters, with a reduced risk of overfitting the training data.
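The appeal of dilated convolutions can be seen in how quickly the receptive field grows with depth. A short sketch of that arithmetic, with an assumed exponential dilation schedule rather than the authors' exact network:

```python
def receptive_field(kernel, dilations):
    """Receptive field of stacked dilated convolutions (stride 1).
    Each layer with dilation d adds (kernel - 1) * d pixels of context."""
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

# Four 3x3 layers with exponentially increasing dilations span a
# 31-pixel context; the same four layers undilated span only 9 pixels,
# with an identical parameter count in both cases.
wide = receptive_field(3, [1, 2, 4, 8])
narrow = receptive_field(3, [1, 1, 1, 1])
```

This is why a dilated architecture can capture features at several image scales without the large parameter budgets of deeper conventional stacks.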


2021 ◽  
Author(s):  
Wai Keen Vong ◽  
Brenden M. Lake

In order to learn the mappings from words to referents, children must integrate co-occurrence information across individually ambiguous pairs of scenes and utterances, a challenge known as cross-situational word learning. In machine learning, recent multimodal neural networks have been shown to learn meaningful visual-linguistic mappings from cross-situational data, as needed to solve problems such as image captioning and visual question answering. These networks are potentially appealing as cognitive models because they can learn from raw visual and linguistic stimuli, something previous cognitive models have not addressed. In this paper, we examine whether recent machine learning approaches can help explain various behavioral phenomena from the psychological literature on cross-situational word learning. We consider two variants of a multimodal neural network architecture, and look at seven different phenomena associated with cross-situational word learning, and word learning more generally. Our results show that these networks can learn word-referent mappings from a single epoch of training, matching the amount of training found in cross-situational word learning experiments. Additionally, these networks capture some, but not all of the phenomena we studied, with all of the failures related to reasoning via mutual exclusivity. These results provide insight into the kinds of phenomena that arise naturally from relatively generic neural network learning algorithms, and which word learning phenomena require additional inductive biases.
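The cross-situational setting itself is easy to simulate with a simple associative baseline: count word-referent co-occurrences across individually ambiguous trials and pick the most frequent pairing. This toy counter (a baseline illustration, not the paper's multimodal network) shows how ambiguity resolves across trials:

```python
from collections import Counter, defaultdict

# Each trial pairs an utterance with a scene; within any single trial
# the word-referent mapping is ambiguous.
trials = [
    ({"ball", "dog"}, {"BALL", "DOG"}),
    ({"ball", "cup"}, {"BALL", "CUP"}),
    ({"dog", "cup"}, {"DOG", "CUP"}),
]

cooc = defaultdict(Counter)
for words, referents in trials:
    for w in words:
        for r in referents:
            cooc[w][r] += 1  # accumulate evidence across situations

def best_referent(word):
    # The correct referent co-occurs with its word in every trial,
    # so it wins once enough trials are aggregated.
    return cooc[word].most_common(1)[0][0]
```

No single trial disambiguates "ball", yet after three trials the counts alone identify every mapping, which is the statistical signal the neural models in the paper must also exploit.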


Author(s):  
Paweł Tarasiuk ◽  
Piotr S. Szczepaniak

Abstract: This paper presents a novel method for improving the invariance of convolutional neural networks (CNNs) to selected geometric transformations in order to obtain more efficient image classifiers. A common strategy employed to achieve this aim is to train the network using data augmentation. Such a method alone, however, increases the complexity of the neural network model, as any change in the rotation or size of the input image results in the activation of different CNN feature maps. This problem can be resolved by the proposed novel convolutional neural network models with geometric transformations embedded into the network architecture. The evaluation of the proposed CNN model is performed on the image classification task with the use of diverse representative data sets. The CNN models with embedded geometric transformations are compared to those without the transformations, using different data augmentation setups. As the compared approaches use the same amount of memory to store the parameters, the improved classification score means that the proposed architecture is the better choice.
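One simple way to embed a geometric transformation into the network itself, as opposed to augmenting the data, is to correlate the input with every rotated copy of a kernel and pool over the results. This NumPy sketch of that idea, restricted to 90° rotations, is an illustration of the principle rather than the authors' model:

```python
import numpy as np

def corr2d(img, k):
    """Valid cross-correlation of a 2-D image with a 2-D kernel."""
    kh, kw = k.shape
    h, w = img.shape
    return np.array([[np.sum(img[i:i + kh, j:j + kw] * k)
                      for j in range(w - kw + 1)]
                     for i in range(h - kh + 1)])

def rot_pooled_response(img, k):
    # Max over positions and over the four 90-degree rotations of the
    # kernel: the pooled response is invariant to rotating the input,
    # without storing four separate kernels.
    return max(corr2d(img, np.rot90(k, r)).max() for r in range(4))

rng = np.random.default_rng(1)
img = rng.normal(size=(8, 8))
k = rng.normal(size=(3, 3))
```

Because the set of rotated kernels is closed under rotation, rotating the input merely permutes the pooled responses, which is the kind of built-in invariance the paper contrasts with pure data augmentation.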


2020 ◽  
Vol 34 (07) ◽  
pp. 11948-11956
Author(s):  
Siddharth Roheda ◽  
Hamid Krim

The importance of inference in Machine Learning (ML) has led to an explosive number of different proposals in ML, and particularly in Deep Learning. In an attempt to reduce the complexity of Convolutional Neural Networks, we propose a Volterra filter-inspired network architecture. This architecture introduces controlled non-linearities in the form of interactions between the delayed input samples of data. We propose a cascaded implementation of Volterra filtering so as to significantly reduce the number of parameters required to carry out the same classification task as that of a conventional Neural Network. We demonstrate an efficient parallel implementation of this Volterra Neural Network (VNN), along with its remarkable performance while retaining a relatively simple and potentially more tractable structure. Furthermore, we show a rather sophisticated adaptation of this network to nonlinearly fuse the RGB (spatial) information and the Optical Flow (temporal) information of a video sequence for action recognition. The proposed approach is evaluated on the UCF-101 and HMDB-51 datasets for action recognition, and is shown to outperform state-of-the-art CNN approaches.
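The building block being cascaded is a discrete Volterra filter. A minimal second-order 1-D version in NumPy (the textbook form of the filter, not the authors' VNN code) makes the "controlled non-linearity" concrete: the output mixes a linear term with pairwise products of delayed input samples.

```python
import numpy as np

def volterra2(x, w1, w2):
    """Second-order Volterra filter with memory M = len(w1).
    y[n] = w1 . v + v^T w2 v, where v holds the last M input samples."""
    m = len(w1)
    y = np.zeros(len(x))
    for n in range(len(x)):
        v = np.zeros(m)
        for i in range(m):
            if n - i >= 0:
                v[i] = x[n - i]
        y[n] = w1 @ v + v @ w2 @ v  # linear term + pairwise interactions
    return y

x = np.array([1.0, 2.0])
w1 = np.array([1.0, 0.0])
w2 = np.array([[1.0, 0.0], [0.0, 0.0]])  # only the x[n]^2 term contributes
y = volterra2(x, w1, w2)
```

Cascading such filters composes their polynomial orders, which is how the paper obtains higher-order interactions without paying for a full high-order kernel tensor.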


Author(s):  
Abhinav N Patil

Image recognition is an important aspect of image processing for machine learning, performed without human support at any step. In this paper we study how image classification is carried out using an image backend. A couple of thousand images each of cats and dogs are taken and then divided into a test dataset and a training dataset for our learning model. The results are obtained using a custom neural network with a Convolutional Neural Network architecture and the Keras API.


Author(s):  
Ankith I

Abstract: Recent developments in the field of machine learning have changed the way it operates, especially with the rise of Artificial Neural Networks (ANNs). There is no doubt that these biologically inspired computational models are capable of performing far better in common machine learning tasks than previous forms of artificial intelligence. There are several different forms of ANNs, but one of the most impressive is the convolutional neural network (CNN). CNNs have been extensively used for solving difficult pattern recognition tasks on images. With their simple yet precise architecture, they offer a simplified approach to getting started with ANNs. The goal of this paper is to provide a brief introduction to CNNs. It discusses recent papers and newly developed techniques for building these remarkable image recognition models. The introduction assumes a basic understanding of ANNs and machine learning. Keywords: pattern recognition, artificial neural networks, machine learning, image analysis.


2022 ◽  
Author(s):  
Sinem Sav ◽  
Jean-Philippe Bossuat ◽  
Juan R. Troncoso-Pastoriza ◽  
Manfred Claassen ◽  
Jean-Pierre Hubaux

Training accurate and robust machine learning models requires a large amount of data that is usually scattered across data silos. Sharing or centralizing the data of different healthcare institutions is, however, unfeasible or prohibitively difficult due to privacy regulations. In this work, we address this problem with a novel privacy-preserving federated learning-based approach, PriCell, for complex machine learning models such as convolutional neural networks. PriCell relies on multiparty homomorphic encryption and enables the collaborative training of encrypted neural networks with multiple healthcare institutions. We preserve the confidentiality of each institution's input data, of any intermediate values, and of the trained model parameters. We efficiently replicate the training of a published state-of-the-art convolutional neural network architecture in a decentralized and privacy-preserving manner. Our solution achieves an accuracy comparable to the one obtained with the centralized solution, with an improvement of at least one order of magnitude in execution time with respect to prior secure solutions. Our work guarantees patient privacy and ensures data utility for efficient multi-center studies involving complex healthcare data.
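PriCell's actual protocol uses multiparty homomorphic encryption; as a much-simplified stand-in, additive secret sharing already illustrates the core property that updates can be aggregated without any single institution's contribution being visible in the clear. This is a toy NumPy sketch of that property, not the PriCell protocol:

```python
import numpy as np

def additive_shares(vec, n_parties, rng):
    """Split vec into n_parties random shares that sum exactly to vec.
    Any subset of fewer than n_parties shares looks like random noise."""
    shares = rng.normal(size=(n_parties - 1, vec.size))
    return np.vstack([shares, vec - shares.sum(axis=0)])

rng = np.random.default_rng(42)
updates = [rng.normal(size=4) for _ in range(3)]  # one update per institution

# Each institution splits its model update into shares.
all_shares = [additive_shares(u, 3, rng) for u in updates]

# Summing every share recovers the total; in a real protocol the shares
# are distributed so that no single party ever holds a complete set.
aggregate = sum(s.sum(axis=0) for s in all_shares) / 3
```

Homomorphic encryption goes further than this toy, allowing computation directly on encrypted values, but the aggregation-without-disclosure goal is the same.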

